Open Bystroushaak opened 4 years ago
I mean I can take care of this if you want, but I am not sure whether I want to be a package maintainer yet.
Hi @Bystroushaak! I started this package but didn't have the bandwidth to maintain this alongside the TypeScript one and others. If you'd like to submit to PyPI that would be great!
I've been looking into the code and I am not entirely sure what the idea is here.
I thought it should be just a codec for decoding indented structures and mapping them to native Python structures, something like the json or yaml modules in Python. But when I wanted to clean up and fix the code, I saw all kinds of parsers built on top of that (CSV, JSON, TSV, ...), and I've become unsure what you really want here. Also, you have some kind of protocol / API that I don't understand and a really strong in-memory tree representation, as if you didn't want to map to the native data structures, but to use your own format.
Can you please tell me more about the idea you have for pytree?
> I've been looking into the code and I am not entirely sure what the idea is here.
When you write "the code" are you talking about this pytree codebase, or the Tree Notation code in general, such as projects like https://github.com/treenotation/jtree? The pytree codebase is in a very early form. I started it after wanting to use Tree Notation for part of a deep learning project I was doing with PyTorch, and then a couple of folks contributed to it, but it hasn't yet implemented much of the comparable JTree TypeScript library. The latter I use in many places daily. I think in both cases it would be fair to say the idea isn't crystal clear :), but just wanted to be sure I understand what code you are talking about.
> I thought it should be just a codec for decoding indented structures and mapping them to native Python structures, something like the json or yaml modules in Python. But when I wanted to clean up and fix the code, I saw all kinds of parsers built on top of that (CSV, JSON, TSV, ...), and I've become unsure what you really want here.
This is a good point. At the moment I've separated things into 2 levels of abstraction: Tree Notation and Tree Languages. Tree Notation is a very simple, permissive, general syntax. It defines nodes (lines separated by newlines), words (strings separated by spaces), and scopes (aka parent-child nodes aka edges; created via indentation levels). So the base notation is not super interesting. Some properties of it are neat, such as this one: tree(string).toString() === string ∀ strings. But the structures that these map into are very basic (lists/strings/arrays). As you said, there are methods in there to parse/emit formats like CSV and JSON, but those methods only make sense for a subset of Tree documents. I've found them handy to have in some cases, but I can see how it would be confusing, as there are no direct isomorphisms between those structures and Tree Notation. I got similar feedback about the toJson method in the TypeScript library and so renamed that one to toJsonSubset a while back, to more clearly indicate that there is not a 1-to-1 correspondence with JSON.
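To make the nodes/words/scopes description and the round-trip property concrete, here is a minimal Python sketch. It is not the pytree or JTree implementation; the names `Node`, `parse`, and `serialize` are hypothetical, and it assumes one space per indentation level, with any extra leading spaces kept as part of the line so that nothing is lost.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One line of the document: its words plus any child nodes."""
    words: list[str]
    children: list[Node] = field(default_factory=list)


def parse(text: str) -> Node:
    """Parse text into a tree. Assumes one space per indentation level."""
    root = Node(words=[])
    stack = [root]  # stack[d] is the current parent for a line at depth d
    for line in text.split("\n"):
        indent = len(line) - len(line.lstrip(" "))
        # A line can be at most one level deeper than the previous line;
        # surplus leading spaces stay in the line content.
        depth = min(indent, len(stack) - 1)
        node = Node(words=line[depth:].split(" "))
        del stack[depth + 1:]
        stack[depth].children.append(node)
        stack.append(node)
    return root


def serialize(node: Node, depth: int = 0) -> str:
    """Inverse of parse: rebuild the exact original text."""
    lines = []
    for child in node.children:
        lines.append(" " * depth + " ".join(child.words))
        if child.children:
            lines.append(serialize(child, depth + 1))
    return "\n".join(lines)


sample = "model resnet\n lr 0.01\n  note  two spaces\nepochs 10\n"
tree = parse(sample)
assert serialize(tree) == sample
```

Because `str.split` with an explicit separator keeps empty strings, joining the pieces back together reproduces the input exactly, which is why the round trip in this sketch holds even for blank lines and repeated spaces.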
You can start to have direct isomorphisms when you get to the Tree Language level. Tree Languages have the same basic structures as Tree Notation, but can be parsed into any native structures. For example, here is a demo tree language that can parse directly into/from JSON. And one that is sort of a typed CSV (well, SSV to be specific).
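As a rough illustration of the Tree Language level, the sketch below types each line by its first word and maps a document to and from a native Python dict. It is only an assumption about what such a language could look like: `GRAMMAR`, `parse_config`, and `emit_config` are hypothetical names, the language is flat (no nested scopes), and it is unrelated to the grammar format JTree actually uses.

```python
# Hypothetical grammar: each keyword declares the type of the rest of its line.
GRAMMAR = {"lr": float, "epochs": int, "optimizer": str}


def parse_config(text: str) -> dict:
    """Parse a flat, typed config document into a native dict."""
    config = {}
    for line in text.split("\n"):
        if not line:
            continue  # skip blank lines
        keyword, _, rest = line.partition(" ")
        if keyword not in GRAMMAR:
            raise ValueError(f"unknown keyword: {keyword!r}")
        config[keyword] = GRAMMAR[keyword](rest)
    return config


def emit_config(config: dict) -> str:
    """Emit a dict back into the same line-per-keyword form."""
    return "\n".join(f"{key} {value}" for key, value in config.items())


settings = parse_config("lr 0.01\nepochs 10\noptimizer adam")
assert parse_config(emit_config(settings)) == settings
```

Because the grammar restricts which documents are valid, every valid document corresponds to exactly one dict and back again, which is the kind of correspondence the untyped base notation cannot promise.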
> Also, you have some kind of protocol / API that I don't understand and a really strong in-memory tree representation, as if you didn't want to map to the native data structures, but to use your own format.
I'm not 100% sure I understand this. Could you explain a bit more?
> Can you please tell me more about the idea you have for pytree?
Well, there are two basic reasons:
1) Not too frequently, but maybe 10-20 times a year, someone asks me if there is a Python implementation of Tree Notation.
2) At one point in 2018 or 2019 I myself really wanted a Python implementation of Tree Notation for a Python project I was working on at the time. I was doing hyperparameter experiments and running jobs on many machines and just wanted to define my own simple config lang and data logging lang to manage everything.
But more generally I'm still exploring whether or not Tree Notation is a useful idea, that is, whether it could beneficially replace XML/JSON/YAML/CSV/DSLs in many cases, and so building implementations in other languages has been a useful class of experiments to run.
At the moment I am back mostly in TypeScript land so don't have an urgent need for PyTree, but I still think it could be worthwhile.
What brought you to it? What would your idea be for using Tree Notation in Python?
Can you please actually upload the project to PyPI? Also, there is already a project called pyTree, so that name probably won't work and the package will need to be renamed to something like treenotation.