Closed susan-garry closed 1 year ago
Wahoo!!!! Super nice work getting the parser going, @susan-garry! And figuring out the trickiness with Pest along the way!
There is one small clerical set of things we'll want to clean up before merging: some build targets got added to git here. Let's remove these things and add some of them to a `.gitignore` somewhere:
- `pollen/target/`, which is Rust's build output directory
- `.DS_Store`
- `.rlib` files (those are Rust libraries)

And here are some thoughts about strategy for the future, all of which deserve to happen in subsequent PRs instead of this one:
- `let x: Edge = ...` instead of `Edge x = ...`. Again, purely a taste thing, but I think this is nicer and ends up easier to parse too!

Ah, whoops! I have updated the `.gitignore` file and removed the extraneous files - the 1400 modified files probably should have tipped me off that something was amiss.
About our strategy for the future:
- `int i = 1;`, `let i: int = 1;`, and `i: int = 1;`. If we're trying to make the language accessible to non-CS folks, then the `let` keyword may be a bit alien. I'm not aware of any language that picks the third option (from what I can tell based on skimming the docs, Python uses type hints in function definitions but not for variable declarations?) but perhaps it's a little more comprehensible than the first and a little more accessible than the second. Thoughts? (I know that one concern we might have is whether anyone will adopt the language and use it in practice, but if we're the ones maintaining it then we should make sure it's something we're motivated to work on).
- `crush`. To get something like `node depth` or `node degree` working, however, I think we will need to support either tuples or arrays.

> Python uses type hints in function definitions but not for variable declarations?
It also has type hints on variable declarations! So this is in fact valid Python:
i: int = 1
My general take on this is: pick any keyword you want (`let`, `var`, `decl`, `def`, `const`, `local`, nothing), but the type-after-the-identifier style is the way of the future. C & Java are old and do it the old way; Rust, Go, Swift, TypeScript, typed Python, Scala, and Kotlin are all new and learned from the mistakes of the past and do it the new way. A big advantage is that it lends itself well to adding type inference in the future, i.e., eventually supporting `let x = 5` instead of `let x: int = 5` if the compiler can deduce that for you without either a disruptive change to the syntax or an annoying non-type type name like C++ `auto`.
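To make the inference point concrete, here is a small sketch in Rust, which already uses the type-after-the-identifier style. This is Rust rather than Pollen syntax, and the function names `annotated` and `inferred` are made up for the example.

```rust
/// Declares a variable with an explicit type annotation.
fn annotated() -> i32 {
    let x: i32 = 5;
    x
}

/// The same declaration with the annotation dropped. The compiler infers
/// `i32` from the return type, and nothing else about the syntax changes.
fn inferred() -> i32 {
    let x = 5;
    x
}
```

Both functions return the same value; the only difference is whether the `: i32` part is written out, which is the kind of non-disruptive change described above.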
> I think function calls, record declarations, record field access, and emit statements
Certainly emit statements! That's important for producing any output. Makes sense.
Not sure about the others though: when you get a chance, maybe you could elaborate somewhere on how functions and records arise? I can imagine functions not mattering much for depth, for instance.
That makes a lot of sense to me - I will go with `i: int = 1;` unless someone has strong feelings about using a keyword like `let`.
Record access definitely needs to be supported because we are representing pangenomic graphs essentially as a bunch of records - if we want to compute `node depth`, we need to call `node.steps`. I cannot, off the top of my head, think of a meaningful graph query that doesn't involve this.
Aside from this, we have two basic types of output - mappings of nodes to data, and modified graphs. We will probably want to support at least one of these in the basic iteration of pollen. To support mappings of nodes to data, we probably need to support either arrays, tuples, or record definition and initialization (though not all three). To support the outputting of new graphs, we need record initialization (though not record definition), since we represent graphs using records.
Since record field access is essential either way, and we can support both types of output through records, that seems like the most logical choice for what to support beyond the `emit` statement.
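As a rough illustration of the record argument, here is a sketch in Rust (not Pollen syntax) of a node stored as a record and a depth query that produces a node-to-data mapping. Only the `steps` field comes from the discussion above; the `Node` layout, the `id` field, the `node_depths` name, and the assumption that depth is simply the number of steps on a node are all hypothetical.

```rust
use std::collections::BTreeMap;

/// A graph node stored as a record. `steps` mirrors the `node.steps`
/// access mentioned above; each entry names a path that visits the node.
/// The `id` field and this layout are assumptions for illustration.
struct Node {
    id: u32,
    steps: Vec<String>,
}

/// Maps each node id to its depth, assuming depth is the number of
/// steps recorded on the node.
fn node_depths(nodes: &[Node]) -> BTreeMap<u32, usize> {
    nodes
        .iter()
        .map(|node| (node.id, node.steps.len()))
        .collect()
}
```

The query itself only needs field access (`node.id`, `node.steps`); building the input or emitting a new graph would additionally need record initialization, such as `Node { id: 1, steps: vec![] }`.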
> Ah, whoops! I have updated the `.gitignore` file and removed the extraneous files - the 1400 modified files probably should have tipped me off that something was amiss.
Random note that gitignore.io can be helpful for initial setup esp with regard to temp files.
Also might be worth noting that git exclude also exists (the `.git/info/exclude` file), which is basically just gitignore but only local (not committed). Useful if there are some files you want ignored that are specific to your dev flow. For example, I have a `local_examples` folder where I put Calyx files I'm trying to debug or mess with, and having it in the git exclude saves me from having to sidestep those files when adding things to a commit.
Cool cool; the argument for "dot syntax" to get things like `node.steps` makes sense to me!!
I've verified that `make test-slow-odgi` and `make test-slow-flip` work as expected, so I will go ahead and merge what we have so far!
The beginning of a bona-fide parser for pollen! This lays out some basic infrastructure and implements parsing for a few basic features.
What it can do:
int i; int i2 = e
What it can't do/could do better:
- `cargo run test/test1.txt`. In the near future, it would be great to automatically run the parser on all files under the `test` folder, and add tests for files that should not parse correctly, and instead throw an error.
- `Parse failed: Error { variant: ParsingError { positives: [add, sub, mult, div, modulo, geq, leq, lt, gt, eq, neq, and, or], negatives: [] }, location: Pos(1219), line_col: Pos((77, 25)), path: None, line: "int int1 = [3 * (2 + 4)]", continued_line: None }` tells me that I have forgotten a semicolon). It would be great to find a way to make these a bit more readable.
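One possible direction for more readable errors, sketched with the standard library only: take the line number, column, source line, and expected rules that already appear in the error above and print them with a caret under the offending column. The helper name `render_parse_error` and its signature are hypothetical, not part of Pest or of this PR. It may also be worth checking whether printing the Pest error with `{}` instead of `{:?}` already gives a friendlier message.

```rust
/// Renders a parse error as a short message with a caret under the
/// offending column. `line_no` and `col` are 1-based, matching the
/// `line_col` pair shown in the error above.
fn render_parse_error(line_no: usize, col: usize, line: &str, expected: &[&str]) -> String {
    // Pad with spaces so the caret lands under column `col`.
    let padding = " ".repeat(col.saturating_sub(1));
    format!(
        "parse error at line {}, column {}: expected one of {}\n{}\n{}^",
        line_no,
        col,
        expected.join(", "),
        line,
        padding
    )
}
```

For the example above, calling it with line 77, column 25, and the source line `int int1 = [3 * (2 + 4)]` puts the caret just past the closing `]`, which is where the missing semicolon belongs.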