Open vzarytovskii opened 1 year ago
Helix also uses it, exclusively.
@vzarytovskii Do you have any thoughts about how that would work with the lex filter? Perhaps we could look at the Python implementation for inspiration, as it also has whitespace-sensitive syntax.
Yeah, no specific ideas just yet; we should probably figure it out when we start working on it.
How can someone help get this started? I'm interested in contributing.
@Eliemer These documents have some context on how to proceed: https://tree-sitter.github.io/tree-sitter/creating-parsers
I am aware of this grammar, but if you look at the README, you'll see that it does not cover all language features or the whitespace-sensitive aspect.
Generating it from the fslexyacc files and the lexfilter (if possible, of course) has the benefit that it always stays up to date when we update them with new features.
In my search for an ANTLR grammar for F#, I discovered a few things that might be interesting. First, there are obviously a gazillion similar formats. 😊
So I dug deep into this ecosystem, and there are all sorts of compilers in every direction; some are better maintained than others.
As an example, I discovered an EBNF <--> Treesitter compiler.
There is also a similar project that goes only from Treesitter to EBNF, and it shows an already generated EBNF file for OCaml:
https://github.com/mingodad/plgh/blob/main/tree-sitter-ocaml.ebnf
What seems obvious to me is that EBNF is a considerably easier format.
So, at that point, it seems that editing the existing EBNF of OCaml and then translating it to Treesitter might be an option. 🤷🏻♂️
I don't know how it compares to generating from Yacc and Lex 🙈
I also found a couple of other very interesting projects that would help to generate an ANTLR file, which I am striving to create for OneDev.
So if going the route from EBNF to Treesitter sounds acceptable, this would provide a path for both ANTLR and Treesitter.
P.S.:
And if all that doesn't help, I also stumbled across a couple of articles that might help to implement a Treesitter grammar directly and to understand its format.
https://derek.stride.host/posts/comprehensive-introduction-to-tree-sitter
https://gist.github.com/Aerijo/df27228d70c633e088b0591b8857eeef
OCaml syntax does not account for whitespace sensitivity (i.e. the lexfilter), so unfortunately it won't be much help here. I think that if we don't want to generate the grammar outright but write one manually first, we should be looking at the one for Python.
Yeah, I actually considered another way now.
Going from .fsy to EBNF and then to Treesitter.
This doesn't involve OCaml at all. I will try to get this running soonish.
Fsy to EBNF likely won't work either; it won't cover whitespace sensitivity.
If anyone is interested, I've been slowly working on an F# treesitter grammar that supports indentation-based scoping.
Nice
I would like to help with testing and improving it. @Nsidorenco do you have any to-do items in mind (or are the ones in the README up to date)? I can start using it in my day-to-day work with the compiler and maybe also start fixing things.
Yeah, I actually considered another way now. Going from .fsy to EBNF and then to Treesitter. This doesn't involve OCaml at all. I will try to get this running soonish.
Fsy to EBNF likely won't work either; it won't cover whitespace sensitivity.
How is whitespace significance breaking either of the protocols?
Or do you think it's lost in translation?
Yeah, I think there's a possibility of losing a bunch of info during the conversions. Besides, fslexyacc alone doesn't carry the indent/whitespace info.
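To make the point about the missing indent info concrete, here is a minimal Python sketch of the kind of layout pass that sits between a lexer and a parser. It is not the F# lexfilter, whose offside rule is far more context-dependent; the `layout_tokens` name, the token kinds, and the spaces-only indentation are all assumptions made for illustration. The grammar file only ever sees the synthetic INDENT/DEDENT tokens, which is why converting the grammar alone cannot recover them.

```python
from typing import Iterator, List, Tuple

Token = Tuple[str, str]  # (kind, text); the kinds here are illustrative only


def layout_tokens(source: str) -> Iterator[Token]:
    """Yield LINE tokens plus synthetic INDENT/DEDENT tokens, Python-style."""
    stack: List[int] = [0]  # indentation columns of the currently open blocks
    for raw in source.splitlines():
        if not raw.strip():
            continue  # blank lines do not affect layout
        col = len(raw) - len(raw.lstrip(" "))  # assumption: spaces only
        if col > stack[-1]:
            stack.append(col)
            yield ("INDENT", "")
        while col < stack[-1]:
            stack.pop()
            yield ("DEDENT", "")
        if col != stack[-1]:
            raise SyntaxError(f"inconsistent dedent at column {col}")
        yield ("LINE", raw.strip())
    while len(stack) > 1:  # close any blocks still open at end of input
        stack.pop()
        yield ("DEDENT", "")


example = "let f x =\n    let y = x + 1\n    y * 2\nlet z = f 3\n"
kinds = [kind for kind, _ in layout_tokens(example)]
print(kinds)
# ['LINE', 'INDENT', 'LINE', 'LINE', 'DEDENT', 'LINE']
```

In a tree-sitter grammar this job usually falls to an external scanner rather than to the grammar rules themselves, as in the Python grammar.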
Yeah, I will see.
Considering Python is popular, I guess this info is not being lost. The Yacc > EBNF converter has not been updated for 2 years; the EBNF to Treesitter converter is very well maintained.
Besides fslexyacc alone doesn't carry the indent/whitespace info.
What else does?
Chet told me the files are in the compiler repo:
https://github.com/dotnet/fsharp/blob/main/src/Compiler/pars.fsy https://github.com/dotnet/fsharp/blob/main/src/Compiler/pppars.fsy
@vzarytovskii Any help is very welcome. The README is relatively up to date. Off the top of my head, the biggest remaining parts are: 1) testing, 2) improving the precedence of rules (to reduce parser size), 3) adding missing language features, like annotations, and 4) improving the external scanner to open a new indent scope on brackets and braces.
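As a rough illustration of what item 4 could mean, here is a small Python sketch that keeps a stack of scopes and pushes a new one whenever a bracket or brace opens, using the column of the first token after the bracket as the new offside column. This is only one possible reading of the to-do item, not the scanner's actual design; `scopes_after`, the `Scope` tuple, and the single-line input are hypothetical, and string literals and comments are ignored.

```python
from typing import List, Tuple

PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())

Scope = Tuple[str, int]  # (what opened the scope, its offside column)


def scopes_after(line: str, base_col: int = 0) -> List[Scope]:
    """Return the scope stack after scanning one line (hypothetical helper)."""
    stack: List[Scope] = [("block", base_col)]
    for i, ch in enumerate(line):
        if ch in PAIRS:
            # The new scope starts at the first non-space character after the
            # bracket. A real scanner would also look at the following line
            # when the bracket is the last character on this one.
            j = i + 1
            while j < len(line) and line[j] == " ":
                j += 1
            stack.append((ch, j))
        elif ch in CLOSERS:
            opener, _ = stack[-1]
            if PAIRS.get(opener) != ch:
                raise SyntaxError(f"unexpected {ch!r} at column {i}")
            stack.pop()  # closing bracket ends the scope it opened
    return stack


print(scopes_after("let r = { X = 1"))
# [('block', 0), ('{', 10)]
```

With the brace scope on the stack, a continuation line such as `Y = 2 }` aligned at column 10 would belong to the record expression rather than being treated as a dedent of the enclosing block.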
What else does?
lexfilter in the repo
Yeah, I already found your previous comment on Discord about that, many thanks. I think Nsidorenco is already very far along, so generating seems to serve no purpose at this point.
@Nsidorenco I am testing it with Helix, but I am unsure why it currently fails. So I can't provide you with any meaningful feedback as of now, but I hope I can do so in the future.
Thanks a lot for developing this, you're great 🥳
The easiest way to test it, in my opinion, is with nvim-treesitter and nvim-treesitter/playground, which has a great way of visualizing the tree (prim-types.fs is probably overkill as a test, since FSharp.Core is a bit special).
Is your feature request related to a problem? Please describe.
Currently, more and more tooling and editors rely on treesitter for navigation, parsing and semantic highlighting (e.g. in-browser VS Code, nvim, GitHub), so we should provide a TS grammar for F#.
Describe the solution you'd like
The TS grammar should be (if possible) generated from our fsl/fsy files and hosted in the repo.
Links
Treesitter docs: https://tree-sitter.github.io/tree-sitter/
Existing grammars, incl. some ws-sensitive:
OCaml: https://github.com/tree-sitter/tree-sitter-ocaml
Python: https://github.com/tree-sitter/tree-sitter-python
Yaml: https://github.com/ikatyang/tree-sitter-yaml
Haskell: https://github.com/tree-sitter/tree-sitter-haskell