dotnet / csharplang

The official repo for the design of the C# programming language

Add Grammars To Documentation #2640

Closed niemyjski closed 5 years ago

niemyjski commented 5 years ago

Would it be possible to add the language grammars for C# and VB.net to the available documentation?

References https://github.com/dotnet/roslyn/issues/3169, as it should never have been closed. We need to get the grammars updated; the current state is unacceptable to those third parties who are trying to improve the dev experience.

CyrusNajmabadi commented 5 years ago

Well, I had some fun taking an hour to actually try this out. For a grammar generated from the C# syntactic model, see:

https://gist.github.com/CyrusNajmabadi/412c3209d1ce97236420218498e7c8d4

It did take just about an hour: 20 min to do the initial impl, then about 40 min to just clean things up :D Note: I'm not an ANTLR expert, so if I got any of the ANTLR syntax wrong, let me know. Also, this is just the syntactic side. There would have to be a corresponding lexical side of things. You can reference the existing lexical spec and then just map anything that ends in Token here to that.
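As a rough illustration of that last point, here is a minimal Python sketch that scans grammar text for rule references ending in `_token`, i.e. the names that would need a counterpart in the lexical spec. The helper name `token_references` is hypothetical and not part of any Roslyn tooling; the sample text is the `local_function_statement` rule quoted later in this thread.

```python
import re

def token_references(grammar_text):
    """Return the sorted, de-duplicated rule names that end in '_token'.

    Hypothetical helper for illustration only.
    """
    return sorted(set(re.findall(r"\b[a-z_]+_token\b", grammar_text)))

# Sample rule copied from the generated grammar quoted in this thread.
sample = """
local_function_statement
  : modifier* type identifier_token type_parameter_list? parameter_list type_parameter_constraint_clause* (block | (arrow_expression_clause ';'))
  ;
"""

print(token_references(sample))  # ['identifier_token']
```

Each name in the result would then be looked up in, or mapped onto, the existing lexical grammar by hand.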

YairHalberstadt commented 5 years ago

That's really useful @cyrusnajmabadi

Do you want to add it to the syntax generator, so it automatically updates a doc somewhere?

CyrusNajmabadi commented 5 years ago

@YairHalberstadt I'd definitely like to. However, there are a couple of small changes i want to make to the syntax file first to make things a little more pleasant. Specifically, i'd like to at least do:

  1. Have lists be able to say whether they expect at least one element. That would let the generator emit + instead of * in the grammar file, and x (',' x)* instead of (x (',' x)*)?
  2. Have a way to say that a series of fields is 'choose one'. E.g. with using_statement, you really have 'using' '(' (variable_declaration | expression) ')', not 'using' '(' variable_declaration? expression? ')'
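The two changes above can be sketched in a few lines of Python. This is only an illustration of the intended output shapes, not the actual Roslyn syntax generator; `render_list`, `render_choice`, and their parameters are hypothetical names.

```python
def render_list(element, separator=None, min_count=0):
    """Render a list field as ANTLR-style text (hypothetical helper).

    Plain lists become 'x*' or 'x+'. Separated lists become
    "x (',' x)*" when at least one element is required, and
    "(x (',' x)*)?" when the list may be empty.
    """
    if separator is None:
        return f"{element}+" if min_count >= 1 else f"{element}*"
    body = f"{element} ({separator!r} {element})*"
    return body if min_count >= 1 else f"({body})?"

def render_choice(alternatives):
    """Render a 'choose one' group of fields as a parenthesized alternation."""
    return "(" + " | ".join(alternatives) + ")"

print(render_list("x"))                              # x*
print(render_list("x", min_count=1))                 # x+
print(render_list("x", separator=","))               # (x (',' x)*)?
print(render_list("x", separator=",", min_count=1))  # x (',' x)*

using = " ".join([
    "'using'", "'('",
    render_choice(["variable_declaration", "expression"]),
    "')'",
])
print(using)  # 'using' '(' (variable_declaration | expression) ')'
```

The point of both changes is that the extra information has to live in the syntax model itself; a generator can only emit + or a choice group if the model says the list is non-empty or the fields are mutually exclusive.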

If i can get that in, then i think this grammar goes more in line with what is being asked for. And it specifically addresses concerns like in https://github.com/dotnet/csharplang/issues/2640#issuecomment-517410929.

Then, I can see if this can just be something that is run automatically as well. I don't mind having a separate grammar file that is kept completely and automatically in sync with the actual syntax model file we keep.

billhenn commented 5 years ago

Hi @CyrusNajmabadi, that would definitely be very helpful, especially if #2 can be completed. It's much easier to read than the pure XML file and it would be fantastic if this kind of thing would be kept in sync with the XML file.

Is there any chance you could generate the output for the VB language syntax.xml file as well as part of this effort?

Thanks again for the work on this.

Korporal commented 5 years ago

That is a nice job you did @CyrusNajmabadi

niemyjski commented 5 years ago

Is there any chance you could leave the exact steps you used to generate this as well? Then maybe the team can automate this :) Thanks -Blake Niemyjski


CyrusNajmabadi commented 5 years ago

> Is there any chance you could generate the output for the VB language syntax.xml file as well as part of this effort?

I probably wouldn't do it myself. But if you're interested, you could contribute such a thing yourself if my work is able to go in :-)

CyrusNajmabadi commented 5 years ago

> Hi @CyrusNajmabadi, that would definitely be very helpful, especially if #2 can be completed. It's much easier to read than the pure XML file and it would be fantastic if this kind of thing would be kept in sync with the XML file.

I have created all three PRs i think are important before requesting this tool be added to roslyn.

Hopefully these can get in soon. With them, the grammar currently looks like: https://gist.github.com/CyrusNajmabadi/412c3209d1ce97236420218498e7c8d4

@BillHenning the issue you raised in https://github.com/dotnet/csharplang/issues/2640#issuecomment-517410929 is resolved to my satisfaction and comes out like so:

local_function_statement
  : modifier* type identifier_token type_parameter_list? parameter_list type_parameter_constraint_clause* (block | (arrow_expression_clause ';'))
  ;

I think that's sufficiently clear and more than acceptable for downstream language enthusiasts to understand what's going on.

billhenn commented 5 years ago

Thanks @CyrusNajmabadi, that will help.

canton7 commented 5 years ago

@Korporal ... Are you doxing him? That doesn't seem like a very nice thing to do.

Korporal commented 5 years ago

> ... Are you doxing him? That doesn't seem like a very nice thing to do.

@canton7

I've never heard the term "doxing", but in case it could be misconstrued, I adjusted the post. I am curious about the product launch he mentioned too.

canton7 commented 5 years ago

From the Microsoft Code of Conduct:

Be respectful: We are a world-wide community of professionals, and we conduct ourselves professionally. Disagreement is no excuse for poor behavior and poor manners. Disrespectful and unacceptable behavior includes, but is not limited to: ...

  • Posting, or threatening to post, people’s personally identifying information (“doxing”).

Whether anyone is getting married is personal information which has no relevance to anything here. There is no good reason to dig it out and post it publicly, especially with a link. That sort of behavior is a bannable offence in most online communities, with good reason.

EDIT - I see that the fact that there was a wedding was itself in the public domain, but I still maintain that posting a link to the wedding website was not acceptable.

CyrusNajmabadi commented 5 years ago

@canton7 I think it's ok. I did mention above that I was launching a product and planning a wedding ;-). So I don't think it's totally unreasonable for someone to show interest in things I put forth myself. That said, as I've mentioned a few times in a few threads, general discussions outside of the scope of Roslyn or csharplang should likely be taken to gitter.im or discord. I only dropped those bits of information here to help inform why I couldn't spare more time on this. That's the limit to how much I would talk about those things here.

jnm2 commented 5 years ago

I'm glad this situation is resolved. For future readers, it's not okay to post even a hair more personally identifying information about someone else than they give explicit permission for.

mattwar commented 5 years ago

Unfortunately, the syntax model is much freer form than the syntax grammar itself. It allows for constructions that are not syntactically legal (but occur often while typing imperfect code) that are later caught during semantic analysis. So you are probably not going to get a correct C# grammar from reverse engineering the syntax model.

CyrusNajmabadi commented 5 years ago

> So you are probably not going to get a correct C# grammar from reverse engineering the syntax model.

Sure. But that's always the way. Even the tighter C# grammars out there let a lot of stuff slip through that isn't legal from a purely syntactic perspective (i.e. you can tell there's a problem directly from looking only at the tree, without having to do any understanding of what any nodes actually mean).

There's always a balance between how much you want to put into your grammar and how much you're willing to have handled through normative rules that are then processed elsewhere.
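To make that trade-off concrete, consider a rule like `modifier*`: it accepts any sequence of modifiers, including the same modifier twice. A restriction such as "no modifier may repeat" is awkward to express in the grammar and is the sort of thing that can be left to a separate check. The Python sketch below assumes that split purely for illustration; `duplicate_modifiers` is a hypothetical name and this is not how Roslyn implements the check.

```python
from collections import Counter

def duplicate_modifiers(modifiers):
    """Return the modifiers that occur more than once, in first-seen order.

    Hypothetical post-parse check: the grammar rule 'modifier*' lets any
    sequence through, and a rule like this one rejects repeats afterwards.
    """
    counts = Counter(modifiers)
    return [m for m in counts if counts[m] > 1]

print(duplicate_modifiers(["static", "async"]))            # []
print(duplicate_modifiers(["static", "async", "static"]))  # ['static']
```

Keeping such checks outside the grammar keeps the grammar small and readable, at the cost of the grammar alone accepting some inputs that are not legal.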

CyrusNajmabadi commented 5 years ago

Grammar files have been added in https://github.com/dotnet/roslyn/pull/37840 and https://github.com/dotnet/roslyn/pull/37968. As Roslyn changes and adds support for new language features, these files will be automatically kept up to date as part of our normal build process. Once this makes it to master and our other feature branches, you'll also see this accurately showing the supported grammar per branch.

You can find the files currently at:

https://github.com/dotnet/roslyn/blob/release/dev16.4-preview1/src/Compilers/CSharp/Portable/Generated/CSharp.Generated.g4 and https://github.com/dotnet/roslyn/blob/release/dev16.4-preview1/src/Compilers/VisualBasic/Portable/Generated/VisualBasic.Grammar.g4

In the future, once everything merges, the canonical version will be at:

https://github.com/dotnet/roslyn/tree/master/src/Compilers/CSharp/Portable/Generated/CSharp.Generated.g4 and https://github.com/dotnet/roslyn/tree/master/src/Compilers/VisualBasic/Portable/Generated/VisualBasic.Grammar.g4

Thanks for all the feedback. I hope you find these useful!

CyrusNajmabadi commented 5 years ago

@jcouv Your call on what to do with this issue. Feel free to close it out if you think it's been sufficiently addressed. Thanks!

jcouv commented 5 years ago

I think so. Thanks @CyrusNajmabadi