tlaplus / rfcs

RFCs for changes to the TLA+ specification language
MIT License
13 stars 0 forks source link

Publish fixed version of TLA+ 2 grammar spec #7

Closed ahelwer closed 1 month ago

ahelwer commented 1 year ago

On the TLA+ Version 2 webpage you can download a TLA+ 2 BNF grammar, appropriately written in TLA+ itself. However, I found a number of bugs in this grammar as I was developing the tree-sitter grammar. This resulted in much painstaking testing of SANY & examination of the JavaCC grammar to see what actually counts as valid syntax. I have finally taken the effort to backport my findings to the TLA+ 2 BNF grammar itself. If accepted, this updated spec should be provided for download instead of the current one. This will be a great boon to any future writers of TLA+ tooling, as the month or so I spent figuring out the "true" language spec was by far the least pleasant part of writing the tree-sitter grammar.

You can see the updated spec here, with comments added on each change (as requested by Leslie): https://github.com/tlaplus-community/tlaplus-standard/blob/main/grammar/TLAPlus2Grammar.tla

You can see the actual changes in this PR: https://github.com/tlaplus-community/tlaplus-standard/pull/1

I received signoff on these changes from Leslie Lamport via email. He suggested also putting it through the foundation RFC process.

xxyzzn commented 1 year ago

Thanks for doing this Andrew. I'm sorry that it was so difficult. The decision to use JavaCC does not seem to have been a good one. In the early days, I didn't pay enough attention to how the tools were being implemented. (And speaking of that, one of my errors was not insisting on an incremental parser. Have you considered writing one?)

Cheers,

Leslie

From: Andrew Helwer @.> Sent: Monday, February 13, 2023 12:13 To: tlaplus/rfcs @.> Cc: Subscribed @.***> Subject: [tlaplus/rfcs] Publish fixed version of TLA+ 2 grammar spec (Issue #7)

On the TLA+ Version 2https://nam06.safelinks.protection.outlook.com/?url=http%3A%2F%2Flamport.azurewebsites.net%2Ftla%2Ftla2.html&data=05%7C01%7Clamport%40microsoft.com%7Cf7eca701986b4086813108db0dfec09f%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638119160030748363%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=dE6pZsJ%2Bh5MUOOK0i99Ka%2B6Bp16cmlkz%2BxwwYJdgoMg%3D&reserved=0 webpage you can download a TLA+ 2 BNF grammar, appropriately written in TLA+ itself. However, I found a number of bugs in this grammar as I was developing the tree-sitter grammar. This resulted in much painstaking testing of SANY & examination of the JavaCC grammar to see what actually counts as valid syntax. I have finally taken the effort to backport my findings to the TLA+ 2 BNF grammar itself. If accepted, this updated spec should be provided for download instead of the current one. This will be a great boon to any future writers of TLA+ tooling, as the month or so I spent figuring out the "true" language spec was by far the least pleasant part of writing the tree-sitter grammar.

You can see the updated spec here, with comments added on each change (as requested by Leslie): https://github.com/tlaplus-community/tlaplus-standard/blob/main/grammar/TLAPlus2Grammar.tlahttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Ftlaplus-community%2Ftlaplus-standard%2Fblob%2Fmain%2Fgrammar%2FTLAPlus2Grammar.tla&data=05%7C01%7Clamport%40microsoft.com%7Cf7eca701986b4086813108db0dfec09f%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638119160030748363%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=XA0teUC4F8WJ2rzQWObTKgd3kZ99h811T4OLyGG1MYU%3D&reserved=0

You can see the actual changes in this PR: tlaplus-community/tlaplus-standard#1https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Ftlaplus-community%2Ftlaplus-standard%2Fpull%2F1&data=05%7C01%7Clamport%40microsoft.com%7Cf7eca701986b4086813108db0dfec09f%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638119160030748363%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=a88uEQyRLDsYxOCYRBJ7%2FutzFYIKq0w85poTiUGY5%2Bo%3D&reserved=0

I received signoff on these changes from Leslie Lamport via email. He suggested also putting it through the foundation RFC process.

- Reply to this email directly, view it on GitHubhttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Ftlaplus%2Frfcs%2Fissues%2F7&data=05%7C01%7Clamport%40microsoft.com%7Cf7eca701986b4086813108db0dfec09f%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638119160030748363%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=6jBEujGSLRQdmjck9iuqrZEPsIOsg%2FyyU0zyXn%2BCfiQ%3D&reserved=0, or unsubscribehttps://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAEUS3OUAILYGLEDJZ7WPWKDWXKIWBANCNFSM6AAAAAAU2WRTY4&data=05%7C01%7Clamport%40microsoft.com%7Cf7eca701986b4086813108db0dfec09f%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638119160030904581%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=J1PKraO35Oy4GN6oqe3M5BL3YhfisjsO8DyYfRBTErA%3D&reserved=0. You are receiving this because you are subscribed to this thread.Message ID: @.**@.>>

ahelwer commented 1 year ago

@xxyzzn I do indeed want to write a new TLA+ parser that has incremental & error-tolerant parsing, and also multiple entry points (so someone can parse just an expression, or a unit definition, or an entire file) in addition to exposing lots of language feature queries to ease creation of a language server. I'm still running in self-funded mode so it's a bit of a stretch though. Hopefully I can get some contracts to save up money for the time required. I would probably write it in Rust, which has minimal runtime requirements & bindings in lots of languages including Java so the existing tools could be adapted to run on top of it.

ahelwer commented 1 month ago

This has been completed.