TIny-Hacker / language-ti-basic

VS Code language support for (e)Z80 TI-BASIC. Also used by github-linguist
BSD 3-Clause "New" or "Revised" License

Use TI-Toolkit standard token representations #5

Closed by rpitasky 6 months ago

rpitasky commented 6 months ago

Heya, as you know, we're trying to standardize textual token representations. TI-Toolkit strongly advocates for new tools to use the accessible token representations provided in https://github.com/TI-Toolkit/tokens/blob/main/8X.xml while we work behind the scenes with other community projects to integrate them.

It's currently unclear which token representations you're using; this is partly a call to action to use these new sheets and partly a request for information on which token representations you've chosen. I was pleasantly surprised to see this made it into linguist, congrats and great work!

~ iPhoenix

TIny-Hacker commented 6 months ago

The current highlighting is mainly based on SourceCoder, though I've been working on implementing importing / exporting using tivars_lib_cpp and emscripten, which I think should use those tokens. I'm not sure exactly where to go from there: either support only the accessible token representations, or also support SourceCoder's for backwards compatibility with pre-existing source files on GitHub and SourceCoder. What do you think makes the most sense?

rpitasky commented 6 months ago

I don't believe the current tivars_lib_cpp supports the updated tokens sheets yet; that's a question for @Adriweb.

The tokens sheet is mostly backward-compatible with the SC3 sheets, and completely backward-compatible for the tokens that people actually use, but since I hope to get SC3 updated to use these tokens, this will eventually be a moot point. The SC3 sheets are based on an old version of the TokenIDE sheets, while the TI-Toolkit sheets document all extant formats plus some new research. Unfortunately, a grammar integrating all the extant formats would be wildly ambiguous, so we made an informed choice for the accessible field. One caveat: for spelling variants such as Gray/Grey, only one is chosen for the accessible field.

It would not be too difficult to also accept only those alternatives that do not make the grammar ambiguous, and that would be more general than any of the other options.
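As a rough illustration of that filtering idea, here is a minimal Python sketch. It assumes a hypothetical in-memory table mapping a token ID to one accessible name plus a list of alternative spellings; the IDs, names, and the `unambiguous_spellings` helper are made up for the example and are not taken from 8X.xml or any existing tool. It also uses the simplest possible notion of ambiguity (two tokens claiming the same spelling), which a real grammar would likely need to extend.

```python
from collections import defaultdict

# Hypothetical token table: id -> (accessible name, alternative spellings).
# The entries are illustrative only; they are NOT copied from 8X.xml.
TOKENS = {
    1: ("GRAY", ["GREY"]),
    2: ("->", ["→"]),
    4: ("[e]", ["e"]),        # Euler's constant, with a tempting short alias
    5: ("|E", ["e", "E"]),    # exponent marker, with two tempting aliases
    6: ("E", []),             # the plain letter E
}


def unambiguous_spellings(tokens):
    """Map every accepted spelling to its token id.

    Accessible names are always accepted (they are assumed to be unique).
    An alternative is accepted only if no other token claims the same
    spelling, either as its accessible name or as one of its alternatives.
    """
    # Record which token ids claim each spelling.
    claims = defaultdict(set)
    for token_id, (accessible, alternatives) in tokens.items():
        for spelling in [accessible, *alternatives]:
            claims[spelling].add(token_id)

    accepted = {}
    # Accessible names always win.
    for token_id, (accessible, _) in tokens.items():
        accepted[accessible] = token_id
    # Alternatives are kept only when exactly one token claims them.
    for token_id, (_, alternatives) in tokens.items():
        for alt in alternatives:
            if claims[alt] == {token_id}:
                accepted[alt] = token_id
    return accepted


if __name__ == "__main__":
    for spelling, token_id in sorted(unambiguous_spellings(TOKENS).items()):
        print(f"{spelling!r} -> token {token_id}")
```

In this toy table, GREY survives as an alias for GRAY because nothing else claims it, while the alias `e` is dropped entirely (two tokens want it) and `E` resolves only to the token whose accessible name it is.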

I'm firmly of the opinion that the syntax for writing TI-BASIC as plain text should be opinionated and standardized; it's a mess right now and it is difficult to write tools supporting such a mess.

adriweb commented 6 months ago

yeah the C++ lib doesn't yet use the new tokens data, it's planned though.

(my opinion about (de)tok would be to use Unicode everywhere but 👀)

TIny-Hacker commented 6 months ago

Alright, it sounds like the best thing to do is for me to work on migrating to the 8X xml file, and then hold off on finishing import / export until the C++ lib uses the new token data.

TIny-Hacker commented 6 months ago

I've updated the syntax to now use the accessible formats specified in 8X.xml in a99235c.