github-linguist / linguist

Language Savant. If your repository's language is being reported incorrectly, send us a pull request!
MIT License

Alex and Happy #5822

Open pnotequalnp opened 2 years ago

pnotequalnp commented 2 years ago

Previous discussions: #2446 #4302

Alex and Happy are lexer and parser generators (respectively) for Haskell, based on Lex and Yacc for C. They also share file extensions with Lex and Yacc, and as a result their files are currently misclassified as Lex and Yacc:

~200 Alex files misclassified as Lex

~1250 Happy files misclassified as Yacc
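To illustrate how files like these might be told apart by content, here is a minimal sketch. It is not Linguist's actual heuristic. The function name `refine` and the patterns are hypothetical, chosen from directives that appear in typical Alex files (`%wrapper`, a `name :-` line opening the rules section) and Happy files (`%name`, `%tokentype`). A real heuristic would need checking against the misclassified files counted above.

```python
import re

# Hypothetical patterns, NOT taken from Linguist's heuristics.yml.
# Alex: a `%wrapper "..."` directive, or a line such as `tokens :-`
# that opens the rules section.
ALEX_HINTS = re.compile(r'^%wrapper[ \t]+"|^[ \t]*\w*[ \t]*:-[ \t]*$', re.MULTILINE)

# Happy: `%name` or `%tokentype` followed by whitespace. The trailing
# whitespace keeps Bison's `%name-prefix` and plain `%token` from matching.
HAPPY_HINTS = re.compile(r'^%(?:name|tokentype)[ \t]', re.MULTILINE)


def refine(detected: str, source: str) -> str:
    """Return a possibly corrected language name for an already-classified file."""
    if detected == "Lex" and ALEX_HINTS.search(source):
        return "Alex"
    if detected == "Yacc" and HAPPY_HINTS.search(source):
        return "Happy"
    return detected


if __name__ == "__main__":
    alex_sample = '%wrapper "basic"\n\ntokens :-\n  $white+ ;\n'
    happy_sample = '%name calc\n%tokentype { Token }\n%%\n'
    print(refine("Lex", alex_sample))
    print(refine("Yacc", happy_sample))
```

The sketch only reclassifies files already detected as Lex or Yacc, so anything without these directives keeps its original classification.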

Additionally, #4952 introduced an Alex file as a sample file for Lex, which is incorrect.

Below are the templates for both Alex and Happy. The reason this is an issue and not a PR is that I'm not aware of any TextMate grammars for either of them. Currently, however, they use Lex's and Yacc's grammars, which results in nonsense highlighting. If they can be added without grammars, I can open a PR for them.


Language name

Alex

URL of example repository

https://github.com/haskell/alex/blob/master/src/Scan.x

Most popular extensions

Detected language

Language name

Happy

URL of example repository

https://github.com/haskell/alex/blob/master/src/Parser.y

Most popular extensions

Detected language

Alhadis commented 2 years ago

> The reason this is an issue and not a PR is that I'm not aware of any TextMate grammars for either of them.

I can write one for you, assuming there's a comprehensive/authoritative reference on the parser formats. Do you know of one?

pnotequalnp commented 2 years ago

> I can write one for you, assuming there's a comprehensive/authoritative reference on the parser formats. Do you know of one?

Unfortunately, documentation for both of them is a bit of a pain point, which is in the process of being addressed. The parsers for both formats are themselves written with Happy, which is mostly just BNF: see Alex's parser and Happy's parser. Those are the only really authoritative sources I'm aware of. The documentation for each also covers the grammar fairly explicitly, but that coverage is scattered through a tutorial on how to use them.