github-linguist / linguist

Language Savant. If your repository's language is being reported incorrectly, send us a pull request!
MIT License

Remove backreferences in regex patterns for `XML Property List` and `JavaScript` #6897

Closed DecimalTurn closed 1 month ago

DecimalTurn commented 3 months ago

There are currently 2 discrepancies between Linguist and Enry that can be explained by the fact that RE2 doesn't support backreferences:

From: https://github.com/go-enry/go-enry#divergences-from-linguist

In both cases, the regex pattern uses a backreference to make sure that the quotation mark that closes the string is the same one that opened it. That matters when performing syntax highlighting, but it doesn't provide much value in terms of distinguishing one language from another.

For that reason, the proposed changes won't make a noticeable difference on Linguist's side, but they will eliminate the backreferences that Enry can't handle.
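
To illustrate the idea, here is a minimal Python sketch. The pattern `name=(["'])demo\1` is a made-up stand-in, not one of the actual Linguist heuristics, and the names `WITH_BACKREF`, `LOOSE`, `STRICT` and `samples` are only for this example. It compares a backreference pattern with two backreference-free rewrites: a looser one that accepts any closing quote, and an alternation that keeps the exact same behaviour.

```python
import re

# Hypothetical pattern for illustration only (not taken from Linguist).
# The backreference \1 forces the closing quote to equal the opening one.
WITH_BACKREF = re.compile(r"""name=(["'])demo\1""")

# Backreference-free rewrite 1: accept either quote on both sides.
# Slightly looser, since it also accepts mismatched quotes.
LOOSE = re.compile(r"""name=["']demo["']""")

# Backreference-free rewrite 2: spell out both quote styles.
# Equivalent to the backreference version for this pattern.
STRICT = re.compile(r'name="demo"' r"|name='demo'")

samples = [
    'name="demo"',   # double quotes on both sides
    "name='demo'",   # single quotes on both sides
    "name=\"demo'",  # mismatched quotes
]

# For each sample: (matches WITH_BACKREF, matches LOOSE, matches STRICT)
results = {
    s: (
        bool(WITH_BACKREF.search(s)),
        bool(LOOSE.search(s)),
        bool(STRICT.search(s)),
    )
    for s in samples
}

for sample, flags in results.items():
    print(sample, flags)
```

Only the mismatched-quote sample is treated differently by the looser rewrite, and input like that is unlikely to matter when the goal is telling languages apart rather than tokenizing them.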