jskinner / DefaultPackages

Old bug tracker for Sublime Text's "Default Packages", e.g. bad syntax highlighting
http://web.archive.org/web/20150524043750/https://www.sublimetext.com/forum/viewtopic.php?f=2&t=12095

PHP UTF-8 Variable Names & Syntax Highlighting #89

Closed Dygear closed 8 years ago

Dygear commented 9 years ago

OS and OS version: Windows 7 & Mac OSX 10.10 (Yosemite)

Sublime Text Version: 3065

Did you try to revert to a freshly installed state? No.

Bug Description: UTF-8 variable names are valid within UTF-8 encoded documents for PHP, but the current syntax highlighter does not pick up on this.

How to reproduce:

<?php

    // ASCII Works
    $d = 0;
    echo "$d {$d} ${d}";

    // UTF-8 Does Not
    $Δ = 0;
    echo "$Δ {$Δ} ${Δ}";

?>

simpletext php utf-8 highlighting

FichteFoll commented 9 years ago

Same for Python function/class definitions btw.

MattDMo commented 9 years ago

This happens because the regexes that create the syntax highlighting scopes are built to recognize only ASCII characters. Changing all of the regexes in each of the .tmLanguage syntax definition files would be a huge undertaking, and I'm not even sure the regex engine supports Unicode character classes, so it may not be possible.
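To illustrate the ASCII-only limitation described above, here is a minimal sketch in Python. Both patterns are simplified assumptions made for illustration; they are not copied from the actual PHP .tmLanguage file, and Python's `re` module stands in for the Oniguruma engine that Sublime Text uses.

```python
import re

# Hypothetical ASCII-only pattern, similar in spirit to what an
# ASCII-centric syntax definition might use for PHP variables.
ASCII_VAR = re.compile(r"\$[A-Za-z_][A-Za-z0-9_]*")

# Unicode-aware alternative: [^\W\d] matches any word character that is
# not a digit (letters and underscore), and \w matches Unicode word
# characters by default for str patterns in Python 3.
UNICODE_VAR = re.compile(r"\$[^\W\d]\w*")

source = "$d = 0; $Δ = 0;"

ascii_hits = ASCII_VAR.findall(source)      # misses $Δ
unicode_hits = UNICODE_VAR.findall(source)  # finds both variables

print(ascii_hits)
print(unicode_hits)
```

The ASCII-only pattern skips `$Δ` entirely, which matches the symptom in the report: the variable never receives a scope, so it is not highlighted.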

FichteFoll commented 9 years ago

Oniguruma supports Unicode afaik (there are character sets for it). SyntaxHighlightTools provides a way to substitute recurring expressions in language definitions. I haven't actually used it yet (too busy with other stuff), but it might be worth giving it a shot. A converter from conventional definitions might be necessary, though, which I am willing to write, given enough time and provided it works as well as I expect it to.
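The idea of substituting recurring expressions can be sketched roughly as follows. This is a hypothetical illustration in Python, not the actual SyntaxHighlightTools format or API: the `{{name}}` placeholder syntax, the `VARIABLES` table, and the `expand` helper are all assumptions. The point is that an identifier pattern defined once can be made Unicode-aware in a single place.

```python
import re

# Hypothetical table of shared sub-expressions. Changing "ident" here
# updates every pattern that references it.
VARIABLES = {
    "ident": r"[^\W\d]\w*",  # Unicode-aware identifier (assumed definition)
}

def expand(pattern, variables):
    """Replace each {{name}} placeholder with its definition."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: variables[m.group(1)], pattern)

# Two illustrative rules sharing the same identifier definition.
PHP_VAR = re.compile(expand(r"\$({{ident}})", VARIABLES))
PY_DEF = re.compile(expand(r"def\s+({{ident}})", VARIABLES))

php_name = PHP_VAR.search("$Δ = 0;").group(1)
py_name = PY_DEF.search("def größe():").group(1)

print(php_name, py_name)
```

With this kind of substitution, fixing the identifier definition once would cover both the PHP variable case and the Python function definition case mentioned earlier in the thread.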

wbond commented 8 years ago

The latest version of the PHP syntax from https://github.com/sublimehq/Packages supports Unicode variable names. It will be part of dev build 3112.