Closed KamilaBorowska closed 12 years ago
I think I fixed it. Can you check your code with the latest master revision?
Confirmed. Broken. This change any broken non-obvious regexpes, like following:
split //, 'abc';
It was working previously. I'm aware that Perl is extremally context-sensitive language, but now it's second extreme. I don't think that parsing this without extreme ammount of work is possible, but what about checking spaces around /
character (usually you would put space before regexp, but not inside it :P).
It's not that extremely complex, it's a matter of carefully defining after which lexems a regexp is allowed. Right now I put only a pretty general set of symbols but as your example showed we need to add some keywords too, as javascript does. For Perl, I presume, those will be at least split
and return
. Can you tell after which other keywords one may expect a regexp?
Anything with non-empty prototype, except that in this case it would be considered as boolean value of match at $_
(for example, print /abc/
is actually print($_ =~ /abc/)
.
But more seriously, I would also add grep
to this list, as grepping by regexp sort of makes sense... But if you would want completeness, things like my @matches = reverse /\w+/g
also make sense in Perl (list of all words in $_
in reverse order)... yeah... I'm aware it can be annoying... Perl is very ambiguous... and if you will include subroutine prototypes...
It's nearly impossible to highlight Perl correctly. The syntax highlighter used by Github (Pygments) is pretty nice, but totally fails at $a / $b / $c
(but it fails at lot more, like s < > { }
(but it's not code you should write), oh wait... your tool also fails at it... but I think it's topic for separate issue, in fact, I've checked code like this in your highlighter just to find this issue).
Also, I have noticed that this change also broken regexes after statement modifiers (all statement modifiers should be allowed) and other literal keywords like or
. This also should be fixed.
Also: Just a small note, I think you forgot typeof
in that list in JavaScript.
Since highlighter is not a full-blown translator, it allows more freedom in handling syntax. The point to which it really makes sense to try is, basically, highlighting that doesn't break in a spectacular way. For example, if you mistake division for a regexp you have the whole rest of the line highlighted wrong, and it's ugly. But failing to highlight the regexp in some rare case is OK. So I'd be happy with just putting all those keywords you mentioned in a regexp context starter. Or you have better ideas (I saw you forked the repository)?
totally fails at $a / $b / $c (but it fails at lot more, like s < > { } (but it's not code you should write), oh wait... your tool also fails at it...
Can you be more specific? It's better to file a new issue on that.
BTW we have a discussion group, there are more people than here :-).
Added the keywords you mentioned. I'm going to close the issue for now, we can fix any remaining cases as they appear in the wild. Thanks!
Following code will see regexp
/ $b ... $c /
, when it's obviously divide-by operator. Only second line is needed, but I needed to end regexp, so it would be visible (otherwise, everything would be just black. This doesn't happen in JavaScript and other languages with similar regexp syntax (like Ruby).