lydell / js-tokens

Tiny JavaScript tokenizer.
MIT License
502 stars, 31 forks

matches token starting with digit (as 'name') #2

Closed (RoboterHund closed this 10 years ago)

RoboterHund commented 10 years ago

The regex part corresponding to 'name' matched both a1 and 1a.

For reference, /^[_$a-zA-Z\xA0-\uFFFF][_$a-zA-Z0-9\xA0-\uFFFF]*$/ (from http://stackoverflow.com/a/2008444) matches only a1.
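A quick check of the reference pattern quoted above, as a minimal sketch. The variable name `identifier` is only illustrative; the regex itself is copied verbatim from the comment.

```javascript
// The anchored identifier pattern quoted above (from the Stack Overflow answer).
const identifier = /^[_$a-zA-Z\xA0-\uFFFF][_$a-zA-Z0-9\xA0-\uFFFF]*$/;

// The first character class excludes digits, so a leading digit is rejected.
console.log(identifier.test('a1')); // true
console.log(identifier.test('1a')); // false
```

Because of the `^` and `$` anchors, this pattern tests a whole string; it is not meant to be scanned across source text the way a tokenizer regex is.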

lydell commented 10 years ago

If you mean that you've extracted only the part of the regex that matches names, then yes, that part matches digits at the beginning of names. But it does not when you use the regex as a whole. Let's resolve #1 first.
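One way this can happen is alternation order: if a number alternative is tried before the name alternative, a leading digit never reaches the name part. The sketch below uses hypothetical, heavily simplified patterns (`nameOnly` and `combined` are made up for illustration and are not the actual js-tokens regex, which may work differently).

```javascript
// Hypothetical, simplified patterns; NOT the actual js-tokens regex.
const nameOnly = /[$_a-zA-Z0-9]+/;           // a "name" part used in isolation
const combined = /(\d+)|([$_a-zA-Z0-9]+)/g;  // number alternative is tried first

// In isolation, the name part happily swallows the leading digit.
console.log('1a'.match(nameOnly)[0]); // "1a"

// As a whole, the number alternative wins at position 0,
// so the name part only ever sees "a".
const parts = [];
for (const m of '1a'.matchAll(combined)) {
  parts.push({ token: m[0], type: m[1] !== undefined ? 'number' : 'name' });
}
console.log(parts);
// [ { token: '1', type: 'number' }, { token: 'a', type: 'name' } ]
```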

RoboterHund commented 10 years ago

That's correct, but I was still surprised that 1a is tokenized as 1 and a (tested this with the code in #1, using jsString = 'a=1s'; and console.log({token: token, type: type});).

In my opinion, the expected output is a single token of type 'illegal'.

lydell commented 10 years ago

This is a tokenizer, not a parser. It doesn’t know what tokens may and may not follow another.
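To illustrate the distinction, here is a rough sketch with a toy tokenizer. `toyTokenize` and `findNumberThenName` are hypothetical helpers written for this example, not part of js-tokens. The tokenizer only splits the input; deciding that a number directly followed by a name is invalid is left to a separate, later pass.

```javascript
// Hypothetical toy tokenizer, not js-tokens itself.
function toyTokenize(source) {
  // Alternatives in order: number, name, whitespace, any other single character.
  const pattern = /(\d+)|([$_a-zA-Z][$_a-zA-Z0-9]*)|(\s+)|([^])/g;
  const types = ['number', 'name', 'whitespace', 'punctuator'];
  const tokens = [];
  for (const m of source.matchAll(pattern)) {
    // The index of the capture group that matched determines the token type.
    const index = m.slice(1).findIndex((group) => group !== undefined);
    tokens.push({ token: m[0], type: types[index] });
  }
  return tokens;
}

// A separate pass (the kind of rule a parser or linter might apply)
// that flags a number token directly followed by a name token.
function findNumberThenName(tokens) {
  const problems = [];
  for (let i = 0; i + 1 < tokens.length; i++) {
    if (tokens[i].type === 'number' && tokens[i + 1].type === 'name') {
      problems.push(tokens[i].token + tokens[i + 1].token);
    }
  }
  return problems;
}

const tokens = toyTokenize('a=1s');
console.log(tokens.map((t) => t.token)); // [ 'a', '=', '1', 's' ]
console.log(findNumberThenName(tokens)); // [ '1s' ]
```

Keeping the adjacency check out of the tokenizer means each token is classified on its own, and any consumer can decide how strict to be about sequences such as `1s`.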