enkimute / ganja.js

Javascript Geometric Algebra Generator for Javascript, c++, c#, rust, python. (with operator overloading and algebraic literals)
MIT License

inline goes into endless loop for unicode characters #98

Closed kungfooman closed 3 years ago

kungfooman commented 3 years ago

Minimal example with the Unicode character π:

txt = `
Algebra(2,0,1,()=>{
    console.log(π)
});`
var tokens = [
        // 0: whitespace/comments
        /^[\s\uFFFF]|^[\u000A\u000D\u2028\u2029]|^\/\/[^\n]*\n|^\/\*[\s\S]*?\*\//g,
        // 1: literal strings
        /^\"\"|^\'\'|^\".*?[^\\]\"|^\'.*?[^\\]\'|^\`[\s\S]*?[^\\]\`/g,
        // 2: literal numbers in scientific notation (with small hack for i and e_ asciimath)
        /^\d+[.]{0,1}\d*[ei][\+\-_]{0,1}\d*|^\.\d+[ei][\+\-_]{0,1}\d*|^e_\d*/g,
        // 3: literal hex, nonsci numbers and regex (surround regex with extra brackets!)
        /^\d+[.]{0,1}\d*[E][+-]{0,1}\d*|^\.\d+[E][+-]{0,1}\d*|^0x\d+|^\d+[.]{0,1}\d*|^\.\d+|^\(\/.*[^\\]\/\)/g,
        // 4: punctuator
        /^(\.Normalized|\.Length|\.\.\.|>>>=|===|!==|>>>|<<=|>>=|=>|\|\||[<>\+\-\*%&|^\/!\=]=|\*\*|\+\+|\-\-|<<|>>|\&\&|\^\^|^[{}()\[\];.,<>\+\-\*%|&^!~?:=\/]{1})/g,
        // 5: identifier
        /^[A-Za-z0-9_]*/g
    ];
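tok = [];
resi = [];
// Never terminates for this input: once txt starts with π, only pattern 5 matches, and it matches the empty string
while (txt.length) {
    for (t in tokens) {
        if (resi = txt.match(tokens[t])) {
            tok.push([t | 0, resi[0]]);
            txt = txt.slice(resi[0].length);
            break;
        } // tokenise
    }
}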
kungfooman commented 3 years ago

The problem is this code:

"πℇ".match(/^[A-Za-z0-9_]*/g)

The result is a single match of length 0, i.e. [""].
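To make the zero-length match concrete, here is a small standalone check. It only uses the identifier pattern quoted above; the variable names are mine. The point is that the returned array is truthy even though the matched text is empty, so the `if` in the tokenizer loop still passes:

```javascript
// Identifier pattern copied from the tokenizer; `*` also allows zero characters.
const identifier = /^[A-Za-z0-9_]*/g;

// Neither π nor ℇ is in [A-Za-z0-9_], so only the empty prefix matches.
const resi = "πℇ".match(identifier);

console.log(resi.length);     // 1  (one match was found)
console.log(resi[0].length);  // 0  (and that match is the empty string)
console.log(Boolean(resi));   // true (a non-null array is truthy)
```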

Slicing 0 characters from the string leaves it unchanged, so the loop never ends and keeps filling the tok array with empty [5, ""] tuples.
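One possible way around this, as a rough sketch only (I don't know what fix ganja.js actually ended up with): accept Unicode letters in identifiers and refuse to continue when no pattern consumes at least one character. The `tokenize` helper and the reduced `patterns` list below are hypothetical and only there for illustration; they are not the real ganja.js token table. It assumes a runtime that supports Unicode property escapes (`\p{L}` with the `u` flag).

```javascript
// Hypothetical helper, not part of ganja.js: tokenize `txt` against an ordered
// list of anchored patterns and never loop on a zero-length match.
function tokenize(txt, patterns) {
  const tok = [];
  while (txt.length) {
    let matched = false;
    for (let t = 0; t < patterns.length; t++) {
      const resi = txt.match(patterns[t]);
      // Only accept matches that actually consume input.
      if (resi && resi[0].length > 0) {
        tok.push([t, resi[0]]);
        txt = txt.slice(resi[0].length);
        matched = true;
        break;
      }
    }
    // Fail loudly instead of spinning forever on an unknown character.
    if (!matched) {
      throw new SyntaxError("Unexpected character: " + JSON.stringify(txt[0]));
    }
  }
  return tok;
}

// Reduced pattern list, for illustration only.
const patterns = [
  /^\s+/,              // 0: whitespace
  /^[().;]/,           // 1: a few punctuators
  /^[\p{L}\p{N}_]+/u,  // 2: identifier with Unicode letters and digits
];

const result = tokenize("console.log(π);", patterns);
console.log(JSON.stringify(result));
```

With the original ASCII-only pattern, the same helper throws a SyntaxError on π instead of hanging, which is at least easier to debug.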