lark-parser / lark

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
MIT License

[standalone.py] NameError: name 'RULE' is not defined #212

Closed evandrocoan closed 6 years ago

evandrocoan commented 6 years ago

I am using:

pip3 show lark-parser
Name: lark-parser
Version: 0.6.3

Python 3.6.4

When I generate the standalone parser and try to run it, it throws:

Traceback (most recent call last):
  File "lexer.py", line 601, in <module>
    1: Rule(NonTerminal('language_syntax'), [NonTerminal(Token(RULE, '__anon_star_0')), NonTerminal('preamble_statements')], None, RuleOptions(False, False, None)),
NameError: name 'RULE' is not defined

The generated parser is this: lexer.zip

Line 601 is this:

File: lexer.py
599: RULES = {
600:   0: Rule(NonTerminal('language_syntax'), [NonTerminal('preamble_statements')], None, RuleOptions(False, False, None)),
601:   1: Rule(NonTerminal('language_syntax'), [NonTerminal(Token(RULE, '__anon_star_0')), NonTerminal('preamble_statements')], None, RuleOptions(False, False, None)),

The problem is the RULE constant in the generated Python code, which the standalone.py tool does not define anywhere.
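To illustrate how an unquoted name like RULE can end up in generated source, here is a minimal sketch. It assumes the generator writes rules out through repr() and that the token class formats its type with %s; the Token class and the eval_error helper below are hypothetical stand-ins, not the code from lark itself.

```python
class Token(str):
    """Hypothetical stand-in for a token class: a str that also carries a type."""

    def __new__(cls, type_, value):
        self = super().__new__(cls, value)
        self.type = type_
        self.value = value
        return self

    def __repr__(self):
        # %s leaves the type unquoted, while %r quotes the value.
        return 'Token(%s, %r)' % (self.type, self.value)


def eval_error(source):
    """Evaluate generated source with only Token in scope; return the NameError text, if any."""
    try:
        eval(source, {'Token': Token})
    except NameError as exc:
        return str(exc)
    return None


# Serializing a token named '__anon_star_0' of type 'RULE' yields source
# text in which RULE is a bare identifier rather than a string literal.
generated = repr(Token('RULE', '__anon_star_0'))
error = eval_error(generated)
print(generated)
print(error)
```

Under these assumptions, the repr is readable for debugging but is not valid source on its own, which would match the shape of line 601 above.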

This is the input grammar used:

language_syntax: _NEWLINE* preamble_statements
preamble_statements: target_language_name_statement | master_scope_name_statement

target_language_name_statement: "name" ":" free_input_string
master_scope_name_statement: "scope" ":" free_input_string

// General terminal constructs
CR: "/r"
LF: "/n"
SPACES: /[\t \f]+/

// General non terminal constructs
free_input_string: /[^\n]/
SINGLE_LINE_COMMENT: /(#|\/\/)[^\n]*/

// _NEWLINE: (CR? LF)+
_NEWLINE: ( /\r?\n[\t ]*/ | SINGLE_LINE_COMMENT )+
INTEGER: "0".."9"

%ignore SPACES
%ignore SINGLE_LINE_COMMENT

This was the command line used to build the standalone parser:

python -m lark.tools.standalone ../simple.lark > lexer.py

erezsh commented 6 years ago

Thanks for the bug report! I pushed a fix to master which should solve this issue.

erezsh commented 6 years ago

Also pushed a new version to pypi. Upgrading to lark-parser==0.6.4 should do the trick.

evandrocoan commented 6 years ago

Thanks! But now I get this error: NameError: name 'UnexpectedToken' is not defined

I added a print where the tokens are used (https://github.com/evandroforks/lark/commit/b9cf98427d0f7ff103340804eabe5be24faf8f8a). You can see the first 2 tokens that were read:

"[@0,0:23='\n\n\n// Example program:\n\n'<_NEWLINE>,1:1]"
"[@1,24:54='name: Abstract Machine Language'<__ANON_2>,6:1]"
Traceback (most recent call last):
  File "lexer.py", line 678, in get_action
    return states[state][key]
KeyError: '__ANON_2'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main_lexer.py", line 38, in <module>
    test()
  File "main_lexer.py", line 33, in test
    tree = parser.parse(file.read())
  File "lexer.py", line 1097, in parse
    return self.parser.parse(tokens)
  File "lexer.py", line 705, in parse
    action, arg = get_action(token.type)
  File "lexer.py", line 681, in get_action
    raise UnexpectedToken(token, expected, state=state)  # TODO filter out rules from expected
NameError: name 'UnexpectedToken' is not defined

When I run the grammar without generating the parser, it runs correctly.
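The traceback above contains two failures: a KeyError from the parse table lookup, and a NameError raised while handling it. A minimal sketch of that pattern follows; the one-state table and the probe helper are hypothetical, and UnexpectedToken is deliberately left undefined to mimic a generated module that omits the exception class.

```python
# Hypothetical one-state parse table: only _NEWLINE is acceptable in state 0.
states = {0: {'_NEWLINE': ('shift', 1)}}


def get_action(state, key):
    try:
        return states[state][key]
    except KeyError:
        expected = sorted(states[state])
        # UnexpectedToken is not defined anywhere in this module, so this
        # line raises NameError instead of the intended parse error.
        raise UnexpectedToken(key, expected)  # noqa: F821


def probe(key):
    """Return (raised exception, exception that was being handled at the time)."""
    try:
        get_action(0, key)
    except Exception as exc:
        return exc, exc.__context__
    return None, None


raised, context = probe('__ANON_2')
print(type(raised).__name__, '<-', type(context).__name__)
```

In this sketch the NameError hides the real problem: the lookup failed because the token type was not expected in that state, which is the KeyError kept in `__context__`.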

This is the generated parser, input grammar and the test programs: ...

If you run python3 main.py, it imports lark and runs the example program without problems. But if you run python3 main_lexer.py, it uses the generated parser and you will see the error.

Below is the output of running python3 main.py, which runs correctly without any errors. The extra prints at the beginning are just the token information (https://github.com/evandroforks/lark/commit/b9cf98427d0f7ff103340804eabe5be24faf8f8a). After all the tokens are output, you can see the full tree, generated correctly.

``` $ python3 main.py "[@0,0:1='\n\n'<_NL>,1:1]" "[@1,2:16='language_syntax',3:1]" "[@2,17:17=':'<_COLON>,3:16]" "[@3,19:26='_NEWLINE',3:18]" "[@4,27:27='?',3:26]" "[@5,29:47='preamble_statements',3:28]" "[@6,49:56='_NEWLINE',3:48]" "[@7,57:57='?',3:56]" "[@8,59:82='language_construct_rules',3:58]" "[@9,84:91='_NEWLINE',3:83]" "[@10,92:92='?',3:91]" "[@11,94:94='('<_LPAR>,3:93]" "[@12,96:123='miscellaneous_language_rules',3:95]" "[@13,125:132='_NEWLINE',3:124]" "[@14,133:133='?',3:132]" "[@15,135:135=')'<_RPAR>,3:134]" "[@16,136:136='*',3:135]" "[@17,138:145='_NEWLINE',3:137]" "[@18,146:146='?',3:145]" "[@19,147:148='\n\n'<_NL>,3:146]" "[@20,149:167='preamble_statements',5:1]" "[@21,168:168=':'<_COLON>,5:20]" "[@22,170:170='('<_LPAR>,5:22]" "[@23,172:172='('<_LPAR>,5:24]" "[@24,174:203='target_language_name_statement',5:26]" "[@25,205:205='|'<_OR>,5:57]" "[@26,207:233='master_scope_name_statement',5:59]" "[@27,235:235=')'<_RPAR>,5:87]" "[@28,237:244='_NEWLINE',5:89]" "[@29,246:246=')'<_RPAR>,5:98]" "[@30,247:247='+',5:99]" "[@31,248:248='\n'<_NL>,5:100]" "[@32,249:272='language_construct_rules',6:1]" "[@33,273:273=':'<_COLON>,6:25]" '[@34,275:284=\'"contexts"\',6:27]' '[@35,286:288=\'":"\',6:38]' "[@36,290:306='indentation_block',6:42]" "[@37,307:307='\n'<_NL>,6:59]" "[@38,308:335='miscellaneous_language_rules',7:1]" "[@39,336:336=':'<_COLON>,7:29]" "[@40,338:346='/[^:\\n]+/',7:31]" '[@41,348:350=\'":"\',7:41]' "[@42,352:368='indentation_block',7:45]" "[@43,369:370='\n\n'<_NL>,7:62]" "[@44,371:400='target_language_name_statement',9:1]" "[@45,401:401=':'<_COLON>,9:31]" '[@46,403:408=\'"name"\',9:33]' '[@47,410:412=\'":"\',9:40]' "[@48,414:430='free_input_string',9:44]" "[@49,431:431='\n'<_NL>,9:61]" "[@50,432:458='master_scope_name_statement',10:1]" "[@51,459:459=':'<_COLON>,10:28]" '[@52,461:467=\'"scope"\',10:30]' '[@53,469:471=\'":"\',10:38]' "[@54,473:489='free_input_string',10:42]" "[@55,490:491='\n\n'<_NL>,10:59]" "[@56,583:583='\n'<_NL>,12:92]" 
"[@57,584:600='indentation_block',13:1]" "[@58,601:601=':'<_COLON>,13:18]" '[@59,603:605=\'"{"\',13:20]' "[@60,607:614='_NEWLINE',13:24]" "[@61,616:616='('<_LPAR>,13:33]" "[@62,618:632='statements_list',13:35]" "[@63,634:641='_NEWLINE',13:51]" "[@64,643:643=')'<_RPAR>,13:60]" "[@65,644:644='+',13:61]" '[@66,646:648=\'"}"\',13:63]' "[@67,649:649='\n'<_NL>,13:66]" "[@68,650:664='statements_list',14:1]" "[@69,665:665=':'<_COLON>,14:16]" "[@70,667:681='match_statement',14:18]" "[@71,683:683='|'<_OR>,14:34]" "[@72,685:701='include_statement',14:36]" "[@73,703:703='|'<_OR>,14:54]" "[@74,705:718='push_statement',14:56]" "[@75,720:720='|'<_OR>,14:71]" "[@76,722:734='pop_statement',14:73]" "[@77,736:736='|'<_OR>,14:87]" "[@78,738:757='meta_scope_statement',14:89]" "[@79,758:759='\n\n'<_NL>,14:109]" "[@80,760:773='push_statement',16:1]" "[@81,774:774=':'<_COLON>,16:15]" '[@82,777:782=\'"push"\',16:18]' '[@83,784:786=\'":"\',16:25]' "[@84,788:804='indentation_block',16:29]" "[@85,805:805='\n'<_NL>,16:46]" "[@86,806:822='include_statement',17:1]" "[@87,823:823=':'<_COLON>,17:18]" '[@88,825:833=\'"include"\',17:20]' '[@89,835:837=\'":"\',17:30]' "[@90,839:855='free_input_string',17:34]" "[@91,856:857='\n\n'<_NL>,17:51]" "[@92,858:873='match_statements',19:1]" "[@93,874:874=':'<_COLON>,19:17]" "[@94,876:885='scope_name',19:19]" "[@95,887:887='|'<_OR>,19:30]" "[@96,889:903='capturing_block',19:32]" "[@97,905:905='|'<_OR>,19:48]" "[@98,907:921='statements_list',19:50]" "[@99,922:922='\n'<_NL>,19:65]" "[@100,923:937='match_statement',20:1]" "[@101,938:938=':'<_COLON>,20:16]" '[@102,940:946=\'"match"\',20:18]' '[@103,948:950=\'":"\',20:26]' "[@104,952:968='free_input_string',20:30]" '[@105,970:972=\'"{"\',20:48]' "[@106,974:974='('<_LPAR>,20:52]" "[@107,976:983='_NEWLINE',20:54]" "[@108,985:1000='match_statements',20:63]" "[@109,1002:1002=')'<_RPAR>,20:80]" "[@110,1003:1003='*',20:81]" "[@111,1005:1012='_NEWLINE',20:83]" '[@112,1014:1016=\'"}"\',20:92]' 
"[@113,1017:1017='\n'<_NL>,20:95]" "[@114,1018:1027='scope_name',21:1]" "[@115,1028:1028=':'<_COLON>,21:11]" '[@116,1030:1036=\'"scope"\',21:13]' '[@117,1038:1040=\'":"\',21:21]' "[@118,1042:1058='free_input_string',21:25]" "[@119,1059:1060='\n\n'<_NL>,21:42]" "[@120,1061:1075='capturing_block',23:1]" "[@121,1076:1076=':'<_COLON>,23:16]" '[@122,1078:1087=\'"captures"\',23:18]' '[@123,1089:1091=\'":"\',23:29]' '[@124,1093:1095=\'"{"\',23:33]' "[@125,1097:1097='('<_LPAR>,23:37]" "[@126,1099:1106='_NEWLINE',23:39]" "[@127,1108:1122='capturing_lines',23:48]" "[@128,1124:1124=')'<_RPAR>,23:64]" "[@129,1125:1125='+',23:65]" "[@130,1127:1134='_NEWLINE',23:67]" '[@131,1136:1138=\'"}"\',23:76]' "[@132,1139:1139='\n'<_NL>,23:79]" "[@133,1140:1154='capturing_lines',24:1]" "[@134,1155:1155=':'<_COLON>,24:16]" "[@135,1157:1163='INTEGER',24:18]" "[@136,1164:1164='+',24:25]" '[@137,1166:1168=\'":"\',24:27]' "[@138,1170:1186='free_input_string',24:31]" "[@139,1187:1188='\n\n'<_NL>,24:48]" "[@140,1189:1201='pop_statement',26:1]" "[@141,1202:1202=':'<_COLON>,26:14]" '[@142,1204:1208=\'"pop"\',26:16]' '[@143,1210:1212=\'":"\',26:22]' "[@144,1214:1230='free_input_string',26:26]" "[@145,1231:1231='\n'<_NL>,26:43]" "[@146,1232:1251='meta_scope_statement',27:1]" "[@147,1252:1252=':'<_COLON>,27:21]" '[@148,1254:1265=\'"meta_scope"\',27:23]' '[@149,1267:1269=\'":"\',27:36]' "[@150,1271:1287='free_input_string',27:40]" "[@151,1288:1289='\n\n'<_NL>,27:57]" "[@152,1320:1320='\n'<_NL>,29:31]" "[@153,1321:1322='CR',30:1]" "[@154,1323:1323=':'<_COLON>,30:3]" '[@155,1325:1328=\'"/r"\',30:5]' "[@156,1329:1329='\n'<_NL>,30:9]" "[@157,1330:1331='LF',31:1]" "[@158,1332:1332=':'<_COLON>,31:3]" '[@159,1334:1337=\'"/n"\',31:5]' "[@160,1338:1338='\n'<_NL>,31:9]" "[@161,1339:1344='SPACES',32:1]" "[@162,1345:1345=':'<_COLON>,32:7]" "[@163,1347:1356='/[\\t \\f]+/',32:9]" "[@164,1357:1358='\n\n'<_NL>,32:19]" "[@165,1393:1393='\n'<_NL>,34:35]" "[@166,1394:1410='free_input_string',35:1]" 
"[@167,1411:1411=':'<_COLON>,35:18]" "[@168,1413:1432='/(\\\\{|\\\\}|[^\\n{}])+/',35:20]" "[@169,1433:1434='\n\n'<_NL>,35:40]" "[@170,1435:1453='SINGLE_LINE_COMMENT',37:1]" "[@171,1454:1454=':'<_COLON>,37:20]" "[@172,1456:1472='/(\\#|\\/\\/)[^\\n]*/',37:22]" "[@173,1473:1473='\n'<_NL>,37:39]" "[@174,1474:1491='MULTI_LINE_COMMENT',38:1]" "[@175,1492:1492=':'<_COLON>,38:19]" "[@176,1494:1513='/\\/\\*([\\s\\S]*?)\\*\\//',38:21]" "[@177,1514:1515='\n\n'<_NL>,38:41]" "[@178,1516:1522='NEWLINE',40:1]" "[@179,1523:1523=':'<_COLON>,40:8]" "[@180,1525:1525='('<_LPAR>,40:10]" "[@181,1526:1527='CR',40:11]" "[@182,1528:1528='?',40:13]" "[@183,1530:1531='LF',40:15]" "[@184,1532:1532=')'<_RPAR>,40:17]" "[@185,1533:1533='+',40:18]" "[@186,1534:1534='\n'<_NL>,40:19]" "[@187,1535:1542='_NEWLINE',41:1]" "[@188,1543:1543=':'<_COLON>,41:9]" "[@189,1545:1545='('<_LPAR>,41:11]" "[@190,1547:1559='/\\r?\\n[\\t ]*/',41:13]" "[@191,1561:1561='|'<_OR>,41:27]" "[@192,1563:1581='SINGLE_LINE_COMMENT',41:29]" "[@193,1583:1583=')'<_RPAR>,41:49]" "[@194,1584:1584='+',41:50]" "[@195,1585:1585='\n'<_NL>,41:51]" "[@196,1586:1592='INTEGER',42:1]" "[@197,1593:1593=':'<_COLON>,42:8]" '[@198,1595:1597=\'"0"\',42:10]' "[@199,1598:1598='.'<_DOT>,42:13]" "[@200,1599:1599='.'<_DOT>,42:14]" '[@201,1600:1602=\'"9"\',42:15]' "[@202,1603:1604='\n\n'<_NL>,42:18]" "[@203,1605:1611='%ignore'<_IGNORE>,44:1]" "[@204,1613:1618='SPACES',44:9]" "[@205,1619:1619='\n'<_NL>,44:15]" "[@206,1620:1626='%ignore'<_IGNORE>,45:1]" "[@207,1628:1645='MULTI_LINE_COMMENT',45:9]" "[@208,1646:1646='\n'<_NL>,45:27]" "[@209,1647:1653='%ignore'<_IGNORE>,46:1]" "[@210,1655:1673='SINGLE_LINE_COMMENT',46:9]" "[@211,1674:1675='\n\n'<_NL>,46:28]" "[@212,1676:1683='%declare'<_DECLARE>,48:1]" "[@213,1685:1691='_INDENT',48:10]" "[@214,1693:1699='_DEDENT',48:18]" "[@215,1700:1700='\n'<_NL>,48:25]" "[@216,1717:1717='\n'<_NL>,49:17]" "[@217,1734:1735='\n\n'<_NL>,50:17]" "[@0,0:23='\n\n\n// Example program:\n\n'<_NEWLINE>,1:1]" 
"[@1,24:27='name',6:1]" "[@2,28:28=':',6:5]" "[@3,29:54=' Abstract Machine Language'<__ANON_2>,6:6]" "[@4,55:55='\n'<_NEWLINE>,6:32]" "[@5,56:60='scope',7:1]" "[@6,61:61=':',7:6]" "[@7,62:72=' source.sma'<__ANON_2>,7:7]" "[@8,73:73='\n'<_NEWLINE>,7:18]" "[@9,74:81='contexts',8:1]" "[@10,82:82=':',8:9]" "[@11,84:84='{',8:11]" "[@12,85:87='\n '<_NEWLINE>,8:12]" "[@13,88:94='include',9:3]" "[@14,95:95=':',9:10]" "[@15,96:103=' numbers'<__ANON_2>,9:11]" "[@16,104:106='\n '<_NEWLINE>,9:19]" "[@17,107:113='include',10:3]" "[@18,114:114=':',10:10]" "[@19,115:123=' keywords'<__ANON_2>,10:11]" "[@20,124:126='\n '<_NEWLINE>,10:20]" "[@21,127:133='include',11:3]" "[@22,134:134=':',11:10]" "[@23,135:143=' function'<__ANON_2>,11:11]" "[@24,144:146='\n '<_NEWLINE>,11:20]" "[@25,147:153='include',12:3]" "[@26,154:154=':',12:10]" "[@27,155:169=' check_brackets'<__ANON_2>,12:11]" "[@28,170:173='\n\n '<_NEWLINE>,12:26]" "[@29,174:178='match',14:3]" "[@30,179:179=':',14:8]" "[@31,180:193=' (true|false) '<__ANON_2>,14:9]" "[@32,194:194='{',14:23]" "[@33,195:199='\n '<_NEWLINE>,14:24]" "[@34,200:204='scope',15:5]" "[@35,205:205=':',15:10]" "[@36,206:223=' constant.language'<__ANON_2>,15:11]" "[@37,224:226='\n '<_NEWLINE>,15:29]" "[@38,227:227='}',16:3]" "[@39,228:230='\n '<_NEWLINE>,16:4]" "[@40,231:237='include',17:3]" "[@41,238:238=':',17:10]" "[@42,239:268=' Packages/Editor/Consts.syntax'<__ANON_2>,17:11]" "[@43,269:269='\n'<_NEWLINE>,17:41]" "[@44,270:270='}',18:1]" "[@45,271:272='\n\n'<_NEWLINE>,18:2]" "[@46,273:280='function'<__ANON_0>,20:1]" "[@47,281:281=':',20:9]" "[@48,283:283='{',20:11]" "[@49,284:286='\n '<_NEWLINE>,20:12]" "[@50,287:293='include',21:3]" "[@51,294:294=':',21:10]" "[@52,295:314=' function_definition'<__ANON_2>,21:11]" "[@53,315:317='\n '<_NEWLINE>,21:31]" "[@54,318:324='include',22:3]" "[@55,325:325=':',22:10]" "[@56,326:339=' function_call'<__ANON_2>,22:11]" "[@57,340:340='\n'<_NEWLINE>,22:25]" "[@58,341:341='}',23:1]" "[@59,342:343='\n\n'<_NEWLINE>,23:2]" 
"[@60,344:362='function_definition'<__ANON_0>,25:1]" "[@61,363:363=':',25:20]" "[@62,365:365='{',25:22]" "[@63,366:368='\n '<_NEWLINE>,25:23]" "[@64,369:373='match',26:3]" "[@65,374:374=':',26:8]" "[@66,375:460=' ^[\\s;]*(public|stock|native|forward)\\s+([A-Za-z_]\\w*:\\s*)?([A-Za-z_][\\w_]*)[\\s]*(\\() '<__ANON_2>,26:9]" "[@67,461:461='{',26:95]" "[@68,462:466='\n '<_NEWLINE>,26:96]" "[@69,467:474='captures',27:5]" "[@70,475:475=':',27:13]" "[@71,477:477='{',27:15]" "[@72,478:484='\n '<_NEWLINE>,27:16]" "[@73,485:485='1',28:7]" "[@74,486:486=':',28:8]" "[@75,487:513=' storage.type.function.pawn'<__ANON_2>,28:9]" "[@76,514:520='\n '<_NEWLINE>,28:36]" "[@77,521:521='2',29:7]" "[@78,522:522=':',29:8]" "[@79,523:548=' storage.modifier.tag.pawn'<__ANON_2>,29:9]" "[@80,549:555='\n '<_NEWLINE>,29:35]" "[@81,556:556='3',30:7]" "[@82,557:557=':',30:8]" "[@83,558:590=' support.function.definition.pawn'<__ANON_2>,30:9]" "[@84,591:597='\n '<_NEWLINE>,30:42]" "[@85,598:598='4',31:7]" "[@86,599:599=':',31:8]" "[@87,600:620=' function.parens.pawn'<__ANON_2>,31:9]" "[@88,621:625='\n '<_NEWLINE>,31:30]" "[@89,626:626='}',32:5]" "[@90,627:631='\n '<_NEWLINE>,32:6]" "[@91,632:635='push',33:5]" "[@92,636:636=':',33:9]" "[@93,638:638='{',33:11]" "[@94,639:645='\n '<_NEWLINE>,33:12]" "[@95,646:650='match',34:7]" "[@96,651:651=':',34:12]" "[@97,652:655=' \\) '<__ANON_2>,34:13]" "[@98,656:656='{',34:17]" "[@99,657:665='\n '<_NEWLINE>,34:18]" "[@100,666:670='scope',35:9]" "[@101,671:671=':',35:14]" "[@102,672:692=' function.parens.pawn'<__ANON_2>,35:15]" "[@103,693:701='\n '<_NEWLINE>,35:36]" "[@104,702:704='pop',36:9]" "[@105,705:705=':',36:12]" "[@106,706:710=' true'<__ANON_2>,36:13]" "[@107,711:717='\n '<_NEWLINE>,36:18]" "[@108,718:718='}',37:7]" "[@109,719:725='\n '<_NEWLINE>,37:8]" "[@110,726:732='include',38:7]" "[@111,733:733=':',38:14]" "[@112,734:738=' main'<__ANON_2>,38:15]" "[@113,739:745='\n '<_NEWLINE>,38:20]" "[@114,746:752='include',39:7]" "[@115,753:753=':',39:14]" 
"[@116,754:768=' function_block'<__ANON_2>,39:15]" "[@117,769:773='\n '<_NEWLINE>,39:30]" "[@118,774:774='}',40:5]" "[@119,775:777='\n '<_NEWLINE>,40:6]" "[@120,778:778='}',41:3]" "[@121,779:781='\n '<_NEWLINE>,41:4]" "[@122,782:786='match',42:3]" "[@123,787:787=':',42:8]" "[@124,788:839=' '^[ ;]*([A-Za-z_]\\w*:)?([A-Za-z_][\\w_]*)[\\s]*(\\()' '<__ANON_2>,42:9]" "[@125,840:840='{',42:61]" "[@126,841:845='\n '<_NEWLINE>,42:62]" "[@127,846:853='captures',43:5]" "[@128,854:854=':',43:13]" "[@129,856:856='{',43:15]" "[@130,857:863='\n '<_NEWLINE>,43:16]" "[@131,864:864='1',44:7]" "[@132,865:865=':',44:8]" "[@133,866:891=' storage.modifier.tag.pawn'<__ANON_2>,44:9]" "[@134,892:898='\n '<_NEWLINE>,44:35]" "[@135,899:899='2',45:7]" "[@136,900:900=':',45:8]" "[@137,901:933=' support.function.definition.pawn'<__ANON_2>,45:9]" "[@138,934:940='\n '<_NEWLINE>,45:42]" "[@139,941:941='3',46:7]" "[@140,942:942=':',46:8]" "[@141,943:963=' function.parens.pawn'<__ANON_2>,46:9]" "[@142,964:968='\n '<_NEWLINE>,46:30]" "[@143,969:969='}',47:5]" "[@144,970:974='\n '<_NEWLINE>,47:6]" "[@145,975:978='push',48:5]" "[@146,979:979=':',48:9]" "[@147,981:981='{',48:11]" "[@148,982:988='\n '<_NEWLINE>,48:12]" "[@149,989:993='match',49:7]" "[@150,994:994=':',49:12]" "[@151,995:998=' \\) '<__ANON_2>,49:13]" "[@152,999:999='{',49:17]" "[@153,1000:1008='\n '<_NEWLINE>,49:18]" "[@154,1009:1013='scope',50:9]" "[@155,1014:1014=':',50:14]" "[@156,1015:1035=' function.parens.pawn'<__ANON_2>,50:15]" "[@157,1036:1044='\n '<_NEWLINE>,50:36]" "[@158,1045:1047='pop',51:9]" "[@159,1048:1048=':',51:12]" "[@160,1049:1053=' true'<__ANON_2>,51:13]" "[@161,1054:1060='\n '<_NEWLINE>,51:18]" "[@162,1061:1061='}',52:7]" "[@163,1062:1068='\n '<_NEWLINE>,52:8]" "[@164,1069:1075='include',53:7]" "[@165,1076:1076=':',53:14]" "[@166,1077:1081=' main'<__ANON_2>,53:15]" "[@167,1082:1088='\n '<_NEWLINE>,53:20]" "[@168,1089:1095='include',54:7]" "[@169,1096:1096=':',54:14]" "[@170,1097:1111=' 
function_block'<__ANON_2>,54:15]" "[@171,1112:1116='\n '<_NEWLINE>,54:30]" "[@172,1117:1117='}',55:5]" "[@173,1118:1120='\n '<_NEWLINE>,55:6]" "[@174,1121:1121='}',56:3]" "[@175,1122:1122='\n'<_NEWLINE>,56:4]" "[@176,1123:1123='}',57:1]" "[@177,1124:1125='\n\n'<_NEWLINE>,57:2]" "[@178,1126:1139='function_block'<__ANON_0>,59:1]" "[@179,1140:1140=':',59:15]" "[@180,1142:1142='{',59:17]" "[@181,1143:1145='\n '<_NEWLINE>,59:18]" "[@182,1146:1150='match',60:3]" "[@183,1151:1151=':',60:8]" "[@184,1152:1157=' '\\{' '<__ANON_2>,60:9]" "[@185,1158:1158='{',60:15]" "[@186,1159:1163='\n '<_NEWLINE>,60:16]" "[@187,1164:1168='scope',61:5]" "[@188,1169:1169=':',61:10]" "[@189,1170:1204=' punctuation.definition.group.start'<__ANON_2>,61:11]" "[@190,1205:1209='\n '<_NEWLINE>,61:46]" "[@191,1210:1213='push',62:5]" "[@192,1214:1214=':',62:9]" "[@193,1216:1216='{',62:11]" "[@194,1217:1223='\n '<_NEWLINE>,62:12]" "[@195,1224:1233='meta_scope'<__ANON_1>,63:7]" "[@196,1234:1234=':',63:17]" "[@197,1235:1250=' meta.block.pawn'<__ANON_2>,63:18]" "[@198,1251:1257='\n '<_NEWLINE>,63:34]" "[@199,1258:1262='match',64:7]" "[@200,1263:1263=':',64:12]" "[@201,1264:1269=' '\\}' '<__ANON_2>,64:13]" "[@202,1270:1270='{',64:19]" "[@203,1271:1279='\n '<_NEWLINE>,64:20]" "[@204,1280:1284='scope',65:9]" "[@205,1285:1285=':',65:14]" "[@206,1286:1318=' punctuation.definition.group.end'<__ANON_2>,65:15]" "[@207,1319:1327='\n '<_NEWLINE>,65:48]" "[@208,1328:1330='pop',66:9]" "[@209,1331:1331=':',66:12]" "[@210,1332:1336=' true'<__ANON_2>,66:13]" "[@211,1337:1343='\n '<_NEWLINE>,66:18]" "[@212,1344:1344='}',67:7]" "[@213,1345:1351='\n '<_NEWLINE>,67:8]" "[@214,1352:1358='include',68:7]" "[@215,1359:1359=':',68:14]" "[@216,1360:1364=' main'<__ANON_2>,68:15]" "[@217,1365:1371='\n '<_NEWLINE>,68:20]" "[@218,1372:1378='include',69:7]" "[@219,1379:1379=':',69:14]" "[@220,1380:1393=' function_call'<__ANON_2>,69:15]" "[@221,1394:1398='\n '<_NEWLINE>,69:29]" "[@222,1399:1399='}',70:5]" "[@223,1400:1404='\n 
'<_NEWLINE>,70:6]" "[@224,1405:1411='include',71:5]" "[@225,1412:1412=':',71:12]" "[@226,1413:1419=' main_2'<__ANON_2>,71:13]" "[@227,1420:1422='\n '<_NEWLINE>,71:20]" "[@228,1423:1423='}',72:3]" "[@229,1424:1426='\n '<_NEWLINE>,72:4]" "[@230,1427:1433='include',73:3]" "[@231,1434:1434=':',73:10]" "[@232,1435:1439=' main'<__ANON_2>,73:11]" "[@233,1440:1440='\n'<_NEWLINE>,73:16]" "[@234,1441:1441='}',74:1]" "[@235,1442:1443='\n\n'<_NEWLINE>,74:2]" "[@236,1444:1456='function_call'<__ANON_0>,76:1]" "[@237,1457:1457=':',76:14]" "[@238,1459:1459='{',76:16]" "[@239,1460:1462='\n '<_NEWLINE>,76:17]" "[@240,1463:1467='match',77:3]" "[@241,1468:1468=':',77:8]" "[@242,1469:1503=' \\s*([A-Za-z_][\\w_]*)[\\s|\\\\n]*(\\() '<__ANON_2>,77:9]" "[@243,1504:1504='{',77:44]" "[@244,1505:1509='\n '<_NEWLINE>,77:45]" "[@245,1510:1517='captures',78:5]" "[@246,1518:1518=':',78:13]" "[@247,1520:1520='{',78:15]" "[@248,1521:1527='\n '<_NEWLINE>,78:16]" "[@249,1528:1528='1',79:7]" "[@250,1529:1529=':',79:8]" "[@251,1530:1556=' support.function.call.pawn'<__ANON_2>,79:9]" "[@252,1557:1563='\n '<_NEWLINE>,79:36]" "[@253,1564:1564='2',80:7]" "[@254,1565:1565=':',80:8]" "[@255,1566:1586=' function.parens.pawn'<__ANON_2>,80:9]" "[@256,1587:1591='\n '<_NEWLINE>,80:30]" "[@257,1592:1592='}',81:5]" "[@258,1593:1597='\n '<_NEWLINE>,81:6]" "[@259,1598:1601='push',82:5]" "[@260,1602:1602=':',82:9]" "[@261,1604:1604='{',82:11]" "[@262,1605:1611='\n '<_NEWLINE>,82:12]" "[@263,1612:1616='match',83:7]" "[@264,1617:1617=':',83:12]" "[@265,1618:1621=' \\) '<__ANON_2>,83:13]" "[@266,1622:1622='{',83:17]" "[@267,1623:1631='\n '<_NEWLINE>,83:18]" "[@268,1632:1636='scope',84:9]" "[@269,1637:1637=':',84:14]" "[@270,1638:1658=' function.parens.pawn'<__ANON_2>,84:15]" "[@271,1659:1667='\n '<_NEWLINE>,84:36]" "[@272,1668:1670='pop',85:9]" "[@273,1671:1671=':',85:12]" "[@274,1672:1676=' true'<__ANON_2>,85:13]" "[@275,1677:1683='\n '<_NEWLINE>,85:18]" "[@276,1684:1684='}',86:7]" "[@277,1685:1691='\n 
'<_NEWLINE>,86:8]" "[@278,1692:1698='include',87:7]" "[@279,1699:1699=':',87:14]" "[@280,1700:1704=' main'<__ANON_2>,87:15]" "[@281,1705:1709='\n '<_NEWLINE>,87:20]" "[@282,1710:1710='}',88:5]" "[@283,1711:1713='\n '<_NEWLINE>,88:6]" "[@284,1714:1714='}',89:3]" "[@285,1715:1715='\n'<_NEWLINE>,89:4]" "[@286,1716:1716='}',90:1]" "[@287,1717:1718='\n\n'<_NEWLINE>,90:2]" "[@288,1719:1725='numbers'<__ANON_0>,92:1]" "[@289,1726:1726=':',92:8]" "[@290,1728:1728='{',92:10]" "[@291,1729:1731='\n '<_NEWLINE>,92:11]" "[@292,1732:1736='match',93:3]" "[@293,1737:1737=':',93:8]" "[@294,1738:1760=' '(\\d+)(\\.\\{2\\})(\\d+)' '<__ANON_2>,93:9]" "[@295,1761:1761='{',93:32]" "[@296,1762:1766='\n '<_NEWLINE>,93:33]" "[@297,1767:1774='captures',94:5]" "[@298,1775:1775=':',94:13]" "[@299,1777:1777='{',94:15]" "[@300,1778:1784='\n '<_NEWLINE>,94:16]" "[@301,1785:1785='1',95:7]" "[@302,1786:1786=':',95:8]" "[@303,1787:1812=' constant.numeric.int.pawn'<__ANON_2>,95:9]" "[@304,1813:1819='\n '<_NEWLINE>,95:35]" "[@305,1820:1820='2',96:7]" "[@306,1821:1821=':',96:8]" "[@307,1822:1856=' keyword.operator.switch-range.pawn'<__ANON_2>,96:9]" "[@308,1857:1863='\n '<_NEWLINE>,96:44]" "[@309,1864:1864='3',97:7]" "[@310,1865:1865=':',97:8]" "[@311,1866:1891=' constant.numeric.int.pawn'<__ANON_2>,97:9]" "[@312,1892:1896='\n '<_NEWLINE>,97:35]" "[@313,1897:1897='}',98:5]" "[@314,1898:1900='\n '<_NEWLINE>,98:6]" "[@315,1901:1901='}',99:3]" "[@316,1902:1905='\n\n '<_NEWLINE>,99:4]" "[@317,1906:1910='match',101:3]" "[@318,1911:1911=':',101:8]" "[@319,1912:1929=' ([-]?0x[\\da-f]+) '<__ANON_2>,101:9]" "[@320,1930:1930='{',101:27]" "[@321,1931:1935='\n '<_NEWLINE>,101:28]" "[@322,1936:1940='scope',102:5]" "[@323,1941:1941=':',102:10]" "[@324,1942:1967=' constant.numeric.hex.pawn'<__ANON_2>,102:11]" "[@325,1968:1970='\n '<_NEWLINE>,102:37]" "[@326,1971:1971='}',103:3]" "[@327,1972:1974='\n '<_NEWLINE>,103:4]" "[@328,1975:1979='match',104:3]" "[@329,1980:1980=':',104:8]" "[@330,1981:1996=' \\b(\\d+\\.\\d+)\\b 
'<__ANON_2>,104:9]" "[@331,1997:1997='{',104:25]" "[@332,1998:2002='\n '<_NEWLINE>,104:26]" "[@333,2003:2007='scope',105:5]" "[@334,2008:2008=':',105:10]" "[@335,2009:2036=' constant.numeric.float.pawn'<__ANON_2>,105:11]" "[@336,2037:2039='\n '<_NEWLINE>,105:39]" "[@337,2040:2040='}',106:3]" "[@338,2041:2043='\n '<_NEWLINE>,106:4]" "[@339,2044:2048='match',107:3]" "[@340,2049:2049=':',107:8]" "[@341,2050:2060=' \\b(\\d+)\\b '<__ANON_2>,107:9]" "[@342,2061:2061='{',107:20]" "[@343,2062:2066='\n '<_NEWLINE>,107:21]" "[@344,2067:2071='scope',108:5]" "[@345,2072:2072=':',108:10]" "[@346,2073:2098=' constant.numeric.int.pawn'<__ANON_2>,108:11]" "[@347,2099:2101='\n '<_NEWLINE>,108:37]" "[@348,2102:2102='}',109:3]" "[@349,2103:2103='\n'<_NEWLINE>,109:4]" "[@350,2104:2104='}',110:1]" "[@351,2105:2106='\n\n'<_NEWLINE>,110:2]" "[@352,2107:2114='keywords'<__ANON_0>,112:1]" "[@353,2115:2115=':',112:9]" "[@354,2117:2117='{',112:11]" "[@355,2118:2120='\n '<_NEWLINE>,112:12]" "[@356,2121:2125='match',113:3]" "[@357,2126:2126=':',113:8]" "[@358,2127:2152=' \\s*(case\\b([^:\\n]*):)\\s+ '<__ANON_2>,113:9]" "[@359,2153:2153='{',113:35]" "[@360,2154:2158='\n '<_NEWLINE>,113:36]" "[@361,2159:2166='captures',114:5]" "[@362,2167:2167=':',114:13]" "[@363,2169:2169='{',114:15]" "[@364,2170:2176='\n '<_NEWLINE>,114:16]" "[@365,2177:2177='1',115:7]" "[@366,2178:2178=':',115:8]" "[@367,2179:2199=' keyword.control.pawn'<__ANON_2>,115:9]" "[@368,2200:2206='\n '<_NEWLINE>,115:30]" "[@369,2207:2207='2',116:7]" "[@370,2208:2208=':',116:8]" "[@371,2209:2231=' storage.type.vars.pawn'<__ANON_2>,116:9]" "[@372,2232:2236='\n '<_NEWLINE>,116:32]" "[@373,2237:2237='}',117:5]" "[@374,2238:2240='\n '<_NEWLINE>,117:6]" "[@375,2241:2241='}',118:3]" "[@376,2242:2244='\n '<_NEWLINE>,118:4]" "[@377,2245:2249='match',119:3]" "[@378,2250:2250=':',119:8]" "[@379,2251:2269=' (~|&|\\||\\^|<<|>>) '<__ANON_2>,119:9]" "[@380,2270:2270='{',119:28]" "[@381,2271:2275='\n '<_NEWLINE>,119:29]" 
"[@382,2276:2280='scope',120:5]" "[@383,2281:2281=':',120:10]" "[@384,2282:2311=' keyword.operator.bitwise.pawn'<__ANON_2>,120:11]" "[@385,2312:2314='\n '<_NEWLINE>,120:41]" "[@386,2315:2315='}',121:3]" "[@387,2316:2318='\n '<_NEWLINE>,121:4]" "[@388,2319:2323='match',122:3]" "[@389,2324:2324=':',122:8]" "[@390,2325:2332=' (\\,|;) '<__ANON_2>,122:9]" "[@391,2333:2333='{',122:17]" "[@392,2334:2338='\n '<_NEWLINE>,122:18]" "[@393,2339:2343='scope',123:5]" "[@394,2344:2344=':',123:10]" "[@395,2345:2362=' keyword.coma.pawn'<__ANON_2>,123:11]" "[@396,2363:2365='\n '<_NEWLINE>,123:29]" "[@397,2366:2366='}',124:3]" "[@398,2367:2370='\n\n '<_NEWLINE>,124:4]" "[@399,2371:2375='match',126:3]" "[@400,2376:2376=':',126:8]" "[@401,2377:2385=' (\\{|\\}) '<__ANON_2>,126:9]" "[@402,2386:2386='{',126:18]" "[@403,2387:2391='\n '<_NEWLINE>,126:19]" "[@404,2392:2396='scope',127:5]" "[@405,2397:2397=':',127:10]" "[@406,2398:2416=' keyword.brace.pawn'<__ANON_2>,127:11]" "[@407,2417:2419='\n '<_NEWLINE>,127:30]" "[@408,2420:2420='}',128:3]" "[@409,2421:2421='\n'<_NEWLINE>,128:4]" "[@410,2422:2422='}',129:1]" "[@411,2423:2424='\n\n'<_NEWLINE>,129:2]" "[@412,2425:2430='parens'<__ANON_0>,131:1]" "[@413,2431:2431=':',131:7]" "[@414,2433:2433='{',131:9]" "[@415,2434:2436='\n '<_NEWLINE>,131:10]" "[@416,2437:2441='match',132:3]" "[@417,2442:2442=':',132:8]" "[@418,2443:2446=' \\( '<__ANON_2>,132:9]" "[@419,2447:2447='{',132:13]" "[@420,2448:2452='\n '<_NEWLINE>,132:14]" "[@421,2453:2457='scope',133:5]" "[@422,2458:2458=':',133:10]" "[@423,2459:2470=' parens.pawn'<__ANON_2>,133:11]" "[@424,2471:2475='\n '<_NEWLINE>,133:23]" "[@425,2476:2479='push',134:5]" "[@426,2480:2480=':',134:9]" "[@427,2482:2482='{',134:11]" "[@428,2483:2489='\n '<_NEWLINE>,134:12]" "[@429,2490:2499='meta_scope'<__ANON_1>,135:7]" "[@430,2500:2500=':',135:17]" "[@431,2501:2513=' meta.group.c'<__ANON_2>,135:18]" "[@432,2514:2520='\n '<_NEWLINE>,135:31]" "[@433,2521:2525='match',136:7]" "[@434,2526:2526=':',136:12]" 
"[@435,2527:2530=' \\) '<__ANON_2>,136:13]" "[@436,2531:2531='{',136:17]" "[@437,2532:2540='\n '<_NEWLINE>,136:18]" "[@438,2541:2545='scope',137:9]" "[@439,2546:2546=':',137:14]" "[@440,2547:2558=' parens.pawn'<__ANON_2>,137:15]" "[@441,2559:2567='\n '<_NEWLINE>,137:27]" "[@442,2568:2570='pop',138:9]" "[@443,2571:2571=':',138:12]" "[@444,2572:2576=' true'<__ANON_2>,138:13]" "[@445,2577:2583='\n '<_NEWLINE>,138:18]" "[@446,2584:2584='}',139:7]" "[@447,2585:2591='\n '<_NEWLINE>,139:8]" "[@448,2592:2598='include',140:7]" "[@449,2599:2599=':',140:14]" "[@450,2600:2604=' main'<__ANON_2>,140:15]" "[@451,2605:2609='\n '<_NEWLINE>,140:20]" "[@452,2610:2610='}',141:5]" "[@453,2611:2613='\n '<_NEWLINE>,141:6]" "[@454,2614:2614='}',142:3]" "[@455,2615:2615='\n'<_NEWLINE>,142:4]" "[@456,2616:2616='}',143:1]" "[@457,2617:2618='\n\n'<_NEWLINE>,143:2]" "[@458,2619:2632='check_brackets'<__ANON_0>,145:1]" "[@459,2633:2633=':',145:15]" "[@460,2635:2635='{',145:17]" "[@461,2636:2638='\n '<_NEWLINE>,145:18]" "[@462,2639:2643='match',146:3]" "[@463,2644:2644=':',146:8]" "[@464,2645:2648=' \\) '<__ANON_2>,146:9]" "[@465,2649:2649='{',146:13]" "[@466,2650:2654='\n '<_NEWLINE>,146:14]" "[@467,2655:2659='scope',147:5]" "[@468,2660:2660=':',147:10]" "[@469,2661:2694=' invalid.illegal.stray-bracket-end'<__ANON_2>,147:11]" "[@470,2695:2697='\n '<_NEWLINE>,147:45]" "[@471,2698:2698='}',148:3]" "[@472,2699:2699='\n'<_NEWLINE>,148:4]" "[@473,2700:2700='}',149:1]" "[@474,2701:2703='\n\n\n'<_NEWLINE>,149:2]" "[@475,3428:3428='\n'<_NEWLINE>,152:725]" language_syntax preamble_statements target_language_name_statement free_input_string Abstract Machine Language master_scope_name_statement free_input_string source.sma language_construct_rules indentation_block statements_list include_statement free_input_string numbers statements_list include_statement free_input_string keywords statements_list include_statement free_input_string function statements_list include_statement free_input_string 
check_brackets statements_list match_statement free_input_string (true|false) match_statements scope_name free_input_string constant.language statements_list include_statement free_input_string Packages/Editor/Consts.syntax miscellaneous_language_rules function indentation_block statements_list include_statement free_input_string function_definition statements_list include_statement free_input_string function_call miscellaneous_language_rules function_definition indentation_block statements_list match_statement free_input_string ^[\s;]*(public|stock|native|forward)\s+([A-Za-z_]\w*:\s*)?([A-Za-z_][\w_]*)[\s]*(\() match_statements capturing_block capturing_lines 1 free_input_string storage.type.function.pawn capturing_lines 2 free_input_string storage.modifier.tag.pawn capturing_lines 3 free_input_string support.function.definition.pawn capturing_lines 4 free_input_string function.parens.pawn match_statements statements_list push_statement indentation_block statements_list match_statement free_input_string \) match_statements scope_name free_input_string function.parens.pawn match_statements statements_list pop_statement free_input_string true statements_list include_statement free_input_string main statements_list include_statement free_input_string function_block statements_list match_statement free_input_string '^[ ;]*([A-Za-z_]\w*:)?([A-Za-z_][\w_]*)[\s]*(\()' match_statements capturing_block capturing_lines 1 free_input_string storage.modifier.tag.pawn capturing_lines 2 free_input_string support.function.definition.pawn capturing_lines 3 free_input_string function.parens.pawn match_statements statements_list push_statement indentation_block statements_list match_statement free_input_string \) match_statements scope_name free_input_string function.parens.pawn match_statements statements_list pop_statement free_input_string true statements_list include_statement free_input_string main statements_list include_statement free_input_string function_block 
miscellaneous_language_rules function_block indentation_block statements_list match_statement free_input_string '\{' match_statements scope_name free_input_string punctuation.definition.group.start match_statements statements_list push_statement indentation_block statements_list meta_scope_statement free_input_string meta.block.pawn statements_list match_statement free_input_string '\}' match_statements scope_name free_input_string punctuation.definition.group.end match_statements statements_list pop_statement free_input_string true statements_list include_statement free_input_string main statements_list include_statement free_input_string function_call match_statements statements_list include_statement free_input_string main_2 statements_list include_statement free_input_string main miscellaneous_language_rules function_call indentation_block statements_list match_statement free_input_string \s*([A-Za-z_][\w_]*)[\s|\\n]*(\() match_statements capturing_block capturing_lines 1 free_input_string support.function.call.pawn capturing_lines 2 free_input_string function.parens.pawn match_statements statements_list push_statement indentation_block statements_list match_statement free_input_string \) match_statements scope_name free_input_string function.parens.pawn match_statements statements_list pop_statement free_input_string true statements_list include_statement free_input_string main miscellaneous_language_rules numbers indentation_block statements_list match_statement free_input_string '(\d+)(\.\{2\})(\d+)' match_statements capturing_block capturing_lines 1 free_input_string constant.numeric.int.pawn capturing_lines 2 free_input_string keyword.operator.switch-range.pawn capturing_lines 3 free_input_string constant.numeric.int.pawn statements_list match_statement free_input_string ([-]?0x[\da-f]+) match_statements scope_name free_input_string constant.numeric.hex.pawn statements_list match_statement free_input_string \b(\d+\.\d+)\b match_statements scope_name 
free_input_string constant.numeric.float.pawn statements_list match_statement free_input_string \b(\d+)\b match_statements scope_name free_input_string constant.numeric.int.pawn miscellaneous_language_rules keywords indentation_block statements_list match_statement free_input_string \s*(case\b([^:\n]*):)\s+ match_statements capturing_block capturing_lines 1 free_input_string keyword.control.pawn capturing_lines 2 free_input_string storage.type.vars.pawn statements_list match_statement free_input_string (~|&|\||\^|<<|>>) match_statements scope_name free_input_string keyword.operator.bitwise.pawn statements_list match_statement free_input_string (\,|;) match_statements scope_name free_input_string keyword.coma.pawn statements_list match_statement free_input_string (\{|\}) match_statements scope_name free_input_string keyword.brace.pawn miscellaneous_language_rules parens indentation_block statements_list match_statement free_input_string \( match_statements scope_name free_input_string parens.pawn match_statements statements_list push_statement indentation_block statements_list meta_scope_statement free_input_string meta.group.c statements_list match_statement free_input_string \) match_statements scope_name free_input_string parens.pawn match_statements statements_list pop_statement free_input_string true statements_list include_statement free_input_string main miscellaneous_language_rules check_brackets indentation_block statements_list match_statement free_input_string \) match_statements scope_name free_input_string invalid.illegal.stray-bracket-end ```


erezsh commented 6 years ago

You're right, there were several issues with the standalone generator. Unfortunately, the tests for it aren't as rigorous as those for the library itself.

I fixed many of them, hopefully all of them, and pushed the fixes to master. Please let me know if there's anything else.

evandrocoan commented 6 years ago

Now running main_lexer.py throws this error:

"[@0,0:23='\n\n\n// Example program:\n\n'<_NEWLINE>,1:1]"
"[@1,24:54='name: Abstract Machine Language'<__ANON_2>,6:1]"
Traceback (most recent call last):
  File "lexer.py", line 783, in get_action
    return states[state][key]
KeyError: '__ANON_2'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main_lexer.py", line 38, in <module>
    test()
  File "main_lexer.py", line 33, in test
    tree = parser.parse(file.read())
  File "lexer.py", line 1203, in parse
    return self.parser.parse(tokens)
  File "lexer.py", line 810, in parse
    action, arg = get_action(token.type)
  File "lexer.py", line 786, in get_action
    raise UnexpectedToken(token, expected, state=state)  # TODO filter out rules from expected
lexer.UnexpectedToken: Unexpected token Token(__ANON_2, 'name: Abstract Machine Language') at line 6, column 1.
Expected: preamble_statements, NAME, __anon_plus_1, master_scope_name_statement, SCOPE, target_language_name_statement
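
The two chained tracebacks above follow a common table-lookup pattern: the parser looks up the current token type in the action table for its current state, and a missing key is converted into a descriptive error that lists what was expected. Below is a minimal sketch of that pattern, assuming a made-up one-state table; `STATES`, the action tuples, and the message format are hypothetical and this is not Lark's actual code.

```python
class UnexpectedToken(Exception):
    """Toy stand-in for the exception named in the traceback."""


# Hypothetical action table: parser state -> {terminal name -> action}.
STATES = {
    0: {"NAME": ("shift", 1), "SCOPE": ("shift", 2)},
}


def get_action(state, token_type):
    """Return the action for token_type in state, or raise UnexpectedToken."""
    try:
        return STATES[state][token_type]
    except KeyError:
        # Raising inside the except block chains the two exceptions, which is
        # why Python prints "During handling of the above exception, ...".
        expected = ", ".join(sorted(STATES[state]))
        raise UnexpectedToken(
            f"Unexpected token {token_type}. Expected: {expected}"
        )
```

In this toy table, a token of type `__ANON_2` has no entry in state 0, so the lookup fails in the same way as in the report: the parser itself is fine, but the lexer handed it a token type it cannot accept at that point.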

Comparing the first two tokens produced by the standalone.py parser with the ones produced when running the library dynamically:

python3 main_lexer.py gets the second token wrong. It should be NAME:

"[@0,0:23='\n\n\n// Example program:\n\n'<_NEWLINE>,1:1]"
"[@1,24:54='name: Abstract Machine Language'<__ANON_2>,6:1]"
Traceback (most recent call last):
...

-->

python3 main.py produces the first tokens correctly:

"[@0,0:23='\n\n\n// Example program:\n\n'<_NEWLINE>,1:1]"
"[@1,24:27='name'<NAME>,6:1]"
"[@2,28:28=':'<COLON>,6:5]"
"[@3,29:54=' Abstract Machine Language'<__ANON_2>,6:6]"
"[@4,55:55='\n'<_NEWLINE>,6:32]"
"[@5,56:60='scope'<SCOPE>,7:1]"
"[@6,61:61=':'<COLON>,7:6]"
"[@7,62:72=' source.sma'<__ANON_2>,7:7]"
"[@8,73:73='\n'<_NEWLINE>,7:18]"
"[@9,74:81='contexts'<CONTEXTS>,8:1]"
...

Test files: ...

erezsh commented 6 years ago

I think I know what the problem is. Can you try doing this parse in the library (not as a standalone) with lexer='standard'?

evandrocoan commented 6 years ago

It showed the same problem, i.e., it generated the same tokens:

```
"[@0,0:23='\n\n\n// Example program:\n\n'<_NEWLINE>,1:1]"
"[@1,24:54='name: Abstract Machine Language'<__ANON_2>,6:1]"
Traceback (most recent call last):
  File "d:\lark\parsers\lalr_parser.py", line 46, in get_action
    return states[state][key]
KeyError: '__ANON_2'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 88, in <module>
    test()
  File "main.py", line 83, in test
    tree = meu_parser.parse(file.read())
  File "d:\lark\lark.py", line 223, in parse
    return self.parser.parse(text)
  File "d:\lark\parser_frontends.py", line 38, in parse
    return self.parser.parse(token_stream, *[sps] if sps is not NotImplemented else [])
  File "d:\lark\parsers\lalr_parser.py", line 73, in parse
    action, arg = get_action(token.type)
  File "d:\lark\parsers\lalr_parser.py", line 49, in get_action
    raise UnexpectedToken(token, expected, state=state)  # TODO filter out rules from expected
lark.exceptions.UnexpectedToken: Unexpected token Token(__ANON_2, 'name: Abstract Machine Language') at line 6, column 1.
Expected: preamble_statements, __anon_plus_1, master_scope_name_statement, SCOPE, NAME, target_language_name_statement
```


erezsh commented 6 years ago

Sorry for not replying sooner. The generated parser doesn't work because, right now, it only supports the "standard" lexer, while your grammar requires the "contextual" lexer.

It shouldn't be too difficult to add, but it will take a few hours of work. Hopefully I'll get to it soon.
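
To illustrate the difference between the two lexing modes, here is a toy sketch using only Python's `re` module. It is not Lark's implementation: the terminal patterns are guesses modelled on the token names in the traces above, `next_token` is a hypothetical helper, and longest-match is used as a simple stand-in for Lark's real terminal-ordering rules. The point is only that a context-free lexer lets every terminal compete at each position, whereas a context-aware lexer tries only the terminals the parser could accept in its current state.

```python
import re

# Hypothetical terminals; ANON_2 stands in for the free_input_string regex.
TERMINALS = {
    "NAME": r"name",
    "SCOPE": r"scope",
    "COLON": r":",
    "ANON_2": r"[^\n]+",
    "_NEWLINE": r"\n+",
}


def next_token(text, pos, allowed=None):
    """Return (type, value) for the longest match at pos, or None.

    If `allowed` is given, only those terminal names are tried.
    """
    best = None
    for name, pattern in TERMINALS.items():
        if allowed is not None and name not in allowed:
            continue
        match = re.compile(pattern).match(text, pos)
        if match and match.end() > pos:
            if best is None or len(match.group()) > len(best[1]):
                best = (name, match.group())
    return best


line = "name: Abstract Machine Language\n"

# Context-free lexing: the catch-all pattern swallows the whole line.
standard = next_token(line, 0)

# Context-aware lexing: only terminals valid at the start of a statement.
contextual = next_token(line, 0, allowed={"NAME", "SCOPE", "_NEWLINE"})
```

In this sketch, `standard` comes out as the catch-all terminal covering the whole line, mirroring the `__ANON_2` token in the failing trace, while `contextual` yields the `NAME` token seen in the working one.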

evandrocoan commented 6 years ago

Thanks for taking the time to look into it! No problem about the delay, I am just glad you can work on it.

erezsh commented 6 years ago

Pushed some code to master that should fix this. Let me know if it works (or if it doesn't)

evandrocoan commented 6 years ago

It is working now, thanks!