tact-lang / tact

Tact compiler main repository
https://tact-lang.org
MIT License
371 stars 103 forks source link

fix: improve `funcId` recognition and squash misdetection bugs #636

Closed novusnota closed 2 months ago

novusnota commented 2 months ago

That took a looooooooot of RegEx iterations.

Closes #635

anton-trunov commented 2 months ago

To be on the safe side, let's take stdlib.fc and mathlib.fc as libraries that are commonly used and see if we can add more function identifiers from those libs into our positive tests to make sure people will be able to FFI with those.

novusnota commented 2 months ago

Nice observation! I'd try putting lookaheads & in front of those ")" so they don't get parsed, but let me try your suggestion first, it should work. UPD: Yeah, it does, clever idea!

The keywords are not allowed and should be excluded too. Basically, contents of https://github.com/ton-blockchain/ton/blob/master/crypto/func/keywords.cpp has to be put in the exceptions list. I'm on it.

UPD2: Order of operators in those inner () alternations really does matter, the ones with bigger prefix shall stand in front of others.

novusnota commented 2 months ago
There was a suggestion, which didn't pass negative tests, so I hid it. The thing we've engineered works, but _possibly_ I've found a cleaner solution, which doesn't require the ")" and can be used in the FunC parser too (the current approach in this PR fails there in absence of ")" and stuff). But this one works: ![image](https://github.com/user-attachments/assets/6fe47d83-52e3-442b-a70b-504a7deb14a2) It passes similar set of tests we have for function identifiers, but adapted to FunC code: ![image](https://github.com/user-attachments/assets/61e7a1f9-487a-4996-8764-c994b05dbee5) UPD: Negative tests failed terribly, reverting :)
anton-trunov commented 2 months ago

@novusnota So, what's the current status of this PR?

novusnota commented 2 months ago

@anton-trunov It's a working and robust solution. The alternative would be to do invalidation of identifiers during syntactic analysis, which enhances the error messages further, but that seems to be a bit out of scope of this PR. Wdyt?