ethereum / solidity

Solidity, the Smart Contract Programming Language
https://soliditylang.org
GNU General Public License v3.0
23.11k stars 5.73k forks source link

AsmParser: parse source comment using scanner instead of regex #15209

Closed clonker closed 2 months ago

clonker commented 3 months ago

Fixes #15207 Fixes #13496

matheusaaguiar commented 3 months ago

Also, just to confirm/point to, I see that we have some tests in test/libyul/Parser.cpp which seems to cover parsing. Not sure if we need or would benefit from adding something more there.

aarlt commented 3 months ago

I guess you could now also remove the rm ./*_bytecode_too_large_*.sol ./*_combined_too_large_*.sol in .circleci/parallel_bytecode_report.sh - if I remember correctly, the crashes where mainly happening because of the regexp stuff.

cameel commented 3 months ago

I guess you could now also remove the rm ./*_bytecode_too_large_*.sol ./*_combined_too_large_*.sol in .circleci/parallel_bytecode_report.sh - if I remember correctly, the crashes where mainly happening because of the regexp stuff.

Looks like it worked. Which is actually odd, because this PR doesn't fix #13496, does it? I mean, if it does then great, maybe we should close it too, but it's also possible that it got somehow fixed on develop and we didn't realize :)

clonker commented 3 months ago

I guess you could now also remove the rm ./*_bytecode_too_large_*.sol ./*_combined_too_large_*.sol in .circleci/parallel_bytecode_report.sh - if I remember correctly, the crashes where mainly happening because of the regexp stuff.

Looks like it worked. Which is actually odd, because this PR doesn't fix #13496, does it? I mean, if it does then great, maybe we should close it too, but it's also possible that it got somehow fixed on develop and we didn't realize :)

It does segfault for me in the issue you linked above on develop and not on this branch when, e.g., invoking solc --via-ir --asm test.sol. I assume it is because of the debug comments:

image

Invoking the same with --debug-info none does not segfault on develop, either.

aarlt commented 3 months ago

I guess you could now also remove the rm ./*_bytecode_too_large_*.sol ./*_combined_too_large_*.sol in .circleci/parallel_bytecode_report.sh - if I remember correctly, the crashes where mainly happening because of the regexp stuff.

Looks like it worked. Which is actually odd, because this PR doesn't fix #13496, does it? I mean, if it does then great, maybe we should close it too, but it's also possible that it got somehow fixed on develop and we didn't realize :)

It does segfault for me in the issue you linked above on develop and not on this branch when, e.g., invoking solc --via-ir --asm test.sol. I assume it is because of the debug comments:

image

Invoking the same with --debug-info none does not segfault on develop, either.

yep exactly! its because of the debug comments. I ran into this when I was working on the debug attribute parsing stuff. I would say we should merge this soon so I can rework my PR.

aarlt commented 3 months ago

In general it is a bug in std::regexp in some GCC versions that created that segfault.

cameel commented 3 months ago

It does segfault for me in the issue you linked above on develop and not on this branch when, e.g., invoking solc --via-ir --asm test.sol. I assume it is because of the debug comments:

Ah, right. I focused on parsing the long hex string (which should not be affected by this PR) but overlooked the fact that such a string will also end up in a snippet as a part of debug info. In that case, yeah, this PR does fix it.

aarlt commented 2 months ago

Just to leave it here: the segmentation fault was related to a bug in GCC's std::regex implementation. This bug is at least reproducible with GCC 9, 10, 11, 12, 13 & 14.