Open agroce opened 2 years ago
Thanks for the report!
Simpler repro:
{
    printf '// SPDX-License-Identifier: '
    for i in {1..29069}; do
        printf x
    done
} | solc -
This started happening on 0.8.8. The original repro has some special chars in it and a contract but the bug seems to be specifically in the SPDX comment parsing and a very long comment is enough to trigger it. In GDB I get an enormous stack trace that just keeps printing. It's somewhere in the C++ regex library. I think it recurses too deeply. Probably the pattern we use is not very efficient and backtracks too much.
For me it starts crashing at exactly 29069 x chars in the comment. I suspect that on different machines the exact limit might be different.
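Since the limit probably varies between machines, a rough way to find it locally is to bisect on the comment length. The following is only a sketch: the helper names (`gen_spdx_input`, `survives`, `find_limit`) are made up for illustration, it assumes a crash shows up as an exit status of 128 or more (how bash reports a process killed by a signal), and it assumes that once a given length crashes, every longer input crashes too.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print an SPDX comment followed by N 'x' characters.
gen_spdx_input() {
    local n="$1"
    printf '// SPDX-License-Identifier: '
    printf '%*s' "$n" '' | tr ' ' 'x'
}

# Hypothetical helper: succeed if the given command is not killed by a signal
# when fed an input with N 'x' characters on stdin.
# Usage: survives N command [args...]
survives() {
    local n="$1"; shift
    local status=0
    gen_spdx_input "$n" | "$@" >/dev/null 2>&1 || status=$?
    # Assumption: an exit status >= 128 means "terminated by a signal" (e.g. SIGSEGV).
    [ "$status" -lt 128 ]
}

# Hypothetical helper: bisect for the smallest N that crashes the command.
# Assumes LO survives, HI crashes, and crashing is monotonic in N.
# Usage: find_limit LO HI command [args...]
find_limit() {
    local lo="$1" hi="$2" mid; shift 2
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if survives "$mid" "$@"; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    echo "$hi"
}
```

With an affected solc build on the PATH, something like `find_limit 1 100000 solc -` should print the shortest crashing length for that machine, provided the upper bound really does crash there.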
Nice bug ;)
If it's really the regex implementation, we can consider actually pulling in and moving to https://compile-time-regular-expressions.readthedocs.io/ :-). That way we could also speed up the regex patterns in the Yul optimizer (during name cleanup)...
I rather suspect catastrophic backtracking. If that's the case then it's the regex itself that's the problem and using a more efficient implementation will only make it require larger input to crash :)
We might switch to a non-backtracking engine (if there's one available) but these typically support a more limited syntax.
Possibly related or even duplicate: https://github.com/ethereum/solidity/issues/13496
This issue was used to create a Solidity compiler bug bounty that looks for new segfault errors on v0.8.27.
Possibly already fixed by https://github.com/ethereum/solidity/pull/15209
Description
segfault.zip
The attached contract produces:
when compiled with solc
On master, using AFL fuzzing. Another discovery using https://github.com/agroce/afl-compiler-fuzzer
Environment
Steps to Reproduce
Above shows pretty clearly, I think.