intel / yarpgen

Yet Another Random Program Generator

How to expose crashes in GCC/Clang. #199

Closed zxcuiop closed 5 months ago

zxcuiop commented 5 months ago

This work is excellent and has already uncovered many bugs in modern compilers. Thank you very much for your efforts.

I am using YARPGen to find crashes in GCC and Clang and have encountered a minor problem. Specifically, my process is as follows:

D=$RANDOM
mkdir -p $D
./yarpgen -o $D --std=c
gcc-14 -O2 $D/{driver,func}.c 
# Check if GCC crashes
clang-18 -O2 $D/{driver,func}.c
# Check if Clang crashes

I have been running these commands for approximately 30 CPU days but have not found any crashes in GCC or Clang. What might be the issue? Any suggestions would be appreciated.

dbabokin commented 5 months ago
  1. It looks like you are testing release versions of the compilers. They are heavily tested, including with YARPGen, so finding anything new is unlikely. Most of the bugs that we have found were against trunk versions - i.e. you need to build the top of trunk yourself and test it.
  2. -O2 is a good default target for testing, but it's also the most tested. Try -O3, and try targeting more advanced hardware than generic x86 (I assume you are on x86), i.e. use -march=<your CPU arch> (see here what we've been testing). You can also check what compiler developers are currently working on that is not yet enabled by default, and try enabling it through the command line.
  3. Also try C++, not just C.
  4. You are only checking for compiler crashes, not for miscompilations, yet a good portion of the bugs that we found were miscompilations. That is, you would need to compile with -O0 and -O2 and compare the results (or compare gcc and clang). YARPGen was designed primarily to target miscompilations.
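
The -O0 versus -O2 comparison from point 4 could be sketched roughly as follows. This is only an illustration, not the project's own tooling: the function names `compare_outputs` and `check_test`, the `CC` variable, and the `ref`/`opt` binary names are hypothetical, and it assumes the generated driver prints its result to stdout.

```shell
#!/usr/bin/env bash
# Differential-testing sketch (assumed names, not part of YARPGen).

# compare_outputs REF_CMD TEST_CMD
# Runs both commands and compares their stdout.
# Returns 0 on match, 1 on mismatch, 2 if either command fails to run cleanly.
compare_outputs() {
    local ref_out test_out
    ref_out=$("$1") || return 2
    test_out=$("$2") || return 2
    if [ "$ref_out" = "$test_out" ]; then
        echo "match"
    else
        echo "MISMATCH"
        return 1
    fi
}

# check_test DIR
# Builds DIR/driver.c and DIR/func.c at -O0 and -O2, then compares the outputs.
# Returns 3 if either compilation fails (which may or may not be a compiler crash).
check_test() {
    local d=$1 cc=${CC:-gcc}
    "$cc" -O0 "$d/driver.c" "$d/func.c" -o "$d/ref" || { echo "compile failed at -O0"; return 3; }
    "$cc" -O2 "$d/driver.c" "$d/func.c" -o "$d/opt" || { echo "compile failed at -O2"; return 3; }
    compare_outputs "$d/ref" "$d/opt"
}
```

Calling `check_test "$D"` after generating a test into `$D` would then flag a candidate miscompilation whenever the two binaries disagree. A mismatch is only a starting point and still needs manual confirmation.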

regehr commented 5 months ago

I'll just add that miscompilations are the most damaging compiler bugs by far, so please look for these!

Also, 30 CPU-days is not that much; yarpgen will benefit from being parallelized across hundreds of cores.

Vsevolod-Livinskij commented 5 months ago

You might also want to use the run_gen.py script, as described here, to parallelize the testing automatically and detect both compiler crashes and miscompilations. I would say that the current estimate to find an error with -O3 -march=skx compilation options in the trunk of clang can easily be in the thousands of CPU hours.

zxcuiop commented 5 months ago

Thank you all for your detailed and insightful responses.

@dbabokin, your explanation regarding the focus on trunk versions versus release versions of compilers is invaluable. The suggestions to explore different optimization levels, target specific architectures, and expand the scope of testing beyond just C are particularly helpful. I will begin implementing these adjustments to our testing strategy immediately.

@regehr, emphasizing the critical importance of detecting miscompilations has shifted my perspective significantly. We will enhance our tests to cover these scenarios more thoroughly and consider leveraging additional computational resources to scale our efforts.

@Vsevolod-Livinskij, I appreciate your recommendation to use the run_gen.py script for better automation and efficiency in our tests. The insight into the expected CPU hours needed for finding errors with advanced compilation options is especially enlightening and will help in setting realistic expectations.

Thanks again to each of you for your generosity in sharing your expertise. Your guidance is instrumental in advancing our project, and I look forward to applying these best practices to achieve more meaningful results.

dbabokin commented 5 months ago

@zxcuiop We would appreciate it if you update us on your progress!

And one more thing - do a release+assertions build of the compilers instead of just release. This would help find more crashes.
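
For Clang, a release+assertions configuration might look like the sketch below. The wrapper function `configure_llvm_assert` and its two path arguments are hypothetical. The CMake options are standard LLVM build options, and the Ninja generator is an assumption about the local setup. For GCC, the rough counterpart is the `--enable-checking` option to `configure`.

```shell
#!/usr/bin/env bash
# configure_llvm_assert SRC BUILD
# Configures an optimized clang build with assertions enabled.
# SRC is an llvm-project checkout, BUILD is the build directory (both assumed).
configure_llvm_assert() {
    local src=$1 build=$2
    cmake -S "$src/llvm" -B "$build" -G Ninja \
        -DCMAKE_BUILD_TYPE=Release \
        -DLLVM_ENABLE_ASSERTIONS=ON \
        -DLLVM_ENABLE_PROJECTS=clang
}
```

With assertions on, internal consistency checks that a plain release build compiles out can fire as crashes, which is why this kind of build tends to surface more bugs.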

zxcuiop commented 5 months ago

@dbabokin Thanks for the suggestion. We'll try incorporating the release+assertions build of the compilers to increase our chances of uncovering compiler crashes. I'll update when we have significant results. Appreciate your continued support!

zxcuiop commented 5 months ago

@dbabokin Thank you very much! I followed one of your suggestions and successfully found a crash in Clang. I compiled the generated programs with the options -O3 -march=skx for a few hours and discovered a crash caused by a bug in clang within llvm::InnerLoopVectorizer::collectPoisonGeneratingRecipes. This crash was discovered repeatedly over the following several days. Unfortunately, due to an oversight in my script, I deleted the original file; otherwise I would have been able to post the file that triggered the bug.

Just testing one of your suggestions proved to be very effective; I expect that the other suggestions will be even more so. Thank you very much for your advice.

dbabokin commented 5 months ago

@zxcuiop Congratulations! If you have a seed somewhere in your logs, this should be enough to recreate the test.

zxcuiop commented 5 months ago

@dbabokin Thanks for the tip! I’ll dive back into the logs and see if I can find that seed. Really appreciate your help throughout this process. Looking forward to applying more of your suggestions!