trailofbits / ruzzy

A coverage-guided fuzzer for pure Ruby code and Ruby C extensions
GNU Affero General Public License v3.0

Test harness exiting with "Alarm Clock" #22

Open manunio opened 1 month ago

manunio commented 1 month ago

Hi. While fuzzing rexml, my test harness abruptly exits with the message "Alarm clock":

#32768  pulse  cov: 307 ft: 1807 corp: 387/661Kb lim: 57798 exec/s: 57 rss: 879Mb
Alarm clock

This happens with both use_sigaltstack=0 and use_sigaltstack=1. The only way to keep the harness running is to trap the SIGALRM signal:

Signal.trap("SIGALRM") do
  puts "Alarm received!"
end

Any idea why this is happening? I have noticed the same issue with other Ruby libraries, both pure Ruby code and Ruby C extensions.

# test_harness.rb

require 'ruzzy'
require 'rexml/document'
require 'rexml/parsers/treeparser'

# Signal.trap("SIGALRM") do
#    puts "Alarm received!"
# end

def fuzz_target(input)
  begin
    doc = REXML::Document.new
    parser = REXML::Parsers::TreeParser.new(input, doc)
    parser.parse
  rescue REXML::ParseException
    # pass
  end
end

test_one_input = lambda do |data|
  fuzz_target(data)
  return 0
end

Ruzzy.fuzz(test_one_input)

# tracer script that runs test_harness.rb
require 'ruzzy'

Ruzzy.trace('test_harness.rb')

Clang version: Ubuntu clang version 16.0.6
Ruby version: ruby 3.0.2p107
OS: Ubuntu 22.04.4 LTS

mschwager commented 1 month ago

Hi there,

Thanks for opening this issue. LLVM and libFuzzer issues can be tricky to debug. I don't know the exact cause of what you're seeing, but I may be able to point you in the right direction, and we can debug it together.

#32768  pulse  cov: 307 ft: 1807 corp: 387/661Kb lim: 57798 exec/s: 57 rss: 879Mb
Alarm clock

It looks like a SIGALRM is being raised somewhere.
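For reference, "Alarm clock" is the text a shell typically prints when a process has been terminated by SIGALRM, and installing a handler is what keeps the process alive. Here is a minimal sketch of that behaviour in plain Ruby. It is not specific to Ruzzy or libFuzzer, and the `run_child` helper is a hypothetical name used only for this illustration. It assumes a POSIX system where a process can send signals to itself.

```ruby
require 'rbconfig'

# Hypothetical helper: run a short Ruby script in a child process
# and return its Process::Status.
def run_child(script)
  pid = Process.spawn(RbConfig.ruby, '-e', script)
  _, status = Process.wait2(pid)
  status
end

# Child 1: no handler is installed, so SIGALRM ends the process.
untrapped = run_child("Process.kill('ALRM', Process.pid); sleep 5")

# Child 2: a handler is installed first, so the signal is handled
# and the process continues to a normal exit.
trapped = run_child(<<~RUBY)
  got = false
  Signal.trap('ALRM') { got = true }
  Process.kill('ALRM', Process.pid)
  sleep 0.01 until got
  exit 0
RUBY

puts "untrapped: signaled=#{untrapped.signaled?} termsig=#{untrapped.termsig.inspect}"
puts "trapped:   exited=#{trapped.exited?} exitstatus=#{trapped.exitstatus.inspect}"
```

The first child should be reported as killed by a signal, while the second should exit normally, which matches the observation that trapping SIGALRM keeps the harness running.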

I have noticed the same issue with other ruby libs which includes both (Pure Ruby Code and Ruby C extensions)

To me, this suggests an LLVM or libFuzzer issue rather than something in Ruby, Ruzzy, or rexml. It'd still be interesting to get to the bottom of this in case others run into it. I searched both the Ruby codebase and the rexml codebase for instances of alarm or SIGALRM and didn't find anything promising. Further, Ruzzy doesn't implement any kind of special functionality for alarms.

From here, the oss-fuzz, sanitizers, and llvm-project repos are good candidates to search for similar issues. For example, searching for SIGALRM brings up these resources:

  1. https://github.com/google/oss-fuzz/issues/671
  2. https://github.com/google/sanitizers/issues/802
  3. https://github.com/search?q=repo%3Allvm%2Fllvm-project%20SIGALRM&type=code

It appears to me that LLVM (Clang, libFuzzer, sanitizers, etc.) can raise SIGALRMs and attempt to handle them. Perhaps the signal is being raised but, for some reason, not being caught in your case.

What happens if you re-run your fuzz job with -timeout=1 set? Perhaps the fuzzing engine is raising a SIGALRM to indicate a timeout, but the fuzzer isn't detecting it correctly. Maybe we can force it by setting a low timeout value. The default is 1200 seconds (20 minutes). Did your original fuzzing job happen to run for approximately 20 minutes?

Clang version: Ubuntu clang version 16.0.6

Another option here may be running with the latest clang and seeing if you can still reproduce the issue. The LLVM project undergoes a lot of fast development, and this may be a bug that was fixed somewhere along the line. The project Dockerfile currently defaults to LLVM 19, so perhaps the bug is fixed there.

mschwager commented 4 weeks ago

I'm able to reproduce this issue. Some version information:

If I fuzz redcarpet with -timeout=1 then I quickly receive the Alarm clock message. I suspect libFuzzer uses a SIGALRM for the -timeout functionality, but for some reason it's not being handled properly. Perhaps it's not registered in the first place (linking problem?), or perhaps it's being overwritten somewhere.
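To illustrate the "overwritten somewhere" hypothesis in general terms, here is a small Ruby sketch showing that a later handler registration for the same signal silently replaces an earlier one. This is only an analogy written with Ruby's own `Signal.trap`; it makes no claim about how libFuzzer, the sanitizers, or the Ruby VM actually register their handlers.

```ruby
log = []

# First handler, standing in for one registered early by some component.
Signal.trap('ALRM') { log << :first }

# A later registration replaces it. Signal.trap returns the handler
# that was previously installed.
previous = Signal.trap('ALRM') { log << :second }

# Deliver SIGALRM to this process and wait until a handler has run.
Process.kill('ALRM', Process.pid)
sleep 0.01 while log.empty?

puts "handlers that ran: #{log.inspect}"
puts "previous handler class: #{previous.class}"

# Restore the default behaviour so later code is unaffected.
Signal.trap('ALRM', 'DEFAULT')
```

Only the most recently registered handler runs, so whichever component registers last decides what a SIGALRM does.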

There's evidence for this in other ecosystems, but no root cause. I suppose this is still achieving the desired effect, even though the UX is not ideal. That is, it's still receiving the SIGALRM and Alarm clock message when the timeout is reached, and stopping fuzzing.

Regardless, setting -timeout=-1 when fuzzing should fix the issue.