google / oss-fuzz

OSS-Fuzz - continuous fuzzing for open source software.
https://google.github.io/oss-fuzz
Apache License 2.0

OSS-Fuzz doesn't seem to fuzz continuously #3864

Closed: evverx closed this issue 4 years ago

evverx commented 4 years ago

Judging by https://oss-fuzz.com/fuzzer-stats?group_by=by-day&date_start=2020-05-09&date_end=2020-05-21&fuzzer=libFuzzer_systemd_fuzz-unit-file, we're back to fuzzing for three hours a day every other day.

I'd reopen https://github.com/google/oss-fuzz/issues/3014 if I could.

inferno-chromium commented 4 years ago

You are roughly getting 15K hours per day of fuzzing for systemd:
https://oss-fuzz.com/fuzzer-stats?group_by=by-day&date_start=2020-05-09&date_end=2020-05-21&fuzzer=afl&project=systemd
https://oss-fuzz.com/fuzzer-stats?group_by=by-day&date_start=2020-05-09&date_end=2020-05-21&fuzzer=libFuzzer&project=systemd
Has that changed? Did you add new fuzz targets? CPUs are allocated per project, but maybe there is some new bug; it will be investigated after the holidays.

evverx commented 4 years ago

According to those reports, yesterday the systemd project got "51.4" (AFL) + "155.5" (libFuzzer) = 206.9 hours of fuzzing. I'm not sure where 15K comes from. Could it be that I'm looking at the wrong column?

As far as I can see, something happened on April 1.

AFL: [screenshot, 2020-05-22 09:11]

libFuzzer: [screenshot, 2020-05-22 09:24]

evverx commented 4 years ago

Did you add new fuzz targets?

No, to the best of my knowledge nothing has changed on the systemd side.

FWIW, I usually open issues like this when I end up discovering bugs like https://github.com/systemd/systemd/issues/15885 on my laptop much faster than OSS-Fuzz does (in this case, OSS-Fuzz hasn't found those issues yet).

inferno-chromium commented 4 years ago

We would definitely want to investigate this further, but first I want to understand the total hours. Here is what I see:

[screenshot]

evverx commented 4 years ago

I've never seen anything like it. Here's what I see: [screenshot, 2020-05-23 00:22]

evverx commented 4 years ago

Forgot to attach the libFuzzer stats: [screenshot, 2020-05-23 00:31]

inferno-chromium commented 4 years ago

@oliverchang - thoughts? How can the stats be different depending on the user?

oliverchang commented 4 years ago

Project filters don't actually affect the result for admins -- they filter the UI components next to them to indicate which fuzzers and job types can be selected. For external users, this automatically restricts results to the set of jobs they have access to.

It looks like something happened to the systemd bots on Mar 31 that caused them to hang due to an issue with the GCE metadata servers. + @mbarbella-chromium who is working on better monitoring here. In the meantime I'll restart these bots.

inferno-chromium commented 4 years ago

Thanks to @oliverchang, who figured out the root cause and fixed it in https://github.com/google/clusterfuzz/pull/1799 -- this was a pretty bad bug :(

Really appreciate @evverx, as always, for letting us know about these issues. You have been a very valuable contributor to OSS-Fuzz, thank you! The stats are starting to get fixed: https://oss-fuzz.com/fuzzer-stats?group_by=by-day&date_start=2020-05-09&date_end=2020-05-31&fuzzer=libFuzzer_systemd_fuzz-unit-file

Will leave it open for a day or two before closing.

inferno-chromium commented 4 years ago

Seems all ok now.

oliverchang commented 4 years ago

Thanks @evverx. Please let us know if you notice this again. We've fixed a possible root cause but there may be another issue that has not been fully resolved yet.

evverx commented 4 years ago

Thank you! Judging by the two bugs OSS-Fuzz has found since those jobs were restarted, it seems to be back to normal. Though I'm curious why https://oss-fuzz.com/testcase-detail/5162542791131136 hasn't been reported on Monorail.

evverx commented 4 years ago

Other than that, OSS-Fuzz opened https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=22706 (which is essentially a duplicate of https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=22547). I wonder how the backtrace got lost there (assuming that was the reason the bug wasn't deduplicated).

evverx commented 4 years ago

Other than that, OSS-Fuzz opened https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=22706

I'm not sure why OSS-Fuzz keeps reporting the same bug over and over again. Today it opened https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=22726 (which is a duplicate of https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=22547). As far as I can tell, the backtrace wasn't lost there. @inferno-chromium I wonder why it wasn't deduplicated.

inferno-chromium commented 4 years ago

For https://github.com/google/oss-fuzz/issues/3864#issuecomment-635919588, this is the fix - https://github.com/google/clusterfuzz/pull/1803

inferno-chromium commented 4 years ago

https://oss-fuzz.com/testcase-detail/5162542791131136 was reported after 4.5 hours. I remember we keep at least a 3-hour window to make sure we get similar stacks and bundle them together; this is important for deduplication.

evverx commented 4 years ago

For #3864 (comment), this is the fix - google/clusterfuzz#1803

Thank you!

https://oss-fuzz.com/testcase-detail/5162542791131136 was reported after 4.5 hours

Looks like I didn't notice "UTC" in the logs. At the time I thought that more than 7 hours had passed.

inferno-chromium commented 4 years ago

Regarding comment https://github.com/google/oss-fuzz/issues/3864#issuecomment-635442330, I have no clue. Basically, you have to try that testcase in an MSan build of systemd and see if it reproduces that way (i.e. do you get a proper stack or an empty one?). Are you getting a good MSan stack for that testcase, or is it empty all the time? Can you also try running that fuzzer in an MSan build (without the testcase) and see if it always crashes in a few minutes? Maybe that is some OOM. This could be a bug in the MSan setup, an OOM, or something in the MSan build; I have no clue and it would need more debugging.
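
(For reference, a minimal sketch of how one might try this locally with OSS-Fuzz's infra/helper.py; the target name fuzz-unit-file and the testcase path are assumptions -- substitute the actual crashing target and the testcase downloaded from the report.)

```sh
# Build the systemd fuzz targets with MemorySanitizer.
python infra/helper.py build_fuzzers --sanitizer memory systemd

# Try to reproduce with the downloaded testcase (placeholder path).
python infra/helper.py reproduce systemd fuzz-unit-file ~/Downloads/testcase

# Or run the target on its own to see whether it crashes within a few minutes.
python infra/helper.py run_fuzzer systemd fuzz-unit-file
```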

evverx commented 4 years ago

I tried to build that fuzz target with MSan, and it always crashes with the same backtrace. That's basically why I was surprised that on ClusterFuzz the backtrace was lost somewhere and, judging by the bug report, it wasn't always reproducible.

The fuzzer itself is pretty solid in the sense that it's been running with the seed corpus for about 10 minutes with no crashes.

inferno-chromium commented 4 years ago

This backtrace loss is during fuzzing, not in reproduction. Unsure what is going on.

evverx commented 4 years ago

Since it doesn't seem to happen very often, I think it's safe to assume it was just a glitch. If some other fuzz target crashes similarly, it will probably make sense to take a closer look at it.

evverx commented 4 years ago

FWIW there was another weird crash with no backtrace (with UBsan this time): https://oss-fuzz.com/testcase-detail/5141092482940928

UndefinedBehaviorSanitizer:DEADLYSIGNAL
==1==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x0000023c69a8 (pc 0x0000023c69a8 bp 0x7ffcb9e787e0 sp 0x7ffcb9e78778 T1)
==1==The signal is caused by a READ memory access.
==1==Hint: PC is at a non-executable region. Maybe a wild jump?
    #0 0x23c69a8 in [heap]

UndefinedBehaviorSanitizer can not provide additional info.
SUMMARY: UndefinedBehaviorSanitizer: SEGV ([heap]+0x1b99a8)
==1==ABORTING

inferno-chromium commented 4 years ago

FWIW there was another weird crash with no backtrace (with UBsan this time): https://oss-fuzz.com/testcase-detail/5141092482940928

UndefinedBehaviorSanitizer:DEADLYSIGNAL
==1==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x0000023c69a8 (pc 0x0000023c69a8 bp 0x7ffcb9e787e0 sp 0x7ffcb9e78778 T1)
==1==The signal is caused by a READ memory access.
==1==Hint: PC is at a non-executable region. Maybe a wild jump?
    #0 0x23c69a8 in [heap]

UndefinedBehaviorSanitizer can not provide additional info.
SUMMARY: UndefinedBehaviorSanitizer: SEGV ([heap]+0x1b99a8)
==1==ABORTING

These unreproducible ones from a long fuzzing session are hard to debug. The only advice I have is to fix all open reproducible bugs from this fuzzer and see if this one stays (you can see crash statistics on the testcase detail page). Also, please file a new bug for further discussion.

evverx commented 4 years ago

The only advice I have is to fix all open reproducible bugs from this fuzzer and see if this one stays

All the bugs have been fixed. The bug that I suspect caused this was fixed about 13 hours ago but it looks like OSS-Fuzz hasn't picked up that commit yet. It would be great if OSS-Fuzz could build projects more often than once a day.

Also, please file a new bug for further discussion.

I'd open a new issue if I had anything I could describe to report :-)

inferno-chromium commented 4 years ago

The only advice I have is to fix all open reproducible bugs from this fuzzer and see if this one stays

All the bugs have been fixed. The bug that I suspect caused this was fixed about 13 hours ago but it looks like OSS-Fuzz hasn't picked up that commit yet. It would be great if OSS-Fuzz could build projects more often than once a day.

The OSS-Fuzz build pipeline is planned for a complete rewrite in early Q3, and this feature is part of it. @oliverchang - FYI.

Also, please file a new bug for further discussion.

I'd open a new issue if I had anything I could describe to report :-)