google / oss-fuzz-gen

LLM powered fuzzing via OSS-Fuzz.
Apache License 2.0

Update benchmarks #431

Closed (DonggeLiu closed 1 week ago)

DonggeLiu commented 2 weeks ago
  1. Allow generating benchmarks with multiple oracles, prioritizing by oracle order.
  2. Update benchmarks using the new oracles: optimal-targets, all-public-candidates.
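
A minimal sketch of what "prioritizing by oracle order" in item 1 could look like. This is an illustration, not the code in this PR: `select_targets`, the `Oracle` type, and the stand-in oracles are hypothetical names. The assumption is that each oracle returns a list of function signatures for a project, and that earlier oracles in the list win when the same function is reported more than once.

```python
from typing import Callable, List

# Hypothetical type: an oracle maps a project name to candidate function signatures.
Oracle = Callable[[str], List[str]]


def select_targets(project: str, oracles: List[Oracle], limit: int) -> List[str]:
    """Collect up to `limit` unique targets, consulting oracles in the given order."""
    selected: List[str] = []
    if limit <= 0:
        return selected
    seen = set()
    for oracle in oracles:  # earlier oracles have higher priority
        for func in oracle(project):
            if func in seen:
                continue  # already contributed by a higher-priority oracle
            seen.add(func)
            selected.append(func)
            if len(selected) >= limit:
                return selected
    return selected


# Stand-in oracles with made-up results, for illustration only.
def first_oracle(project: str) -> List[str]:
    return ["parse_a", "parse_b"]


def second_oracle(project: str) -> List[str]:
    return ["parse_b", "parse_c"]


print(select_targets("demo", [first_oracle, second_oracle], limit=3))
```

With this ordering, the second oracle only fills slots that the first one leaves open, so swapping the oracle order changes which functions are chosen once the limit is reached.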

@DavidKorczynski It appears that optimal-targets does not work well with many benchmarks. For example, it returns an empty list for libxml2 and 'Project not in the database' for avahi. Is this expected?

https://introspector.oss-fuzz.com/api/optimal-targets?project=libxml2&exclude-static-functions=True&only-referenced-functions=False&only-with-header-file-declaration=True
{'functions': [], 'result': 'success'}
https://introspector.oss-fuzz.com/api/optimal-targets?project=avahi&exclude-static-functions=True&only-referenced-functions=False&only-with-header-file-declaration=True
{'msg': 'Project not in the database', 'result': 'error'}
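
For reference, a small sketch of how the two queries above can be built and how the two response shapes can be told apart. The endpoint and query parameters are copied from the URLs above; `build_oracle_url` and `extract_functions` are hypothetical helper names, and it is an assumption that the response has already been decoded into a Python dict and that other oracles accept the same parameters.

```python
from urllib.parse import urlencode

API_BASE = "https://introspector.oss-fuzz.com/api"


def build_oracle_url(project: str, oracle: str = "optimal-targets") -> str:
    """Build the query URL for one oracle, using the flags from the queries above."""
    params = {
        "project": project,
        "exclude-static-functions": "True",
        "only-referenced-functions": "False",
        "only-with-header-file-declaration": "True",
    }
    return f"{API_BASE}/{oracle}?{urlencode(params)}"


def extract_functions(payload: dict) -> list:
    """Return the function list, or raise if the API reported an error."""
    if payload.get("result") != "success":
        # e.g. {'msg': 'Project not in the database', 'result': 'error'}
        raise ValueError(payload.get("msg", "unknown error"))
    # A successful reply may still carry an empty list, as seen for libxml2.
    return payload.get("functions", [])


print(build_oracle_url("libxml2"))
```

Separating "error" from "success with an empty list" matters here, because the avahi and libxml2 cases above likely need different handling when generating benchmarks.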
DonggeLiu commented 2 weeks ago
  1. Update benchmarks using the new oracles: optimal-targets, all-public-candidates.

I understood it as you'd like to also add new benchmarks? At the moment it's only removing all

Yep, I need to confirm which oracles to use before generating new benchmarks : )

DavidKorczynski commented 2 weeks ago

Yep, I need to confirm which oracles to use before generating new benchmarks : )

Do: far-reach-but-low-coverage, optimal-targets, easy-params-far-reach

DonggeLiu commented 2 weeks ago

/gcbrun exp -n dg

DonggeLiu commented 2 weeks ago

Looking OK: https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-07-05-431-dg-comparison/index.html

DonggeLiu commented 1 week ago

Looking good: https://llm-exp.oss-fuzz.com/Result-reports/scheduled/2024-07-06-weekly-all/

Will select a new comparison set for daily improvement/regression monitoring.

DonggeLiu commented 1 week ago

/gcbrun exp -n dg

DonggeLiu commented 1 week ago

The new comparison set is selected from the 07-06 experiment, containing benchmarks with:

  1. Top performance
  2. Mid/low performance
  3. 0 build rate (majority)

DonggeLiu commented 1 week ago

/gcbrun skip