Update benchmarks

DonggeLiu commented 2 weeks ago

Allow generating benchmarks with multiple oracles, prioritizing by oracle order.
Update benchmarks using the new oracles: optimal-targets, all-public-candidates.
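One way to read "prioritizing by oracle order" is sketched below: visit the oracles in the order given, take each oracle's candidates in turn, skip functions already picked by an earlier oracle, and stop at a limit. This is only an illustration under that assumption; `merge_by_oracle_order`, its parameters, and the candidate lists are hypothetical and are not taken from the oss-fuzz-gen code.

```python
from typing import Dict, List


def merge_by_oracle_order(
    oracle_order: List[str],
    candidates: Dict[str, List[str]],
    limit: int,
) -> List[str]:
    """Return up to `limit` unique function names.

    Oracles listed earlier in `oracle_order` take priority: their
    candidates are selected first, and later oracles only contribute
    functions that have not been selected yet.
    (Hypothetical helper, not the project's actual implementation.)
    """
    if limit <= 0:
        return []

    selected: List[str] = []
    seen = set()
    for oracle in oracle_order:
        # An oracle may return nothing for a project (e.g. an empty list).
        for func in candidates.get(oracle, []):
            if func in seen:
                continue
            seen.add(func)
            selected.append(func)
            if len(selected) >= limit:
                return selected
    return selected
```

For instance, with `oracle_order = ["optimal-targets", "all-public-candidates"]`, a function returned by both oracles is attributed to `optimal-targets`, and `all-public-candidates` only fills whatever slots remain.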

@DavidKorczynski It appears that optimal-targets does not work well with many benchmarks. For example, it returns an empty list for libxml2 and 'Project not in the database' for avahi. Is this expected?

https://introspector.oss-fuzz.com/api/optimal-targets?project=libxml2&exclude-static-functions=True&only-referenced-functions=False&only-with-header-file-declaration=True
{'functions': [], 'result': 'success'}
https://introspector.oss-fuzz.com/api/optimal-targets?project=avahi&exclude-static-functions=True&only-referenced-functions=False&only-with-header-file-declaration=True
{'msg': 'Project not in the database', 'result': 'error'}
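The two outcomes above (a successful response with no functions, and an error response) can be told apart with a small helper. The sketch below rebuilds the query URL from the parameters visible in the URLs above and classifies a response that has already been decoded into a dict. The helper names `build_oracle_url` and `classify_response` are hypothetical, and the response shape is assumed only from the two examples shown here.

```python
from typing import Any, Dict
from urllib.parse import urlencode

# Base URL taken from the queries quoted above.
API_BASE = "https://introspector.oss-fuzz.com/api"


def build_oracle_url(oracle: str, project: str) -> str:
    """Build a query URL with the same parameters as the examples above."""
    params = {
        "project": project,
        "exclude-static-functions": True,
        "only-referenced-functions": False,
        "only-with-header-file-declaration": True,
    }
    # urlencode converts each value with str(), so True becomes "True".
    return f"{API_BASE}/{oracle}?{urlencode(params)}"


def classify_response(response: Dict[str, Any]) -> str:
    """Classify a decoded response as 'error', 'empty', or 'ok'.

    Assumes the fields seen in the examples: 'result', 'msg', 'functions'.
    """
    if response.get("result") != "success":
        return "error"
    if not response.get("functions"):
        return "empty"
    return "ok"
```

A benchmark generator could use the classification to fall through to the next oracle when the result is `empty` or `error`, rather than producing no benchmarks for that project.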

DonggeLiu commented 2 weeks ago

> Update benchmarks using the new oracles: optimal-targets, all-public-candidates.

> I understood it as you'd like to also add new benchmarks? At the moment it's only removing all

Yep, I need to confirm which oracles to use before generating new benchmarks : )

DavidKorczynski commented 2 weeks ago

> Yep, I need to confirm which oracles to use before generating new benchmarks : )

Do: far-reach-but-low-coverage, optimal-targets, easy-params-far-reach

DonggeLiu commented 2 weeks ago

/gcbrun exp -n dg

DonggeLiu commented 2 weeks ago

Looking OK: https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-07-05-431-dg-comparison/index.html

DonggeLiu commented 1 week ago

Looking good: https://llm-exp.oss-fuzz.com/Result-reports/scheduled/2024-07-06-weekly-all/

Will select a new comparison set for daily improvement/regression monitoring.

DonggeLiu commented 1 week ago

/gcbrun exp -n dg

DonggeLiu commented 1 week ago

The new comparison set is selected from the 07-06 experiment and contains benchmarks with:

Top performance
Mid/low performance
0 build rate (majority)

DonggeLiu commented 1 week ago

/gcbrun skip

google / oss-fuzz-gen

Update benchmarks #431