google / clusterfuzz

Scalable fuzzing infrastructure.
https://google.github.io/clusterfuzz
Apache License 2.0

Questions on VM management #1947

Open urbanenomad opened 4 years ago

urbanenomad commented 4 years ago

We have 1 regular VM and have bumped the number of preemptible VMs up to 10. We would like to understand exactly how ClusterFuzz manages jobs and the VMs they run on, especially the preemptible VMs.

1. Will a job running on a preemptible VM automatically run on another VM if the first preemptible VM shuts down? Will ClusterFuzz take care of that?
2. Does a new job override an old job when we don't have free cores/CPUs?
3. How do we know which VM is running which job?
4. How does ClusterFuzz decide how many VMs/cores to allocate to a job?

oliverchang commented 4 years ago

  1. Yes. Preemptible VMs only run fuzzing tasks, which will be automatically picked up again.
  2. Sorry, don't understand this question. What do you mean by new job vs old job?
  3. You can take a look at the /bots page of your deployment.
  4. The job run frequencies are roughly weighted by the number of fuzz targets that are detected by your job. There are other factors such as the engine and sanitizer used, but the number of targets is the main one.
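
As a rough illustration of point 4, weighted scheduling can be pictured as each free bot drawing its next fuzzing job at random, with probability proportional to the job's weight. This is only a sketch and not ClusterFuzz's scheduler code: the job names, the target counts, and the `pick_job` and `simulate` helpers are hypothetical, and it assumes the weight is simply the number of detected fuzz targets (engine and sanitizer factors are ignored).

```python
import random
from collections import Counter

# Hypothetical jobs mapped to the number of fuzz targets detected for each.
# Assumption: weight == target count (engine/sanitizer factors are ignored).
JOB_TARGET_COUNTS = {
    "libfuzzer_asan_module1": 30,
    "libfuzzer_asan_module2": 10,
    "afl_asan_module1": 10,
}


def pick_job(weights, rng):
    """Draw one job name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def simulate(weights, draws, seed=0):
    """Count how often each job is picked over a number of draws."""
    rng = random.Random(seed)
    return Counter(pick_job(weights, rng) for _ in range(draws))


if __name__ == "__main__":
    draws = 10_000
    counts = simulate(JOB_TARGET_COUNTS, draws)
    for name, count in counts.most_common():
        print(f"{name}: {count / draws:.1%} of fuzzing task slots")
```

With these made-up numbers, the job with 30 targets should end up with roughly three times as many task slots as each job with 10 targets.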

hpathuri commented 4 years ago

Let's say I have 4 preemptible VM instances, each with 32 cores / 128 GB. I submit a libFuzzer job as my first job, and I want it to use 1 VM and all the cores on that VM. How do I specify that?

Can we run multiple jobs on the same VM?

What is the finest granularity of compute power we can specify for a job?

oliverchang commented 4 years ago

You can make that work, but it's not a supported use case out of the box: you'd have to start individual instances of ClusterFuzz bots on the one VM for them to be picked up as separate workers. A single bot by design does not run multiple jobs in parallel.

Is there an issue with using more, but smaller, VMs with 1 CPU core each?

hpathuri commented 4 years ago

For AFL, how would these share a corpus?

hpathuri commented 4 years ago

We want to schedule a fuzzing job on, say, N VMs with a given number of CPUs. How do we specify that?

How do we add more VMs to an existing job, or reduce them?

hpathuri commented 4 years ago

Would the corpus be archived for future use if we were to run new jobs that leverage the previous corpus?

oliverchang commented 4 years ago

> For AFL, how would these share a corpus?

Do you mean whether AFL targets will share corpora with libFuzzer? The default behaviour is to share them, per https://github.com/google/clusterfuzz/blob/master/src/local/butler/scripts/setup.py#L60.

> We want to schedule a fuzzing job on, say, N VMs with a given number of CPUs. How do we specify that? How do we add more VMs to an existing job, or reduce them?

We don't support specifying a specific number of CPUs per job. The model we use is that an entire ClusterFuzz deployment has a pool of resources that is distributed among all the jobs it runs.

Each job's weight is the number of targets detected for that job (which acts as a multiplier) times a custom weight. The custom weight is currently not exposed in the UI. Would exposing it be useful for you?
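
To make the weighting concrete, here is a minimal sketch of how such weights could translate into shares of the resource pool. It is not taken from the ClusterFuzz source: the `Job` class, the `job_weight` and `pool_shares` helpers, and the default custom weight of 1.0 are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical stand-in for a job; not a ClusterFuzz data type."""
    name: str
    target_count: int            # fuzz targets detected for the job
    custom_weight: float = 1.0   # assumed default for the custom weight


def job_weight(job):
    """Weight = number of detected targets x custom weight."""
    return job.target_count * job.custom_weight


def pool_shares(jobs):
    """Return each job's expected fraction of the shared resource pool."""
    total = sum(job_weight(job) for job in jobs)
    if total == 0:
        raise ValueError("at least one job needs a non-zero weight")
    return {job.name: job_weight(job) / total for job in jobs}


if __name__ == "__main__":
    jobs = [Job("job_a", 10), Job("job_b", 10, custom_weight=3.0)]
    for name, share in pool_shares(jobs).items():
        print(f"{name}: {share:.0%} of the pool")
```

In this example both jobs have 10 targets, but `job_b` carries a custom weight of 3.0, so it would be expected to receive three quarters of the pool.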

> Would the corpus be archived for future use if we were to run new jobs that leverage the previous corpus?

Nothing deletes the corpus on GCS. If the target names still match, new runs will use whatever state the corpus was in before.

hpathuri commented 3 years ago

Could you give a concrete example? Say we have provisioned 1000 cores and have 20 distinct software modules that we want to fuzz. At the beginning, we would have 20 different jobs, and the expectation is that each job gets 50 VMs. Later on we add 2 additional jobs for module 1 and module 2. How much compute is assigned to all the jobs associated with modules 1 and 2? Would modules 1 and 2 get more compute power than the other 18 modules?

inferno-chromium commented 3 years ago

This will get equally distributed whenever new jobs are adjusted; see https://github.com/google/clusterfuzz/blob/master/src/appengine/handlers/cron/fuzzer_and_job_weights.py. You can also see the weights in the FuzzerJob entities. There are slightly different priorities for different sanitizers and fuzzing engines; see https://github.com/google/clusterfuzz/blob/master/src/appengine/handlers/cron/fuzzer_and_job_weights.py#L47.
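
The 1000-core scenario from the question can be worked through as plain arithmetic. This sketch rests on assumptions that the thread does not confirm: every job ends up with the same weight, the pool is split per job rather than per module, and "2 additional jobs" means one extra job each for module 1 and module 2 (22 jobs in total). The `cores_per_module` helper and the module names are hypothetical.

```python
from fractions import Fraction

TOTAL_CORES = 1000


def cores_per_module(jobs_per_module, total_cores=TOTAL_CORES):
    """Split cores evenly per job, then add up each module's jobs.

    Assumption: all jobs carry the same weight.
    """
    total_jobs = sum(jobs_per_module.values())
    per_job = Fraction(total_cores, total_jobs)
    return {module: per_job * count for module, count in jobs_per_module.items()}


# 20 modules with one job each.
before = {f"module{i}": 1 for i in range(1, 21)}
# Assumed reading of the question: one extra job each for modules 1 and 2.
after = dict(before, module1=2, module2=2)

if __name__ == "__main__":
    for label, layout in (("before", before), ("after", after)):
        shares = cores_per_module(layout)
        print(label, {m: round(float(c), 1) for m, c in shares.items() if m in ("module1", "module3")})
```

Under those assumptions each module starts with 50 cores. After the two jobs are added, a single-job module gets 1000/22, about 45.5 cores, and modules 1 and 2 get about 90.9 cores each, so they would receive more compute than the other 18 modules.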