RemoteExecutionUsed: Avoid false positives with stricter checks

saraadams commented 2 years ago

RemoteExecutionUsedDataProvider is currently too lenient in how it checks whether remote execution was likely used. This leads to false positives, in particular when using remote caching or a disk cache. Instead of checking for events with the category "Remote execution process wall time", which also occur in non-RE builds, this PR checks for events with the category "remote action execution" and the name "execute remotely". This seems to be a stronger signal for remote execution.
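
To illustrate the difference between the two checks, here is a minimal Python sketch. It is not the analyzer's actual code. It assumes that profile events have already been loaded as dictionaries with "cat" and "name" fields, as in the Chrome trace event format used by Bazel's JSON profile. The function names and the sample events are hypothetical; only the category and name strings are taken from the description above.

```python
# Category and name strings as quoted in the PR description.
LENIENT_CATEGORY = "Remote execution process wall time"
STRICT_CATEGORY = "remote action execution"
STRICT_NAME = "execute remotely"


def lenient_check(trace_events):
    """Old behaviour (sketch): any event in the lenient category counts."""
    return any(event.get("cat") == LENIENT_CATEGORY for event in trace_events)


def stricter_check(trace_events):
    """New behaviour (sketch): require both the category and the event name."""
    return any(
        event.get("cat") == STRICT_CATEGORY and event.get("name") == STRICT_NAME
        for event in trace_events
    )


# Hypothetical events from a build that only used a disk cache.
# The event name "compile foo" is made up for illustration.
disk_cache_build = [
    {"cat": LENIENT_CATEGORY, "name": "compile foo"},
]

# Hypothetical events from a build that actually executed an action remotely.
remote_build = [
    {"cat": LENIENT_CATEGORY, "name": "compile foo"},
    {"cat": STRICT_CATEGORY, "name": STRICT_NAME},
]

print(lenient_check(disk_cache_build), stricter_check(disk_cache_build))
print(lenient_check(remote_build), stricter_check(remote_build))
```

In this sketch the lenient check flags the disk-cache build as using remote execution, while the stricter check flags only the build that contains an "execute remotely" event.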

Progress on #60

fmeum commented 2 years ago

I tried this on Jazzer with --disk_cache and --noslim_profile and obtained this profile. With both this PR and current main, I get the recommendation to increase jobs to 12, with a local CPU count of 8.

Happy to test again!

saraadams commented 2 years ago

Thanks for the example profile. I ran the analyzer over it, too, but could not reproduce your report. I get:

Suggestion: "Increase the number of cores"
Add more cores to parallelize actions more. You can achieve this by using a machine with more CPUs or by utilizing remote execution.
An optimal speedup is expected by increasing the number of cores to 12 or more.
Potential improvement
The duration of the invocation can potentially be reduced by 32.99%.
[..................................XXXXXXXXXXXXXXXX]
The invocation's duration might go down to 54s, compared to the current 1m 21s. This assumes the execution phase duration can be reduced to the critical path duration.
Rationale
This invocation's critical path has a duration of 40s, whereas the total execution phase has a duration of 1m 6s. In an ideally parallelized invocation, the critical path dominates the execution phase, but here it takes up only 59.91%. This indicates that actions are not as parallelized as much as they could be.
It looks like 8 cores were used for this invocation.
Caveats
- The number of cores used for this invocation is an approximation. It includes both physical and virtual cores.

It does not suggest setting --jobs, but rather suggests that the invocation would be faster if more cores were available. It also deduced from the profile that likely 8 cores were used.
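
For reference, the arithmetic behind the numbers in that output can be roughly sketched as below. This is an approximation under stated assumptions, not the analyzer's implementation: the function name is hypothetical, and the inputs are the rounded durations shown in the output (1m 21s total, 1m 6s execution phase, 40s critical path). The analyzer presumably works with unrounded values, so the results land close to, but not exactly on, the reported figures.

```python
def parallelization_estimate(total_s, execution_phase_s, critical_path_s):
    """Estimate the gain if the execution phase shrank to the critical path.

    Hypothetical helper; all durations are in seconds.
    """
    # Time that could be saved with ideal parallelization.
    saved_s = execution_phase_s - critical_path_s
    return {
        # Share of the execution phase taken up by the critical path.
        "critical_path_share_pct": 100 * critical_path_s / execution_phase_s,
        # Potential reduction relative to the whole invocation.
        "reduction_pct": 100 * saved_s / total_s,
        # Estimated invocation duration after the reduction.
        "new_total_s": total_s - saved_s,
    }


# Rounded durations from the analyzer output: 1m 21s, 1m 6s, 40s.
estimate = parallelization_estimate(total_s=81, execution_phase_s=66, critical_path_s=40)
for key, value in estimate.items():
    print(f"{key}: {value:.2f}")
```

With these rounded inputs the sketch gives a critical path share of about 60.6%, a potential reduction of about 32.1%, and a new duration of 55s, compared with the 59.91%, 32.99%, and 54s reported by the analyzer.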

Can you confirm the commit id you are on when you're receiving a --jobs suggestion from the analyzer?

fmeum commented 2 years ago

@saraadams You are right, I didn't notice that the explanation and title changed. Looks good!

saraadams commented 2 years ago

Thanks for following up!

EngFlow / bazel_invocation_analyzer

RemoteExecutionUsed: Avoid false positives with stricter checks #64