envoyproxy / envoy-perf

Envoy performance testing
https://envoyproxy.io
Apache License 2.0
130 stars 36 forks source link

[Salvo] Nighthawk is built every time we test an Envoy image #129

Closed gyohuangxin closed 1 year ago

gyohuangxin commented 2 years ago

With scavenging mode and binary mode, we build Nighthawk every time we test an Envoy image, which wasted much time for rebuilding if we have many images to be tested. https://github.com/envoyproxy/envoy-perf/blob/main/salvo/src/lib/benchmark/scavenging_benchmark.py#L75

gyohuangxin commented 2 years ago

/assign @gyohuangxin

repokitteh-read-only[bot] commented 2 years ago

gyohuangxin is not allowed to assign users.

:cat: Caused by: a https://github.com/envoyproxy/envoy-perf/issues/129#issuecomment-1056372568 was created by @gyohuangxin. see: [more](https://github.com/envoyproxy/envoy-perf/issues/129#issuecomment-1056372568), [trace](https://prod.repokitteh.app/traces/ui/envoyproxy/envoy-perf/6afa1640-99f3-11ec-8682-e98121cfc745).
gyohuangxin commented 2 years ago

@mum4k I'm looking at this issue, can you assign it to me? Thanks!

mum4k commented 2 years ago

Thank you @gyohuangxin, assigned.

xu1zhou commented 2 years ago

Hi @gyohuangxin , I think salvo clones and builds binary under different dirs every time make it unable to reuse the code and cache. I have a idea that we can generate dir name base on hash of salvo job configuration file to make sure same job manages to clone and build at same dir. What's your opinion on this?

gyohuangxin commented 2 years ago

@xu1zhou Yes, you're correct. Nighthawk binary files will be built multiple times in different dirs. I think it's a good idea but it may be a huge change (just my personal guess, haven't verified it yet). Here is another idea which may change less files, we can name the nighthawk build dir based on the salvo job, then check if the nighthawk build dir exists before building Nighthawk.

I haven't much cycle on this issue now, please feel free to take this issue and raise PR. These two methods are both feasible, you can chose either.

xu1zhou commented 2 years ago

@gyohuangxin thanks for your suggestion!

xu1zhou commented 2 years ago

Some points I found after going through the code:

mum4k commented 2 years ago

Thanks for looking into this @xu1zhou. It is true that the current code expects a tempdir object in multiple locations, but that doesn't have to stay that way.

We could define our own object that would selectively wrap tempdir or a selected directory under the same API and pass this object around instead of tempdir. If you choose to do this, please send the definition of our own object and the following refactor from tempdir in separate PRs, so their scope remains fairly small. This will allow us to discuss before we over-invest in an implementation.

gyohuangxin commented 2 years ago

@xu1zhou Thanks for report your findings. I can understand your confusion, envoy-perf/salvo/src/lib/source_tree.py is used for both Nighthawk and Envoy building process, the object changes will also have affects on Envoy.

Looked at TemporaryDirectory function, can we use suffix or prefix arguments to identify whether it is the duplicate nighthawk building? If it doesn't work, as @mum4k mentioned, you can create own objects.

xu1zhou commented 2 years ago

@gyohuangxin tempobject with suffix/prefix arguments only joins specified suffix/prefix with random string as name implies. These objects are different with whole path like /tmp/salvo/prefix-8avnbwxt-suffix

@xu1zhou Thanks for report your findings. I can understand your confusion, envoy-perf/salvo/src/lib/source_tree.py is used for both Nighthawk and Envoy building process, the object changes will also have affects on Envoy.

Looked at TemporaryDirectory function, can we use suffix or prefix arguments to identify whether it is the duplicate nighthawk building? If it doesn't work, as @mum4k mentioned, you can create own objects.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had activity in the last 90 days. It will be closed in the next 14 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions[bot] commented 1 year ago

This issue has been automatically closed because it has not had activity in the last 104 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.