ClusterLabs / anvil

The Anvil! Intelligent Availability™ Platform, mark 3

striker-collect-debug compress speed vs size optimization #622

Open fabbione opened 3 months ago

fabbione commented 3 months ago

We recently noticed that debug collection time in CI is extremely high. Initially we thought that it was due to collecting screenshots (see #620), but in reality that's not the case.

The anvil.log files are in the order of GBs after only a one-hour run. They take a very long time to compress with bzip2 in its default configuration.

A decent compromise for CI would be to add a new option to striker-collect-debug, 'fast-compress', that sets bzip2 -1 for the final tarball creation.

In a couple of simple benchmarks, this cut the compression time in half, at the cost of doubling the final tarball size. Customers can then choose what they prefer based on their storage/network settings.
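To make the proposal concrete, here is a minimal Python sketch of a speed/size switch applied at tarball creation. It is not the striker-collect-debug implementation (that tool is not written in Python); the function name `make_debug_tarball` and its `fast` parameter are hypothetical, chosen only for illustration. The sketch relies on the standard library `tarfile` module, which accepts a `compresslevel` argument for bzip2 archives.

```python
import tarfile
from pathlib import Path


def make_debug_tarball(src_dir, out_path, fast=False):
    """Pack src_dir into a bzip2 tarball and return the archive size in bytes.

    fast=True selects bzip2 level 1 (the equivalent of `bzip2 -1`);
    otherwise level 9 is used, which matches bzip2's default.
    Hypothetical helper, for illustration only.
    """
    level = 1 if fast else 9
    src = Path(src_dir)
    with tarfile.open(str(out_path), "w:bz2", compresslevel=level) as tar:
        # Store entries under the directory's own name rather than its full path.
        tar.add(str(src), arcname=src.name)
    return Path(out_path).stat().st_size
```

Calling it once with `fast=True` and once without on the same debug directory, and timing both calls, would be one way to check the time and size trade-off on real data.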

I have a patch ready in case we agree to go down this route.

fabbione commented 3 months ago

[root@an-striker01 ~]# du -sh /tmp/anvil-debug_2024-03-29_07-10-56/
5.9G    /tmp/anvil-debug_2024-03-29_07-10-56/

[root@an-striker01 ~]# uptime
07:33:06 up 2:46, 2 users, load average: 3.59, 3.43, 3.09

fabbione commented 3 months ago

As a temporary workaround, I have disabled the striker-collect-debug log collection in CI when the build is successful, which should save a lot of time for now.

digimer commented 3 months ago

Speed is more important in CI, so switching to a faster compression is fine. I recently experimented with lz4, and it seems good too.

fabbione commented 3 months ago

Whatever we use needs to be available in RHEL 8/9/etc.

digimer commented 3 months ago

I believe lz4 is available, but I have no preference. Whatever is a good balance is fine with me.

fabbione commented 3 months ago

It means pulling more packages into anvil deployments. gz might do just fine, or xz. Let's see. It's no longer critical with the workaround in place.
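For anyone wanting to compare the candidates before choosing, a rough sketch along these lines could help. It times the gzip, bzip2 and xz codecs that ship in Python's standard library on synthetic log-like data; lz4 is left out because it has no standard-library binding. The function name `compare_codecs` and the sample log lines are made up for illustration, and numbers from synthetic input should not be read as representative of real anvil.log files.

```python
import bz2
import gzip
import lzma
import time


def compare_codecs(data):
    """Return {label: (seconds, compressed_size)} for a few stdlib codecs."""
    codecs = {
        "gzip -1": lambda d: gzip.compress(d, compresslevel=1),
        "bzip2 -1": lambda d: bz2.compress(d, compresslevel=1),
        "bzip2 -9": lambda d: bz2.compress(d, compresslevel=9),
        "xz -0": lambda d: lzma.compress(d, preset=0),
    }
    results = {}
    for label, compress in codecs.items():
        start = time.perf_counter()
        out = compress(data)
        results[label] = (time.perf_counter() - start, len(out))
    return results


if __name__ == "__main__":
    # Synthetic, repetitive log-like input; real log content will differ.
    sample = b"".join(
        b"2024/03/29 07:10:%02d [ Debug ] - scan pass %d complete\n" % (i % 60, i)
        for i in range(50_000)
    )
    for label, (seconds, size) in compare_codecs(sample).items():
        print(f"{label:9s} {seconds:8.3f}s {size:10d} bytes")
```

Running the same comparison against a copy of an actual debug directory would give a better basis for deciding between gz, xz and bzip2 -1.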