Closed dirkmueller closed 12 months ago
An 18% compute tradeoff for a ~1% improvement might make sense for repos which are frequently downloaded, but the tradeoff is questionable for repos that will only be used a couple dozen, hundred or perhaps thousand times.
@praiskup Do you have an opinion on this? Would you classify it as positive, negative or basically irrelevant for a service like COPR?
In Copr the "production vs. consumption" ratio is much higher compared to normal distributions, and the compression performance matters. So as long as we could opt-out I think it is a more or less neutral proposal, but without an option this is negative. I'm not sure what would be the effect of this after moving Copr repos to PULP.
@praiskup without further patches, overall createrepo_c is I/O bound, not cpu bound. the shift from 9 to 10 is merely 20% increase which is well in the noise taking I/O overhead into account. I've benchmarked this extensively for Open Build Service usecases which is similar to COPR, and it is neutral.
Copr's createrepo_c
runs are (very often) CPU bound. We use --recycle-pkglist
and similar options most of the time to minimize I/O when re-generating repositories.
@praiskup understood, we're doing the same. still profiling indicated that the compression only adds 30% of the total runtime. anyway, if you feel strongly, I can make it an option, and then I would even be able to go to higher compression levels.
@praiskup just let me know if an extra option for 5% runtime (20% of 30%) is needed, and I'll do a PR
I do not recall all the details from https://github.com/rpm-software-management/createrepo_c/pull/191 now, though you might be right that 70% is XML dumping, and 30% compression. But I/O is often more or less zero (adding one package into a repository with 100k RPMs and more).
Having an option would be nice. I missed that this is already merged; then others probably think it is worth it, let's deal with the option later. We'll report back.
This requires a good amount of saving due to larger dictionary size with only marginal more time effort for compression. decompression time is unaffected.