Open Conan-Kudo opened 8 years ago
Createrepo_c does a lot of things in threads (almost everything that could be reasonably parallelized is parallelized) [1] [2] [3]. RPM headers are read and dumped (generation of xml chunks) in threads, checksums are calculated in threads, repodata files are written and compressed simultaneously in threads (one thread per file) also deltarpms are generated in threads.
I looked on this and switched lzma (xz) encoder from lzma_easy_encoder
to threaded lzma_stream_encoder_mt
[4], here are the results:
https://gist.github.com/Tojaj/cfa35863d481e5f3428d
The performance is worse. This has been just a quick and dirty benchmark, I'll take a look on this in more detail soon (I'll try to tweak configuration a bit) but for now, as expected, it seems that adding more threads to already parallelized tasks doesn't bring any performance boost but brings overhead which causes performance degradation.
[1] https://github.com/rpm-software-management/createrepo_c/blob/master/src/dumper_thread.c#L298 [2] https://github.com/rpm-software-management/createrepo_c/blob/master/src/threads.h [3] https://github.com/rpm-software-management/createrepo_c/blob/master/src/deltarpms.c#L302 [4] http://git.tukaani.org/?p=xz.git;a=blob_plain;f=doc/examples/04_compress_easy_mt.c;hb=HEAD
Thank you for the patch, I just tried it on my 4-core machine:
With this patch: real 1m22.385s Without this patch: real 1m35.695s
So indeed the improvement is not great.
I then tried on the Mageia upload server (32 cores) and this is a different story:
With this patch: real 0m50.894s Without this patch: real 1m45.635s
@pterjan Thanks for the details! I'm glad that it helped! I'll keep it this as a compile time option yet, but maybe make it available via command line param in future (once threaded decompression is available for XZ too).
I going to leave this issue open until that time.
@Tojaj As far as I knew, it's also available in xz. It was made available the same time, I thought?
@Conan-Kudo according to author, the release notes were wrong and only compression is supported so far
@Conan-Kudo found it: http://permalink.gmane.org/gmane.comp.compression.xz.devel/226
@Tojaj ah, thanks for the reference.
Suggestion: switch to zlib-ng instead. The speedup is significant and probably comparable or better than multi-threading the current zlib.
https://github.com/zlib-ng/zlib-ng/discussions/871
It is available in Fedora, though not EL.
Also, Xz is no longer really the best option, now that Zstd exists.
It is not a problem but I do not see anyone using zlib-ng
in Fedora. And if we will be the only user in RHEL it means additional work and responsibility for us.
Please consider also required compatibility => requirement of backports to distribution with the longer lifecycle (let say 10 years). A also recommend to test the the performance using a real repository data. Sometimes it can differs.
I'm not actually suggesting it be done immediately, I'm just saying (on a 6 year old RFE) that now that we're in the future this would be a better approach than the original recommendation.
I have actually tried it, not with createrepo_c but separately, and it was roughly twice as fast. With repodata specifically
zlib-ng can be built with a compatible API to the original zlib, so it may be possible to propose as a system-wide change in Fedora (eventually). And then all applications would benefit and no changes would be needed here.
Why not go ahead and propose that for F38?
Would it come with the expectation that I would be responsible for seeing it through? Because I definitely don't have the C background necessary to do that for a core library.
You're underselling yourself, but you should probably involve the maintainer of zlib and zlib-ng in the Change so they can help you.
May this can be useful - here is a link of the change of RPM compression - https://fedoraproject.org/wiki/Changes/Switch_RPMs_to_zstd_compression. I know it is a different thing but may be it can be helpful.
@kontura IMO the switch to zstd as default means we can drop this.
@Conan-Kudo can we close this?
While Mageia was working on implementing usage of createrepo_c to regularly generate rpm-md data, it was discovered that it was much slower on compressing xz metadata than gzip. Looking through the code on compression, it doesn't appear that you're using multi-threaded compression.
As of xz 5.2.0, the multi-threaded API is now stable and can be used in production. The documentation for this was incorrect in 5.2.0 (indicating only decompression support) but that was fixed in 5.2.2.
You may want to implement threaded (de)compression for bzip2 and gzip too.
See the Mageia bug report for more information.