Open josch opened 2 years ago
Hi @josch,
indeed I have some ideas to mitigate this; I'm currently a bit short on time but I may try something.
Do you have an easy way of reproducing the problem?
The "easy" way is just to throw a big tarball at it. :smile:
For example here is a big system image: https://mister-muffin.de/reform/target-userland-full.tar
Any luck looking into this?
I've hit this issue as well. For me, with a ~10 GB tar, it seems to basically never complete (on a very powerful machine), whereas e.g. virt-make-fs takes ~30 min.
Some quick benchmarks that I did make me think there is something highly nonlinear going on:
100 MB: ~1 s
500 MB: ~10 s
800 MB: ~27 s
900 MB: ~71 s
1 GB: ~130 s
note: these were done with a tar of a single file of the above sizes.
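To put a number on the nonlinearity, here is a minimal sketch that divides each reported time by its input size. The timings are the ones quoted above; treating "1gb" as 1000 MB is an assumption, and the struct and function names are made up for illustration. If the runtime were linear in the input size, the cost per megabyte would stay roughly constant.

```c
#include <assert.h>
#include <stddef.h>

/* One benchmark data point: input size and wall-clock time. */
struct sample {
    double size_mb;
    double seconds;
};

/* Timings quoted in the thread. Reading "1gb" as 1000 MB is an assumption. */
const struct sample samples[] = {
    {100.0, 1.0}, {500.0, 10.0}, {800.0, 27.0}, {900.0, 71.0}, {1000.0, 130.0},
};
const size_t n_samples = sizeof samples / sizeof samples[0];

/* Cost per megabyte; this stays constant if runtime is linear in size. */
double seconds_per_mb(struct sample s)
{
    assert(s.size_mb > 0.0);
    return s.seconds / s.size_mb;
}

/* Returns 1 if the per-megabyte cost rises with every larger input. */
int cost_grows_with_size(const struct sample *v, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (seconds_per_mb(v[i]) <= seconds_per_mb(v[i - 1]))
            return 0;
    return 1;
}
```

With these figures the cost goes from 0.01 s/MB at 100 MB to 0.13 s/MB at 1 GB, and it rises at every step in between, which is what you would expect if some per-block operation gets slower as the image fills up.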
I observed the same non-linear behavior. Since this is breaking my use-case for genext2fs I instead worked on a patch for e2fsprogs that would allow it to use a tarball as input: https://github.com/tytso/e2fsprogs/pull/118
I'm trying to build an 8G image using genimage and genext2fs -d ... has been running for at least 30 minutes. I haven't managed to get it to finish yet. I tried using -a rootfs.tar and ran into a locale issue with a downloaded tar and a segfault on a tar I created.
Switched to mke2fs using use-mke2fs = "true" in my genimage.cfg, which seems to perform without issue and complete in less than a minute.
@pamolloy did the locale issue look something like this:
archive_read_next_header(): Pathname can't be converted from UTF-8 to current locale.
If yes, maybe try out https://github.com/bestouff/genext2fs/pull/30 and tell me if that fixes your issue?
As for the slowness, I do not know how to fix genext2fs but if you want tarball input, then maybe https://github.com/tytso/e2fsprogs/pull/118 is of interest to you?
Hi,
I'm now using genext2fs with multi-GB tarballs as input. While this works well it also takes several hours on my machine. So I profiled genext2fs: gprof.txt
If I read the profiling output correctly, then most time is spent in the function allocate(). Do you have any ideas how to improve the speed by introducing better data structures?
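One possible direction, as a sketch only: I have not checked how allocate() is actually written. Assuming each call scans a block bitmap from the start to find a free bit, the total work grows quadratically with the number of blocks, which would match the timings reported earlier in this thread. Keeping a cursor that remembers where the last search ended avoids rescanning the used prefix. All names below are hypothetical and do not come from genext2fs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bitmap allocator; none of these names come from genext2fs. */
struct bitmap_alloc {
    uint8_t *bits;     /* one bit per block, 1 = in use */
    uint32_t nblocks;  /* total number of blocks */
    uint32_t cursor;   /* every block below this index is known to be in use */
};

/* Sets up an empty bitmap; returns 0 on success, -1 if out of memory. */
int bitmap_alloc_init(struct bitmap_alloc *a, uint32_t nblocks)
{
    assert(nblocks > 0);
    a->bits = calloc(((size_t)nblocks + 7) / 8, 1);
    if (a->bits == NULL)
        return -1;
    a->nblocks = nblocks;
    a->cursor = 0;
    return 0;
}

void bitmap_alloc_destroy(struct bitmap_alloc *a)
{
    free(a->bits);
    a->bits = NULL;
}

/* Returns the index of a newly allocated block, or UINT32_MAX when full.
 * The scan starts at the cursor instead of at block 0. */
uint32_t bitmap_alloc_get(struct bitmap_alloc *a)
{
    for (uint32_t i = a->cursor; i < a->nblocks; i++) {
        if (!(a->bits[i / 8] & (1u << (i % 8)))) {
            a->bits[i / 8] |= (uint8_t)(1u << (i % 8));
            a->cursor = i + 1;
            return i;
        }
    }
    a->cursor = a->nblocks;
    return UINT32_MAX;
}

/* Frees a block and pulls the cursor back so the block can be reused. */
void bitmap_alloc_put(struct bitmap_alloc *a, uint32_t blk)
{
    assert(blk < a->nblocks);
    a->bits[blk / 8] &= (uint8_t)~(1u << (blk % 8));
    if (blk < a->cursor)
        a->cursor = blk;
}
```

When an image is filled mostly sequentially, as when unpacking a tarball, each allocation in this sketch touches only a few bits instead of walking the whole used prefix. Whether this maps onto genext2fs depends on how allocate() handles block groups and inodes, which the sketch ignores.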