bestouff / genext2fs

genext2fs - ext2 filesystem generator for embedded systems
GNU General Public License v2.0

genext2fs is painfully slow for multi-GB input #31

Open josch opened 2 years ago

josch commented 2 years ago

Hi,

I'm now using genext2fs with multi-GB tarballs as input. While this works well, it takes several hours on my machine, so I profiled genext2fs: gprof.txt

If I read the profiling output correctly, then most time is spent in the function allocate().

Do you have any ideas how to improve the speed by introducing better data structures?
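
To illustrate why the choice of data structure could matter here, the following is a minimal sketch under a stated assumption: that each block allocation searches an allocation bitmap from the beginning. This is a hypothetical model, not the actual genext2fs allocate() code, and the function names are made up for illustration. It counts how many bitmap entries are probed with and without remembering where the previous search ended.

```python
# Hypothetical model, not genext2fs code: compare two ways of finding a free
# block in an allocation bitmap, counting how many bitmap entries are probed.

def probes_scan_from_start(n_blocks):
    """Allocate n_blocks one at a time, always scanning from index 0."""
    bitmap = [False] * n_blocks
    probes = 0
    for _ in range(n_blocks):
        i = 0
        while bitmap[i]:      # skip over every block already in use
            probes += 1
            i += 1
        probes += 1           # the probe that found the free block
        bitmap[i] = True
    return probes


def probes_with_hint(n_blocks):
    """Same allocation pattern, but start each search where the last one ended."""
    bitmap = [False] * n_blocks
    probes = 0
    hint = 0                  # lowest index that might still be free
    for _ in range(n_blocks):
        i = hint
        while bitmap[i]:
            probes += 1
            i += 1
        probes += 1
        bitmap[i] = True
        hint = i + 1          # everything below this index is now in use
    return probes


if __name__ == "__main__":
    for n in (1_000, 2_000, 4_000):
        print(n, probes_scan_from_start(n), probes_with_hint(n))
```

In this model, doubling the number of blocks roughly quadruples the work for the scan-from-start variant, while the hinted variant grows linearly. The hint is only valid as long as no block below it is freed; an allocator that frees blocks would need to lower the hint accordingly.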

bestouff commented 2 years ago

Hi @josch,

Indeed, I have some ideas to mitigate this. I'm currently a bit short on time, but I may try something.
Do you have an easy way of reproducing the problem?

josch commented 2 years ago

The "easy" way is just to throw a big tarball at it. :smile:

For example here is a big system image: https://mister-muffin.de/reform/target-userland-full.tar

gelrom commented 1 year ago

Any luck looking into this?

I've hit this issue as well. For me, with a ~10 GB tar, it seems to basically never complete (on a very powerful machine), whereas e.g. virt-make-fs takes ~30 min.

gelrom commented 1 year ago

Some quick benchmarks that I did make me think there is something highly nonlinear going on:

100 MB: ~1 s
500 MB: ~10 s
800 MB: ~27 s
900 MB: ~71 s
1 GB: ~130 s

Note: each measurement used a tar containing a single file of the given size.
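
One way to quantify "highly nonlinear" is to compute the log-log slope between consecutive measurements: a slope of 1 means runtime grows linearly with input size, 2 means quadratically. The sketch below does this for the rough timings quoted above, treating 1 GB as 1000 MB (an assumption, since the exact sizes are not stated).

```python
import math

# Timings copied from the comment above: (input size in MB, runtime in seconds).
# 1 GB is taken as 1000 MB, which is an assumption.
measurements = [(100, 1), (500, 10), (800, 27), (900, 71), (1000, 130)]


def local_exponents(points):
    """Return the log-log slope between each pair of consecutive points."""
    slopes = []
    for (s0, t0), (s1, t1) in zip(points, points[1:]):
        slopes.append(math.log(t1 / t0) / math.log(s1 / s0))
    return slopes


if __name__ == "__main__":
    slopes = local_exponents(measurements)
    for (s0, _), (s1, _), k in zip(measurements, measurements[1:], slopes):
        print(f"{s0} -> {s1} MB: runtime ~ size^{k:.1f}")
```

With these numbers, the slope is already above 1 between 100 MB and 500 MB and rises well above 2 for the largest inputs, so the slowdown is steeper than a plain quadratic. The timings are approximate, so the slopes are only indicative.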

josch commented 1 year ago

I observed the same non-linear behavior. Since this is breaking my use-case for genext2fs I instead worked on a patch for e2fsprogs that would allow it to use a tarball as input: https://github.com/tytso/e2fsprogs/pull/118

pamolloy commented 1 year ago

I'm trying to build an 8G image using genimage, and genext2fs -d ... has been running for at least 30 minutes. I haven't managed to get it to finish yet. I also tried -a rootfs.tar, and ran into a locale issue with a downloaded tar and a segfault with a tar I created.

pamolloy commented 1 year ago

I switched to mke2fs by setting use-mke2fs = "true" in my genimage.cfg, which seems to work without issue and completes in less than a minute.

josch commented 1 year ago

@pamolloy did the locale issue look something like this:

archive_read_next_header(): Pathname can't be converted from UTF-8 to current locale.

If yes, maybe try out https://github.com/bestouff/genext2fs/pull/30 and tell me if that fixes your issue?

As for the slowness, I do not know how to fix genext2fs, but if you want tarball input, then maybe https://github.com/tytso/e2fsprogs/pull/118 is of interest to you.