Open vcarceler opened 3 years ago
Seems pretty obvious it's an OpenZFS bug rather than an ostree bug since ext4 is over 10x faster. However, maybe if you run it under perf or similar you can see where all the time is going.
/cc @alexlarsson
I don't know how to use perf, but if you write me the command I will try.
I guess it is something related to hard links on OpenZFS.
I just tried to:
a) Make a new repo: ostree --repo=repo init
b) Download Linux source code: mkdir tree; cd tree; wget https://github.com/torvalds/linux/archive/v5.10-rc3.tar.gz; tar xzf v5.10-rc3.tar.gz; cd ..
c) Make a commit: time ostree --repo=repo commit --branch=foo tree/
The commit takes:
EXT4: real 0m16,251s
OpenZFS: real 30m11,842s
The developers of OpenZFS are asking for information here: https://github.com/openzfs/zfs/issues/11140
That's a good test. What you want to do is run ostree commit under perf record rather than time. To do this unprivileged you probably need to change a sysctl, but perf will tell you. Or you can run it all as root. That will save a perf.data in your current directory. You can probably just send that to the openzfs folks.
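For reference, a minimal sketch of those profiling steps. The sysctl is most likely kernel.perf_event_paranoid, but that is an assumption: the comment above does not name it, and perf prints the exact hint when it refuses to run. The helper name perf_paranoid_level is made up for this example.

```shell
# Print the current perf_event_paranoid level, or "unknown" if it cannot be read.
# The optional path argument exists only to make the helper easy to test.
perf_paranoid_level() {
  f=${1:-/proc/sys/kernel/perf_event_paranoid}
  if [ -r "$f" ]; then
    cat "$f"
  else
    echo unknown
  fi
}

echo "perf_event_paranoid: $(perf_paranoid_level)"

# Then, from the directory that holds repo/ and tree/ (commands from this thread):
#   perf record ostree --repo=repo commit --branch=foo tree/   # writes ./perf.data
#   perf report                                                # reads ./perf.data
```

Which level is low enough for unprivileged profiling depends on the kernel and distribution, so it is probably easiest to follow whatever perf itself suggests, or just run as root.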
To get an idea yourself, run perf report. I just ran your test and for me most of the time is spent in glib's SHA256 routine since it has to checksum all the files in tree to figure out what to store them as in the content addressable objects directory. Next is ext4_mb_regular_allocator, but that's much lower. Anyways, my report is at https://gist.github.com/dbnicholson/ddc6e015d00cdeb1aed16bf9cd2be38d.
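To illustrate why every file has to be checksummed, here is a rough sketch of how a content-addressed object path can be derived from a SHA256 digest. It is deliberately simplified: it hashes only the file content, whereas real OSTree content checksums also cover a metadata header, so these paths will not match a real repo. The function name object_path_for is hypothetical, and the objects/XX/REST.file layout is what I would expect for a bare-style repo.

```shell
# Map a file to an illustrative content-addressed path:
# first two hex digits become a directory, the rest the file name.
# Simplified: hashes content only, unlike a real OSTree content checksum.
object_path_for() {
  sum=$(sha256sum < "$1" | cut -d' ' -f1)
  prefix=$(printf '%s' "$sum" | cut -c1-2)
  rest=$(printf '%s' "$sum" | cut -c3-)
  printf 'objects/%s/%s.file\n' "$prefix" "$rest"
}
```

Usage would be something like object_path_for tree/linux-5.10-rc3/README. Two files with identical content map to the same path, which is why the whole tree has to be read and hashed before anything can be stored.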
The other thing that can be helpful is to run it under strace, like strace -o strace.log ostree ..., which could be useful to the openzfs developers to see what the actual usage patterns from userspace are. Beware that strace.log will be big, though. You could probably use a smaller chunk of the tree like tree/*/Documentation and still get a pretty good idea of what's going on. OSTree doesn't care what the contents of the files are.
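Since strace.log gets large, a quick way to get an overview is to count how often each syscall appears. This is only a sketch: syscall_histogram is a made-up helper name, and it assumes the plain "name(args) = result" line format that strace -o writes when -f is not used (no PID column).

```shell
# Count how often each syscall appears in an strace log,
# most frequent first. Lines that do not start with "name(" are ignored.
syscall_histogram() {
  awk -F'(' '/^[a-z_0-9]+\(/ { count[$1]++ }
             END { for (s in count) print count[s], s }' "$1" | sort -rn
}
```

For example, syscall_histogram strace.log | head -n 20 shows the twenty most common calls, which should make it easier to compare the usage pattern on ext4 and OpenZFS.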
Considering the flatpak use case a bit more, ostree commit is a good stand-in for what happens during the downloading phase while cutting out the actual networking part. The other part of installing a flatpak is checking out the files in the commit. For that, try ostree --repo=repo checkout foo out. Presumably this should be very fast and have very little filesystem interaction since the checkout should be done with hardlinks.
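One way to check whether the checkout really used hardlinks is to count the files in the checkout directory that have more than one link. A small sketch, with hypothetical helper names; whether OSTree hardlinks at all depends on the repo mode and checkout options, so treat the result as a hint only.

```shell
# Count regular files under a directory that have more than one hard link.
count_hardlinked() {
  find "$1" -type f -links +1 | wc -l | tr -d ' '
}

# Count all regular files under a directory, for comparison.
count_files() {
  find "$1" -type f | wc -l | tr -d ' '
}
```

After the checkout, comparing count_hardlinked out with count_files out should show roughly equal numbers if the files are shared with the repo's objects directory rather than copied.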
So, the roundtrip through commit and checkout should be a pretty good simulation for what flatpak is doing without any of the networking. One other thing to note is that flatpak will use a bare-user-only repo, so you can add --mode=bare-user-only to the ostree init call. I don't think it should matter for this test, though.
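Put together, the roundtrip could be scripted roughly like this. It is a sketch that only reuses the ostree commands already mentioned in this thread; the helper names timed and roundtrip are invented here, and the timing is in whole seconds via date, which is coarse but enough to tell seconds from minutes.

```shell
# Run a command and report how long it took, in whole seconds.
timed() {
  label=$1
  shift
  start=$(date +%s)
  "$@"
  status=$?
  end=$(date +%s)
  echo "$label: $((end - start))s"
  return "$status"
}

# Hypothetical helper: init a bare-user-only repo, commit a tree, check it out.
# Arguments: repo directory, source tree, checkout directory.
roundtrip() {
  repo=$1
  tree=$2
  out=$3
  timed init ostree --repo="$repo" init --mode=bare-user-only || return
  timed commit ostree --repo="$repo" commit --branch=foo "$tree" || return
  timed checkout ostree --repo="$repo" checkout foo "$out"
}
```

Running roundtrip repo tree/ out once on ext4 and once on OpenZFS, each time starting from a fresh repo and checkout directory, should show which of the three steps accounts for the difference.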
Ok, repo initialized with: ostree --repo=repo --mode=bare-user-only init
Data downloaded with: mkdir tree; cd tree; wget https://github.com/torvalds/linux/archive/v5.10-rc3.tar.gz; tar xzf v5.10-rc3.tar.gz; cd ..
And the results of perf record ostree --repo=repo commit --branch=foo tree/ run as root on my laptop with OpenZFS: https://gist.github.com/vcarceler/eb645fc70e2f8f69f90785032a1ed4d9
And here is perf.data -> https://cloud.elpuig.xeill.net/index.php/s/D3JCFuoaLdL43oT
And after 17 minutes of strace -o strace.log ostree --repo=repo commit --branch=foo tree/ I obtain this strace.log -> https://cloud.elpuig.xeill.net/index.php/s/PiS3wEHTm3oIF7D
System information
Describe the problem you're observing
Installing a flatpak application takes too long, most likely because of the way OSTree replicates files using hard links.
You can observe a lot of iowait time.
Describe how to reproduce the problem
sudo -s
apt update
apt install flatpak
time flatpak install https://dl.flathub.org/repo/appstream/ch.openboard.OpenBoard.flatpakref
Takes 30 seconds on ext4 and 6 minutes on OpenZFS.
Include any warning/errors/backtraces from the system logs
No errors just poor performance.