mwleeds opened 3 years ago
Hmm, not much I can think to do better there since you have to compress the objects sooner or later. The amount of time looking up symbols in the dynamic linker seems odd, though.
One thing you could do is turn the compression level way down. I've never actually done this, but I believe in the destination repo you would put this in the config:
[archive]
zlib-level=1
Whereas the default is 6. If you're only using archive mode for the handling of metadata and not for the size savings, then this might be a win since you'd presumably spend less time in zlib. See https://github.com/ostreedev/ostree/blob/master/src/libostree/ostree-repo.c#L2964, but I'm not totally sure that's used when pulling objects.
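To make the tradeoff concrete, here's a minimal sketch (plain Python zlib, not ostree code) comparing level 1 against the default level 6 — level 1 trades some output size for less CPU, which is the whole point of the suggestion above:

```python
import zlib

# A compressible stand-in for ostree object data.
data = b"ostree object payload " * 5000

fast = zlib.compress(data, 1)     # zlib-level=1: least CPU, larger output
default = zlib.compress(data, 6)  # zlib-level=6: the default

# Both round-trip losslessly; only CPU cost and output size differ.
assert zlib.decompress(fast) == data
assert zlib.decompress(default) == data

print("raw:", len(data), "level 1:", len(fast), "level 6:", len(default))
```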
Interesting idea, but it doesn't seem to make a measurable difference, at least for runtime/org.videolan.VLC.Plugin.makemkv/x86_64/3-19.08. With the lower compression level the repo on the drive ended up being 13 MB instead of 12 MB, but the copy took slightly more time (13.920s vs 13.767s). I deleted the repo and re-initialized it with ostree --repo=/media/mwleeds/updates/.ostree/repo init --mode=archive before each trial.
Sysprof tells a slightly different story than perf: most of the time is spent in kernel space. It would possibly be worth looking into why __fxstatat accounts for so much.
In this case the command was: /usr/bin/ostree --repo=/var/lib/flatpak/repo create-usb /media/mwleeds/updates/ org.flathub.Stable app/org.gnome.Builder/x86_64/stable
@wjt pointed out that the Sysprof info is probably incomplete due to -fomit-frame-pointer being used to compile Fedora.
At hughsie's suggestion, I tried a few different combinations of filesystems and repo modes on the destination drive (the source repo being in bare mode). I deleted and recreated the destination repo in each case and used ostree --repo=/var/lib/flatpak/repo create-usb /run/media/e/updates org.flathub.Stable app/org.libreoffice.LibreOffice/x86_64/stable.
| mode | FAT32 | ext4 |
|---|---|---|
| archive | 14m56s | 1m45s |
| bare | errored out with "fchown: Operation not permitted" | errored out with "fchown: Operation not permitted" |
| bare-user | ostree init errored out with "fsetxattr(user.ostreemeta): Operation not permitted" | 2m40s |
| bare-user-only | errored out with "Writing content object: Operation not permitted" | 2m46s |
The performance difference in archive mode was remarkable, so I ran it again to be sure and got 14m12s and 1m22s the second time, on FAT32 and ext4 respectively. I also unmounted the filesystem soon after the command completed, to be sure the data was actually copied to it and not sitting in a cache. I guess the question here becomes "Why is create-usb so slow when using FAT?"
Rough. My guess is that something in the kernel fat filesystem is not optimized for the pattern ostree is using. I don't know if you have this kind of control in the real scenario, but you could try formatting the filesystem with exfat. Then make sure exfat-fuse isn't installed so you get the kernel's native exfat filesystem.
Also, I think you can simplify your testing a bit by taking the USB out of the setup. You can just create some loopback filesystems. Like so:
truncate -s 2G fat.img
/sbin/mkfs.fat fat.img
fat_loop=$(sudo losetup --show -f fat.img)
mkdir fat
sudo mount -o uid=$(id -u),gid=$(id -g) $fat_loop fat
truncate -s 2G exfat.img
/sbin/mkfs.exfat exfat.img
exfat_loop=$(sudo losetup --show -f exfat.img)
mkdir exfat
sudo mount -o uid=$(id -u),gid=$(id -g) $exfat_loop exfat
truncate -s 2G ext4.img
/sbin/mkfs.ext4 ext4.img
ext4_loop=$(sudo losetup --show -f ext4.img)
mkdir ext4
sudo mount $ext4_loop ext4
sudo chown $(id -u):$(id -g) ext4
When you're done, umount each one and losetup -d each loop device. You can also consider sync -f for ensuring the data is written without unmounting.
> My guess is that something in the kernel fat filesystem is not optimized for the pattern ostree is using. I don't know if you have this kind of control in the real scenario, but you could try formatting the filesystem with exfat. Then make sure exfat-fuse isn't installed so you get the kernel's native exfat filesystem.

Perhaps that's worth trying, but unless we're going to require the stick be re-formatted before allowing a copy to it, that won't help for the (majority of?) USB sticks in the wild that are already FAT-formatted.

> Also, I think you can simplify your testing a bit by taking the USB out of the setup. You can just create some loopback filesystems.
Thanks, that might be useful, but I think we want to stay as close to the real-world scenario as possible when gathering data.
I gathered some backtraces with create-usb running under gdb on each filesystem in case they're illuminating, but I haven't really dug into them yet: create-usb-ext4-backtraces.log create-usb-fat-backtraces.log
@cgwalters let me know if I'm wrong here, but I think the way the data gets copied is via glnx_regfile_copy_bytes. That first tries a reflink, then copy_file_range, then sendfile, and finally just a regular read/write loop.
It would be interesting to know which one of these paths fat ends up in. My guess is that it has to fall back to a read/write loop due to lack of support in the filesystem. Not that anything can be done there, but my guess is that the actual copying of data is where this is taking the most time. On the other hand, if your sysprof profile is right, then maybe fstatat is just excessively slow on fat.
Someone on StackOverflow suggested the async,flush mount options would help, but it is already mounted with flush by default, and when I manually remount it with async that option doesn't show up in the mount output, so I guess it's ignored. It makes sense that it would help for the data to be written to the drive asynchronously with a buffer, but I haven't been able to verify that yet.
@mwleeds could you try it under strace again but with -T -ttt to get timestamps and the time spent in each syscall? Also probably -f, since pull (and glib in general) is multi-threaded.
I wrote a script to test what I was thinking about in https://gist.github.com/dbnicholson/ef93d0cb2b262212566f5e71f04a6502. Here's a strace summary comparison of fat, exfat and ext4:
fat summary:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
25.52 8.188703 306 26740 3663 futex
25.40 8.151031 303 26815 poll
15.19 4.874937 108 45113 39691 newfstatat
7.53 2.415129 144 16742 5420 openat
4.95 1.588341 87 18131 write
4.58 1.469256 285 5144 renameat2
4.34 1.392292 81 17165 read
2.80 0.898118 79 11326 close
2.60 0.834847 79 10553 10028 mkdirat
2.56 0.822568 82 9934 lseek
1.30 0.418655 81 5147 renameat
1.29 0.414929 80 5147 fchmod
1.24 0.396960 82 4790 flistxattr
0.15 0.048280 30 1606 fstat
0.13 0.042539 40 1052 getdents64
0.10 0.030876 120 257 fsync
0.06 0.018448 9224 2 fstatfs
0.05 0.015797 64 246 copy_file_range
0.05 0.014989 59 251 248 ioctl
0.04 0.012151 11 1021 mprotect
0.03 0.008154 75 108 llistxattr
0.02 0.007578 70 108 readlinkat
0.02 0.007388 6 1058 fcntl
0.01 0.003794 3794 1 syncfs
0.01 0.003520 13 263 2 unlinkat
0.00 0.001152 3 321 mmap
0.00 0.000831 63 13 clone
0.00 0.000700 28 25 munmap
0.00 0.000539 49 11 brk
0.00 0.000472 33 14 set_robust_list
0.00 0.000434 39 11 prctl
0.00 0.000371 33 11 madvise
0.00 0.000355 32 11 sched_setattr
0.00 0.000326 29 11 gettid
0.00 0.000229 114 2 fdatasync
0.00 0.000022 7 3 fallocate
0.00 0.000018 0 21 8 stat
0.00 0.000009 9 1 getrandom
0.00 0.000005 5 1 utimensat
0.00 0.000002 2 1 getuid
0.00 0.000000 0 5 rt_sigaction
0.00 0.000000 0 3 rt_sigprocmask
0.00 0.000000 0 3 3 access
0.00 0.000000 0 1 execve
0.00 0.000000 0 1 sysinfo
0.00 0.000000 0 2 2 statfs
0.00 0.000000 0 1 arch_prctl
0.00 0.000000 0 1 set_tid_address
0.00 0.000000 0 3 faccessat
0.00 0.000000 0 6 eventfd2
0.00 0.000000 0 1 prlimit64
0.00 0.000000 0 1 sched_getattr
------ ----------- ----------- --------- --------- ----------------
100.00 32.084745 153 209205 59065 total
exfat summary:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
31.09 4.243226 149 28351 3844 futex
26.60 3.630377 136 26547 poll
12.63 1.723387 38 45113 39691 newfstatat
5.90 0.805443 48 16748 5422 openat
5.33 0.727306 40 17860 write
4.61 0.629284 37 16900 read
2.73 0.372877 32 11330 close
2.69 0.367069 36 9934 lseek
1.95 0.266417 51 5144 renameat2
1.92 0.262108 24 10553 10028 mkdirat
1.37 0.186924 36 5147 fchmod
1.27 0.173778 36 4790 flistxattr
0.78 0.106253 20 5147 renameat
0.38 0.052010 49 1052 getdents64
0.34 0.045983 178 257 fsync
0.15 0.020269 12 1606 fstat
0.05 0.006552 26 246 copy_file_range
0.05 0.006474 6 1014 mprotect
0.04 0.005337 21 253 250 ioctl
0.03 0.003829 3 1058 fcntl
0.03 0.003782 35 108 readlinkat
0.02 0.003132 29 108 llistxattr
0.02 0.002901 10 265 2 unlinkat
0.02 0.002337 7 321 mmap
0.02 0.002168 2168 1 syncfs
0.00 0.000333 12 26 munmap
0.00 0.000159 14 11 brk
0.00 0.000106 17 6 eventfd2
0.00 0.000097 7 13 clone
0.00 0.000081 7 11 madvise
0.00 0.000039 1 21 8 stat
0.00 0.000029 2 11 prctl
0.00 0.000024 8 3 rt_sigprocmask
0.00 0.000011 3 3 faccessat
0.00 0.000011 0 14 set_robust_list
0.00 0.000010 10 1 sched_getattr
0.00 0.000009 0 11 sched_setattr
0.00 0.000008 0 11 gettid
0.00 0.000003 3 1 getrandom
0.00 0.000002 0 5 rt_sigaction
0.00 0.000000 0 3 3 access
0.00 0.000000 0 1 execve
0.00 0.000000 0 2 fdatasync
0.00 0.000000 0 1 sysinfo
0.00 0.000000 0 1 getuid
0.00 0.000000 0 2 2 statfs
0.00 0.000000 0 2 fstatfs
0.00 0.000000 0 1 arch_prctl
0.00 0.000000 0 1 set_tid_address
0.00 0.000000 0 1 utimensat
0.00 0.000000 0 3 3 fallocate
0.00 0.000000 0 1 prlimit64
------ ----------- ----------- --------- --------- ----------------
100.00 13.650145 64 210020 59253 total
ext4 summary:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
31.48 2.913297 102 28370 3811 futex
27.25 2.521613 95 26523 poll
12.56 1.162287 25 45113 39691 newfstatat
5.35 0.495051 27 17868 write
4.39 0.406229 24 16874 read
3.39 0.313524 27 11591 271 openat
2.92 0.269855 23 11324 close
2.81 0.260116 25 10295 fchmod
2.72 0.251473 25 9934 lseek
2.09 0.193444 18 10553 10028 mkdirat
1.68 0.155772 30 5147 linkat
1.37 0.127162 26 4790 flistxattr
1.16 0.107166 20 5147 renameat
0.18 0.016350 10 1606 fstat
0.17 0.015664 14 1052 getdents64
0.12 0.011223 43 257 fsync
0.07 0.006587 6 1058 fcntl
0.04 0.003740 3 1019 mprotect
0.04 0.003655 14 246 copy_file_range
0.04 0.003583 14 250 247 ioctl
0.03 0.002857 2857 1 syncfs
0.03 0.002639 24 108 readlinkat
0.03 0.002635 8 321 mmap
0.03 0.002543 23 108 llistxattr
0.03 0.002388 9 261 2 unlinkat
0.01 0.001196 108 11 madvise
0.01 0.000573 20 28 munmap
0.00 0.000333 25 13 clone
0.00 0.000307 27 11 prctl
0.00 0.000290 26 11 brk
0.00 0.000287 143 2 fdatasync
0.00 0.000256 23 11 gettid
0.00 0.000256 18 14 set_robust_list
0.00 0.000170 15 11 sched_setattr
0.00 0.000162 162 1 execve
0.00 0.000078 26 3 fallocate
0.00 0.000045 2 21 8 stat
0.00 0.000032 10 3 rt_sigprocmask
0.00 0.000029 4 6 eventfd2
0.00 0.000024 24 1 utimensat
0.00 0.000014 4 3 faccessat
0.00 0.000007 2 3 3 access
0.00 0.000004 4 1 arch_prctl
0.00 0.000003 3 1 sched_getattr
0.00 0.000002 0 5 rt_sigaction
0.00 0.000000 0 1 sysinfo
0.00 0.000000 0 1 getuid
0.00 0.000000 0 2 2 statfs
0.00 0.000000 0 2 fstatfs
0.00 0.000000 0 1 set_tid_address
0.00 0.000000 0 1 prlimit64
0.00 0.000000 0 1 getrandom
------ ----------- ----------- --------- --------- ----------------
100.00 9.254921 44 209985 54063 total
Executive summary: everything is much slower on fat.
Well, I think one thing we can say is there's a lot of wasted time in fstatat, considering that 88% of the calls end in errors (presumably ENOENT).
I don't know why create-usb on exfat works for you. For me I get this error:
e@endless:~$ time ostree --repo=/var/lib/flatpak/repo create-usb /run/media/e/CA69-5059/ org.flathub.Stable app/org.libreoffice.LibreOffice/x86_64/stable
error: Importing a04334a21363a6f337aeae6efe55af43b5e42e9b108346acba9be6e790425fac.commit: renameat: Function not implemented
real 0m0.121s
user 0m0.074s
sys 0m0.017s
Just wanted to reproduce your results.
What kernel are you running? Is the USB mounted with fuse?
> What kernel are you running?
$ uname -a
Linux endless 5.8.0-14-generic #15+dev113.1535365beos3.9.2 SMP Sun Jan 31 00:28:42 UTC 2021 x86_64 GNU/Linux
> Is the USB mounted with fuse?
Not sure how to check, whatever happens by default?
Hmm, I was running 5.10 from master. Run findmnt /run/media/e/CA69-5059 or just mount and look for the path you want. If it says exfat in the FSTYPE column, it's the kernel's exfat module. If it says fuseblk, then it's exfat-fuse.
> Run findmnt /run/media/e/CA69-5059 or just mount and look for the path you want. If it says exfat in the FSTYPE column, it's the kernel's exfat module. If it says fuseblk, then it's exfat-fuse.

It says fuseblk.
> Someone on StackOverflow suggested the async,flush mount options would help, but it is already mounted with flush by default, and when I manually remount it with async that option doesn't show up in the mount output so I guess it's ignored.

I think async is just omitted from the options list since it's the default. And it seems there's about a 20-28% performance improvement when async is used:
e@endless:~$ sudo mount -o remount,sync,rw,nosuid,nodev,relatime,uid=1000,gid=1000,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,showexec,utf8,flush,errors=remount-ro,uhelper=udisks2 /dev/sdb1 /run/media/e/3376-D9B3
e@endless:~$ rm -rf /run/media/e/3376-D9B3/.ostree/
e@endless:~$ time (ostree --repo=/var/lib/flatpak/repo create-usb /run/media/e/3376-D9B3/ org.flathub.Stable app/org.libreoffice.LibreOffice/x86_64/stable && umount /run/media/e/3376-D9B3)
real 23m56.483s
user 1m4.541s
sys 0m33.017s
e@endless:~$ sudo mount -o remount,async,rw,nosuid,nodev,relatime,uid=1000,gid=1000,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,showexec,utf8,flush,errors=remount-ro,uhelper=udisks2 /dev/sdb1 /run/media/e/3376-D9B3
e@endless:~$ rm -rf /run/media/e/3376-D9B3/.ostree/
e@endless:~$ time (ostree --repo=/var/lib/flatpak/repo create-usb /run/media/e/3376-D9B3/ org.flathub.Stable app/org.libreoffice.LibreOffice/x86_64/stable && umount /run/media/e/3376-D9B3)
real 17m12.590s
user 1m1.364s
sys 0m7.077s
I repeated both runs and got 21m43s and 17m27s for sync and async respectively.
It would be neat if ostree create-usb could try remounting the drive with async, but it would probably make more sense for eos-updater-prepare-volume to do it, since that is already running as root (at the cost of other users of ostree create-usb not getting this performance benefit).
I believe that the majority of the fstatat calls are in the pull code, to determine whether an object is needed or not. First it checks whether the object is in the objects directory, then whether it's in the transaction's staging directory. The optimization that immediately comes to mind is to enumerate all the objects into a set first and then make the determinations from that. The staging directory is locked exclusively, so there's no chance of concurrent writers. The objects directory isn't, but then you just run the risk of duplicating a pull of an object with a concurrent process.
Another thought is whether faccessat might be cheaper than fstatat if all you care about is existence. I don't know if that's actually a win or not, but it could be interesting to see if it makes a difference at thousands of calls.
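The faccessat idea is easy to sanity-check with a hypothetical micro-benchmark (plain Python, not ostree code): both probes answer "does this object exist?", one via stat and one via access, against paths that mostly don't exist, mimicking the ~88% ENOENT rate in the strace summaries:

```python
import os
import tempfile
import timeit


def probe_stat(paths):
    """Existence check via os.stat() (newfstatat under the hood)."""
    hits = 0
    for p in paths:
        try:
            os.stat(p)
            hits += 1
        except FileNotFoundError:
            pass
    return hits


def probe_access(paths):
    """Existence check via os.access(F_OK) (faccessat under the hood)."""
    return sum(os.access(p, os.F_OK) for p in paths)


# Probe 256 paths that do not exist and compare the two syscalls' cost.
with tempfile.TemporaryDirectory() as repo:
    paths = [os.path.join(repo, f"{i:02x}", "obj.filez") for i in range(256)]
    assert probe_stat(paths) == probe_access(paths) == 0
    print("stat:  ", timeit.timeit(lambda: probe_stat(paths), number=100))
    print("access:", timeit.timeit(lambda: probe_access(paths), number=100))
```

Whether access wins on FAT specifically would need to be measured on the FAT mount itself.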
> Run findmnt /run/media/e/CA69-5059 or just mount and look for the path you want. If it says exfat in the FSTYPE column, it's the kernel's exfat module. If it says fuseblk, then it's exfat-fuse.
Oh yeah it works for me too on master.
e@endless:~$ mount | grep exfat
/dev/sdb1 on /run/media/e/C612-3A01 type exfat (rw,nosuid,nodev,relatime,uid=1000,gid=1000,fmask=0022,dmask=0022,iocharset=utf8,errors=remount-ro,uhelper=udisks2)
e@endless:~$ rm -rf /run/media/e/C612-3A01/.ostree/
e@endless:~$ time (ostree --repo=/var/lib/flatpak/repo create-usb /run/media/e/C612-3A01/ org.flathub.Stable app/org.libreoffice.LibreOffice/x86_64/stable && umount /run/media/e/C612-3A01)
real 3m40.622s
user 1m1.350s
sys 0m4.870s
So, I was only half right on the fstatats. It does happen during the pull via ostree_repo_has_object, but then it also happens when writing out the objects via _ostree_repo_has_loose_object. The former is just a wrapper around the latter that also checks parent repos. If you're in a transaction, these check first in the staging directory and then in the objects directory.
During the pull, before the next needed object is pulled, it's checked to see if it's in the repo. That makes sense since you don't want to pull something you already have. During the writing out phase it checks again to skip a potentially expensive write that might not be needed. Again, that makes sense as you don't want to overwrite something with the same content. Still, that's 4 stats per new object.
Since the application holds a shared lock on the repo during the pull, no prune can be run (that would require an exclusive lock). That locking actually didn't exist when a lot of the pull/write code was written. If you could enumerate all the objects at the beginning of the pull, you could skip all of the stats, with the potential that you might duplicate work with a concurrent application doing a pull or other commit. But it should be safe to do, since you wouldn't decide to skip an object that might then disappear.
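A sketch of what that up-front enumeration could look like (hypothetical Python, assuming ostree's loose-object layout of objects/<2 hex chars>/<remaining hex>.<type>; the real fix would live in the C pull code):

```python
import os


def loose_object_set(objects_dir):
    """Enumerate every loose object in one readdir sweep.

    Returns a set of (checksum, objtype) pairs, so the per-object
    "do we already have this?" question becomes an in-memory lookup
    instead of a stat() syscall.
    """
    have = set()
    try:
        prefixes = os.scandir(objects_dir)
    except FileNotFoundError:
        return have
    with prefixes:
        for prefix in prefixes:
            # Loose objects live under two-hex-char prefix directories.
            if not prefix.is_dir() or len(prefix.name) != 2:
                continue
            with os.scandir(prefix.path) as objs:
                for obj in objs:
                    rest, dot, objtype = obj.name.partition(".")
                    if dot:
                        have.add((prefix.name + rest, objtype))
    return have


# Membership tests are then pure in-memory lookups, e.g.:
#   ("<64-hex checksum>", "commit") in have
```

Two getdents sweeps per prefix directory replace the tens of thousands of newfstatat calls in the summaries above, at the cost of the staleness window already discussed.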
The other thing in my strace summary was a lot of time blocking in futex. That tells me there's a lot of lock contention, and when I was browsing through the code it really seemed like txn_lock is taken a lot to protect the transaction stats. These are all integers that could easily be handled locklessly with the g_atomic_* family, I think. Like this kind of change:
diff --git a/src/libostree/ostree-repo-commit.c b/src/libostree/ostree-repo-commit.c
index 9d5c6d3b..4c521e49 100644
--- a/src/libostree/ostree-repo-commit.c
+++ b/src/libostree/ostree-repo-commit.c
@@ -1453,10 +1451,8 @@ write_metadata_object (OstreeRepo *self,
}
/* Update the stats, note we both wrote one and add to total */
- g_mutex_lock (&self->txn_lock);
- self->txn.stats.metadata_objects_written++;
- self->txn.stats.metadata_objects_total++;
- g_mutex_unlock (&self->txn_lock);
+ g_atomic_int_inc (&self->txn.stats.metadata_objects_written);
+ g_atomic_int_inc (&self->txn.stats.metadata_objects_total);
if (out_csum)
*out_csum = ostree_checksum_to_bytes (actual_checksum);
Are atomic integers expensive? Maybe, but I'd wager that it would be worth the reduced contention on a mutex guarding like 10 different fields.
On the Endless phabricator instance, I broke up the task of improving the performance of eos-updater-prepare-volume (which uses flatpak create-usb and ostree create-usb) into these subtasks:
- fstatat() calls in ostree pull code to improve create-usb performance
- sync mount option for drives used with "Copy to USB" feature
- faccessat() in ostree create-usb

(Sorry, the ticket numbers are useless to those outside Endless.)
I took a look at using atomic ints for the transaction stats and it was in the noise. Still might be useful since it's silly to guard so many integer fields with a single mutex.
One thing I did confirm is that this isn't a create-usb issue but really a pull-local issue. I ran create-usb and a similar pull-local and the profiles were almost identical. Which makes sense - all of the work is pulling the objects from one repo to the other. About the only other thing that happens is setting up the target repo. That would just make any optimization work simpler.
> I took a look at using atomic ints for the transaction stats and it was in the noise.
Hmm, yes that could help, though by far the best thing here would be to gather the stats per-thread and then merge them at the end (or periodically). This is how things like the kernel per-interface packet counter stats work. It's a map-reduce model instead of a database model.
Unfortunately implementing that will be tricky because we've basically committed to supporting arbitrary threads created externally in the repo API. We would need to have something like ostree_repo_import_object_with_stats() that passed back that data to the caller, and the pull API to use it... messy.
Eventually io-uring is going to propagate far enough that we can hard-depend on it. For cases like this it will likely be a huge win, since things like our fstatat() calls can become asynchronous. For now we could probably threadpool those.
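The per-thread map-reduce stats model described above can be sketched like this (illustrative Python, not a proposal for the actual C API): each thread writes to a private counter with no lock on the hot path, and the totals are summed once at the end.

```python
import threading
from collections import Counter


class PerThreadStats:
    """Map-reduce stats: each thread increments its own Counter (no
    lock on the hot path); merge() sums them after the workers finish."""

    def __init__(self):
        self._local = threading.local()
        self._counters = []
        self._register_lock = threading.Lock()  # taken once per thread

    def incr(self, key, n=1):
        counter = getattr(self._local, "counter", None)
        if counter is None:
            # First increment from this thread: register its counter.
            counter = self._local.counter = Counter()
            with self._register_lock:
                self._counters.append(counter)
        counter[key] += n

    def merge(self):
        """Reduce step. Only exact once the writing threads have
        finished; before that it's an approximate snapshot."""
        total = Counter()
        for counter in self._counters:
            total.update(counter)
        return total
```

With four worker threads each bumping "metadata_objects_written" a thousand times, merge() reports 4000, and the registration lock is taken only once per thread rather than once per object.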
Following on from the mailing list discussion, I'm opening an issue here so I can upload a few artifacts.
The discussion is about how the performance of ostree create-usb could be improved, as it is fairly slow. I tried running it under perf using this command:
rm -rf /media/mwleeds/updates/.ostree && perf record -F 99 --call-graph dwarf ostree --repo=/var/lib/flatpak/repo create-usb /media/mwleeds/updates/ org.flathub.Stable runtime/org.videolan.VLC.Plugin.makemkv/x86_64/3-19.08
Here's perf report:
It seems much of the time is spent in libz.so doing compression. In this case the source repository has mode bare-user-only and the destination one has mode archive-z2. It's interesting that in the PR which made create-usb use archive mode in the destination repo (#1733) I said it was a performance improvement (in that case I think the source repo would have been mode bare, since that's what it is on Endless OS).
I also tried running create-usb under strace, both with and without --disable-fsync:
strace.log
strace-disable-fsync.log
That confirmed that --disable-fsync avoids about 115 fsync() calls, as expected (but delaying fsync() to the end of the copy doesn't seem to make a notable difference for performance). Oddly, the log also shows accesses for several refs other than the one specified, such as for org.gimp.GIMP.Manual. So that may merit some investigation.