RubenKelevra / pacman.store

Pacman Mirror via IPFS for ArchLinux, Endeavouros, Manjaro plus custom repos ALHP and Chaotic-AUR.
GNU General Public License v3.0

[meta] high IO usage #42

Closed RubenKelevra closed 3 years ago

RubenKelevra commented 3 years ago

The current approach, recursively pinning the full folder structure on each update, requires IPFS to do quite a lot of IO, as @Luflosi mentioned here: https://github.com/RubenKelevra/pacman.store/issues/39

There's still the other option, which I implemented in version one of this project: pin each file on its own, and each folder non-recursively.

The advantage is a low amount of IO and much faster processing of updates on the cluster nodes, as this is easier for IPFS to process. But it requires a lot of disk space for the cluster database, since each changed file and each changed folder is an individual transaction. After just a few months the database had grown to 20 GB with 400'000 transactions.

Once https://github.com/ipfs/ipfs-cluster/issues/1008 and https://github.com/ipfs/ipfs-cluster/issues/1018 are implemented we could explore this possibility to reduce the IO load again, with one transaction per update, but individually pinned files and folders.

guysv commented 3 years ago

Not solving the underlying issue, but what about updating the pins only every half a day or so? That plus bumping on security updates seems reasonable to me. I'm also pretty sure that's how often official mirrors update anyway.

RubenKelevra commented 3 years ago

@guysv there's no way to tell what a security update is and what isn't.

Apart from this, that is highly discouraged, as you can tell from the wiki page:

Check the status of the Arch mirrors by visiting the Mirror Status page. It is recommended to only use mirrors that are up to date, i.e. not out of sync.

I only run an update when there's an update available - so there's no unnecessary IO.

I think I found the reason why the IO load is pretty high: it does up to 4 tasks concurrently right now.

Lowering the concurrency doesn't reduce the IO usage, but it does reduce the load, since ipfs does less concurrently.

Luflosi commented 3 years ago

Wouldn't it be possible to optimise IPFS itself so that, for every node in the Merkle DAG, it caches whether or not all nodes below that node are available locally? I would imagine that this would make a recursive pin that differs only slightly from another recursive pin about as fast as a non-recursive pin. This could massively reduce the amount of IO required to recursively pin something if most of the stuff is already available locally.
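The caching idea above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the DAG is a hypothetical in-memory dictionary, every block is assumed to be available locally already, and `pin_recursive` and `complete` are made-up names, not part of IPFS.

```python
from typing import Dict, List, Set

# Hypothetical in-memory Merkle DAG: CID -> list of child CIDs.
Dag = Dict[str, List[str]]

DAG: Dag = {
    "root-v1": ["core-v1", "extra-v1"],
    "root-v2": ["core-v1", "extra-v2"],  # only one package changed
    "core-v1": ["a", "b"],
    "extra-v1": ["c", "d"],
    "extra-v2": ["c", "d2"],
    "a": [], "b": [], "c": [], "d": [], "d2": [],
}


def pin_recursive(dag: Dag, root: str, complete: Set[str]) -> int:
    """Walk everything below `root`, skipping subtrees cached as complete.

    Returns the number of nodes that had to be read. All blocks are
    assumed to be local, so every visited node ends up complete.
    """
    visited: Set[str] = set()
    stack = [root]
    while stack:
        cid = stack.pop()
        if cid in complete or cid in visited:
            continue  # subtree known to be fully local: no IO needed
        visited.add(cid)
        stack.extend(dag[cid])
    complete |= visited  # remember the result for the next pin
    return len(visited)
```

With an empty cache, pinning `root-v1` reads all 7 nodes. Pinning `root-v2` afterwards with the same cache reads only the 3 nodes on the changed path, while pinning it with a fresh cache reads 7 again.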

FireMasterK commented 3 years ago

Not sure if this helps, but we can see if Arch has anything like this: https://wiki.ubuntu.com/Mirrors/PushMirroring

RubenKelevra commented 3 years ago

@Luflosi well... yeah, there might be potential to optimize IPFS itself - which would be great.

But - there are already several ways to improve the situation which are implemented:

RubenKelevra commented 3 years ago

@FireMasterK wrote:

Not sure if this helps, but we can see if Arch has anything like this: https://wiki.ubuntu.com/Mirrors/PushMirroring

Well, no, technically Arch does not. But we use very frequent polling to achieve the same thing: it takes single-digit minutes from a package maintainer pushing an update until it has reached the IPFS pacman mirror.

But the delay between updates which are available and having them added to the cluster isn't the issue here.

It's more an internal cluster issue, where each cluster member has to compare the changes to the old version, resulting in very large amounts of IO reading the old data from disk.

teknomunk commented 3 years ago

Another option besides pinning the root directory and pinning each individual file recursively and the directories non-recursively is to add all the packages in an update to a directory that is not in the normal folder structure but contains the same files and add that to the cluster instead of the individual files. You end up with the same result as the individual file pins, but with fewer cluster entries in every instance except for single updated packages.

RubenKelevra commented 3 years ago

@Luflosi I pushed the concurrency fix to ipfs half an hour ago. If you restart your follower you should see a drop in IO load.

Please report back. :)

RubenKelevra commented 3 years ago

@teknomunk wrote:

Another option besides pinning the root directory and pinning each individual file recursively and the directories non-recursively is to add all the packages in an update to a directory that is not in the normal folder structure but contains the same files and add that to the cluster instead of the individual files. You end up with the same result as the individual file pins, but with fewer cluster entries in every instance except for single updated packages.

That's not an option.

You'll end up with a lot of folders with unrelated updates which got pushed together. Say packages a to g got updated; I'd add a, b, c, d, e, f, g to a folder.

Now you get an update of d, f, x and y, so I would have to traverse all folders stored in the cluster to find the folder with d and f, delete those files from the folder, push the new version of the folder and create a new folder with d, f, x and y in it.

Over time you end up with a lot of folders with just a single package in them, for packages that haven't been updated in a while.

Edit:

Just to give you an impression, that would be roughly 5259 folders for /community right now. For just ~8000 packages.

teknomunk commented 3 years ago

That is not really what I was wanting to describe. Let me try again.

The folder structure under /ipns/x86-64.archlinux.pkg.pacman.store/ is not changed at all from its current state. What I suggest is having a completely separate directory structure that contains the same package files with a different structure optimized for making the cluster members pin just the new packages without having to check all the other packages and directories in the repo.

As an example, consider that update with only the packages abiword and go-ipfs. You would create a directory like this:

/2021-01-22-001/
/2021-01-22-001/abiword-3.0.4-4-x86_64.pkg.tar.zst
/2021-01-22-001/go-ipfs-0.7.0-1-x86_64.pkg.tar.zst

in addition to updating /extra/ and /community/, then add the hash of the folder /2021-01-22-001/ to the cluster. This folder would exist only in the cluster, just for the purpose of having the cluster members pin those two new packages.

If you then got another set of package updates, you would create another folder for only those packages:

/2021-01-22-002/
/2021-01-22-002/dbus-broker-26-1-x86_64.pkg.tar.zst
/2021-01-22-002/fftw-3.3.9-1-x86_64.pkg.tar.zst
/2021-01-22-002/xorg-docs-1.7.1-3-any.pkg.tar.zst
/2021-01-22-002/yasm-1.3.0-4-x86_64.pkg.tar.zst

After all the packages in a directory have been removed from upstream, after some fixed expiration time, or some other condition, the update directory (e.g. /2021-01-22-001/) is unpinned from the cluster.

Looking at rsync2ipfs-cluster/bin/rsync2cluster.sh, to implement this idea, I think you will only need to modify ipfs_mfs_add_file() to take a third parameter (the update folder path in MFS) along with adding the file's CID to the update folder, and add the update folder to the cluster pin set.
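The bookkeeping behind this proposal could look roughly like the following. It is a minimal sketch and not taken from rsync2cluster.sh: the function names are hypothetical, nothing here talks to IPFS or the cluster, and the expiry rule implements only the first of the conditions listed above (all packages removed from upstream).

```python
from datetime import date
from typing import Dict, List, Set


def update_dir_name(day: date, sequence: int) -> str:
    """Name of the update folder, e.g. /2021-01-22-001/ (hypothetical scheme)."""
    return f"/{day.isoformat()}-{sequence:03d}/"


def plan_update(day: date, sequence: int, packages: List[str]) -> List[str]:
    """Paths that would be created for one batch of new package files."""
    folder = update_dir_name(day, sequence)
    return [folder] + [folder + pkg for pkg in sorted(packages)]


def expired_dirs(update_dirs: Dict[str, List[str]], upstream: Set[str]) -> List[str]:
    """Update folders whose packages have all disappeared from upstream."""
    return [
        folder
        for folder, packages in update_dirs.items()
        if not any(pkg in upstream for pkg in packages)
    ]
```

For the first example above, `plan_update(date(2021, 1, 22), 1, [...])` with the abiword and go-ipfs files yields the folder followed by its two package paths. Each folder returned by `expired_dirs` would then be a candidate for unpinning.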

FireMasterK commented 3 years ago

I feel we should explore pin-update too.

Luflosi commented 3 years ago

@RubenKelevra I tried using BadgerDB a couple weeks ago but I think that made performance worse, possibly because my ZFS recordsize is set to 128K, which might be too large for BadgerDB. But I couldn't find any information about the optimal block size for BadgerDB online. Setting the recordsize to a lower value also reduces the possible compression ratio. Setting it to 4k when ashift is 12 effectively disables compression. I was also concerned about fragmentation at such a low recordsize. I also didn't feel like experimenting since my repo has enough blocks that many operations such as the conversion take over a day to complete, so I just went back to an older snapshot with FlatFS. I think you're also using ZFS. What recordsize are you using and is BadgerDB working well for you? I found the Datastore.BloomFilterSize option, which sounds like it has the potential to speed up pinning operations but I couldn't find any documentation on what it actually does. Do you know?
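The point about a 4k recordsize with ashift 12 can be checked with a little arithmetic: ZFS allocates space in whole sectors of 2^ashift bytes, so a record that is already only one sector long cannot shrink any further on disk. The sketch below is a deliberately simplified model (it ignores metadata, RAID-Z padding and other special cases), and `allocated_bytes` is a made-up helper name.

```python
import math


def allocated_bytes(record_size: int, compress_ratio: float, ashift: int) -> int:
    """On-disk size of one record, rounded up to whole 2**ashift sectors.

    Simplified model: the compressed payload is record_size / compress_ratio
    and allocation always happens in multiples of the sector size.
    """
    sector = 1 << ashift
    compressed = math.ceil(record_size / compress_ratio)
    return math.ceil(compressed / sector) * sector


# A 128K record that compresses 2:1 really does halve its footprint ...
big = allocated_bytes(128 * 1024, 2.0, ashift=12)
# ... while a 4K record with the same ratio still occupies a full 4K sector.
small = allocated_bytes(4 * 1024, 2.0, ashift=12)
```

Here `big` comes out as 65536 bytes and `small` as 4096 bytes, so at a 4k recordsize the compression saves nothing.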

RubenKelevra commented 3 years ago

@Luflosi wrote

I think you're also using ZFS. What recordsize are you using and is BadgerDB working well for you?

I'm currently using the following settings on the import server:

BadgerDB isn't working for me since I need to hold the data both outside and inside of IPFS, which makes deduplication very reasonable.

As block size I would recommend 8K, as for all databases, with compression and deduplication turned off.

I would also recommend turning off sync in ZFS, which might sound counter-intuitive, but ZFS makes sure that overwrite operations and other atomic changes of data are kept atomic. That's why a database would send a sync, so ZFS can ignore the command and do its operations more efficiently.

I found the Datastore.BloomFilterSize option, which sounds like it has the potential to speed up pinning operations but I couldn't find any documentation on what it actually does. Do you know?

Well, it's an interesting feature, but you need to tune it depending on the number of CIDs you store. It replaces a "let's check if we got that data already" with a "let's have a calculation which gives us the same result with 99.xx% certainty". It's definitely faster, but rarely used. That's why I stay away from it in my daily IPFS usage. I don't know what happens when we cannot deliver 0.xx percent of the blocks because their checksums were too similar, but I guess it wouldn't be pretty.
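For reference, here is how a Bloom filter behaves in general. It can report a key as present when it is not (a false positive), but it never reports an added key as absent. The sketch below is a generic textbook filter, not IPFS code. Treat the `has_block` wrapper, which confirms every positive answer against the real store, as an assumption about how such a filter is typically placed in front of a datastore.

```python
import hashlib
from typing import Iterator, Set


class BloomFilter:
    """Minimal Bloom filter over string keys (illustrative only)."""

    def __init__(self, size_bits: int, num_hashes: int) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def has_block(cid: str, bloom: BloomFilter, store: Set[str]) -> bool:
    """Answer 'do we have this block?' using the filter as a shortcut."""
    if not bloom.might_contain(cid):
        return False  # definitely absent: skip the expensive lookup
    return cid in store  # maybe present: confirm against the real store
```

In this arrangement a false positive costs one extra lookup rather than a wrong answer, and the size of the filter only affects how often the shortcut applies.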

Moved to https://github.com/RubenKelevra/pacman.store/issues/47

RubenKelevra commented 3 years ago

Hey @teknomunk,

thanks ... I feel like this idea needs a dedicated ticket. Can you move your idea to a new ticket so we can discuss it there? This one here is more like a meta-ticket and it gets a bit too cluttered with our discussion here :)

RubenKelevra commented 3 years ago

Just to give an impression of how often the cluster gets updated and how the updates are spaced, here's an example of what the cluster pinset looks like:

```
x86-64.archlinux.pkg.pacman.store@2021-01-22T02:27:51+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T05:42:31+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T07:15:24+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T07:50:30+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T08:18:23+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T08:46:05+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T08:58:01+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T09:01:40+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T09:20:09+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T09:22:16+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T10:38:40+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T10:41:29+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T10:43:59+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T10:49:25+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:06:22+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:08:10+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:10:33+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:13:21+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:15:12+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:23:56+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:25:55+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:44:53+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:46:51+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T11:49:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T12:32:01+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T13:32:18+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T13:35:00+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T13:58:06+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T14:51:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T14:59:54+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:02:33+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:04:12+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:08:18+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:19:07+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:22:25+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:24:31+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:29:32+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:32:47+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:35:10+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:37:58+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:42:30+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:45:50+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:48:37+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:53:18+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:56:04+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T15:58:12+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:01:09+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:03:55+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:07:47+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:11:29+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:14:56+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:17:01+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:19:02+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:22:28+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:24:24+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:28:45+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:31:57+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:35:47+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:38:52+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:42:27+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T16:45:31+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T17:16:02+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T17:24:32+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T17:44:28+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T17:49:39+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T17:52:11+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T17:54:18+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T17:57:06+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:00:51+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:04:27+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:08:10+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:11:41+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:15:08+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:20:39+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:22:39+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:27:04+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:30:05+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:33:46+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:36:04+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:38:41+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:40:35+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:42:55+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:44:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:46:43+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:49:00+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:51:59+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:54:01+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T18:57:07+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:01:55+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:04:50+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:07:45+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:10:01+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:13:45+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:16:24+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:19:40+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:23:36+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:27:29+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:30:28+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:33:01+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:35:35+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:39:17+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:41:19+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:45:04+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:49:09+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:53:04+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T19:57:19+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:00:15+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:02:56+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:05:18+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:07:44+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:10:47+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:14:21+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:22:57+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:27:32+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:29:36+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:31:50+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:34:21+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:36:19+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:38:36+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:41:16+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:43:22+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:45:28+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:47:55+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:51:32+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:53:32+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:55:35+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:57:45+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T20:59:40+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:01:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:06:14+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:09:58+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:13:57+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:16:24+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:18:35+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:20:30+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:23:27+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:28:08+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:30:26+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:34:07+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:37:58+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:41:58+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:44:51+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:47:22+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:50:16+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:54:16+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:57:15+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T21:59:21+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:01:41+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:05:48+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:07:58+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:13:02+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:15:49+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:19:01+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:21:39+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:25:13+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:27:23+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:30:08+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:33:14+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:37:41+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:40:33+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:43:39+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:46:51+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:50:39+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:53:07+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T22:56:32+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:00:18+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:03:57+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:07:41+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:10:46+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:13:47+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:16:09+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:19:14+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:21:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:24:35+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:27:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:30:11+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:33:15+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:35:37+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:40:16+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:42:49+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:46:04+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:49:21+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:52:00+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-22T23:58:07+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:00:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:03:31+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:06:48+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:09:05+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:12:50+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:15:58+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:20:24+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:23:48+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:29:42+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:32:16+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:37:57+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:40:31+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:43:30+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:47:49+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:50:06+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:53:58+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T00:57:20+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:00:21+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:05:37+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:10:09+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:12:47+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:16:34+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:21:29+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:25:35+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:30:17+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:33:49+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:36:51+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:39:53+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:43:14+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:46:45+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:49:21+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:51:29+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:54:47+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T01:57:06+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T02:09:07+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T02:11:29+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T02:16:57+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T02:19:11+00:00
x86-64.archlinux.pkg.pacman.store@2021-01-23T02:22:50+00:00
```

Luflosi commented 3 years ago

If you restart your follower you should see a drop in IO load.

Please report back. :)

I restarted my follower a couple of days ago and now it has finally caught up (pinned everything), but I'm not sure if it's better or worse than before, since I changed quite a lot of things on my end. Since I last ran the follower, I switched Linux distros from Arch Linux to NixOS, put my root FS on the same ZFS pool as everything else, added a SATA SSD as an L2ARC and added the Bloom filter option in the IPFS config. I think I'm currently limited by the 3 Gbit/s SATA interface my SSD is connected to. When I next reboot my server, I'll attach it somewhere else that might be a 6 Gbit/s interface. I'm running a couple of Arch Linux LXD containers, so the cluster still has relevance for me besides just donating bandwidth. Maybe I'll try BadgerDB again at some point, but with 4k recordsize and no compression.

RubenKelevra commented 3 years ago

@RubenKelevra wrote

@Luflosi well... yeah, there might be potential to optimize IPFS itself - which would be great.

But - there are already several ways to improve the situation which are implemented:

  • You can use BadgerDB as the block storage, which gives IPFS much more performance than FlatFS - but there's a catch: The cleanup currently doesn't work very well. See ipfs/go-ds-badger#54

This has been fixed with IPFS 0.8 and IPFS-Cluster 0.13.1.

  • The other option to increase speed of an update is to use GraphSync which should increase the speed of a delta transmission between two states. There are network security concerns - that's why I don't use it yet.

Still not fixed upstream.

  • We could use the pin-update command rather than pin-add for the cluster. I'm not sure why I decided against it some months ago when I wrote version 2, but I think there was a limitation in ipfs-cluster v0.12 which had me use regular pin-add. Not sure how this works out IO-wise, though.

Today I started running the cluster with pin-update. Please report back how the IO is now.
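Why an update-style pin can need less IO than a fresh recursive pin can be illustrated by comparing the old and new directory trees link by link and descending only where the CIDs differ. The following is a conceptual sketch with hypothetical names and a toy in-memory DAG. It is not the go-ipfs or ipfs-cluster implementation.

```python
from typing import Dict, List, Optional

# Hypothetical DAG with named links: CID -> {link name: child CID}.
NamedDag = Dict[str, Dict[str, str]]

REPO: NamedDag = {
    "root-old": {"community": "community-1", "extra": "extra-1"},
    "root-new": {"community": "community-1", "extra": "extra-2"},
    "community-1": {"a.pkg": "a", "b.pkg": "b"},
    "extra-1": {"c.pkg": "c", "d.pkg": "d"},
    "extra-2": {"c.pkg": "c", "d.pkg": "d-new"},
    "a": {}, "b": {}, "c": {}, "d": {}, "d-new": {},
}


def changed_nodes(dag: NamedDag, old: Optional[str], new: str) -> List[str]:
    """Nodes under `new` that must be visited, given `old` is already pinned.

    Links with the same name and the same CID are skipped entirely;
    with old=None this degenerates into a full recursive walk.
    """
    if old == new:
        return []
    changed = [new]
    old_links = dag[old] if old is not None else {}
    for name, child in dag[new].items():
        changed += changed_nodes(dag, old_links.get(name), child)
    return changed
```

Here `changed_nodes(REPO, "root-old", "root-new")` touches only `root-new`, `extra-2` and `d-new`, whereas passing `None` as the old root walks all seven nodes of the new tree.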

RubenKelevra commented 3 years ago

Since no further feedback has come in, I guess this is resolved.