ipfs / kubo

An IPFS implementation in Go
https://docs.ipfs.tech/how-to/command-line-quick-start/

Support pebble #10347

Open lidel opened 9 months ago

lidel commented 9 months ago

TODO

Description

Summary

Include pebble as a built-in plugin. It provides a meaningful alternative to leveldb as the datastore, and may be better than badger1 as well.

Background

It is 2024, and we still only have flatfs, leveldb and necromancy-level badgerv1 (!) as datastore options.

We got positive feedback about pebble, some examples:

https://ipfscluster.io/documentation/guides/datastore/

Pebble is a high-performance backend from CockroachDB, used by default in Cluster:

  • Proven to work well on very large pinsets.
  • Best disk-usage compared to the rest. No need to trigger GC cycles for space reclaim.
  • Performance and memory usage seem on par with Badger3, and better than Badger on both counts.
  • Behaves correctly with default settings but we bump them up a bit.
  • 0-delay startup times, even with very large amounts of data.
  • Options support compression (we chose to leave it enabled by default).
  • The Pebble project is officially alive and maintained.
  • Pebble only runs on 64-bit architectures.
  • One key difference with Badger3 is that Pebble stores keys and values together and any lookup for a key will also read the values, while Badger3 can store keys and values separately (i.e. keys only in the index, which can be loaded onto memory when small enough).

https://github.com/ipfs/go-ds-pebble/issues/29:

After changing leveldb store to pebble store, the speed of GC has increased by at least dozens of times. So it's not flatfs that's to blame for slow GC, it's leveldb. I also tried leveldb and pebble as blockstore, but the CPU and memory usage is unacceptable.

Right now, to use go-ds-pebble one needs to build an external plugin.

Proposed change

Include https://github.com/ipfs/go-ds-pebble in the standard kubo build, just like we do with legacy badger1.

This will

guojidan commented 8 months ago

Hi, I want to try implementing this feature 😄

lidel commented 5 months ago

Triage note:

lidel commented 4 months ago

Triage notes:

hsanjuan commented 2 months ago

  • we need to figure out which pebble settings are worth exposing via config; prior art in ipfs-cluster will likely be a good starting point

Expose as much as possible because chances are defaults are never right for everyone.

gammazero commented 2 months ago

@hsanjuan Here is the subset of pebble options that I decided to make configurable.

This decision is based on what CockroachDB configures as pebble defaults, and based on what pebble tuning parameters were useful for optimizing IPNI's use of pebble.
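As a sketch of how a small configurable subset could be handled, the following decodes a JSON fragment into a struct and fills unset fields with defaults. The option names (`cacheSize`, `memTableSize`, `formatMajorVersion`) and the default values are placeholders chosen for this example; they are not the actual list referenced above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PebbleConfig is a HYPOTHETICAL subset of tunables. Field names and JSON
// keys are illustrative and not taken from kubo or go-ds-pebble.
type PebbleConfig struct {
	Path               string `json:"path"`
	CacheSize          int64  `json:"cacheSize"`
	MemTableSize       int64  `json:"memTableSize"`
	FormatMajorVersion int    `json:"formatMajorVersion"`
}

// parsePebbleConfig decodes raw JSON and replaces zero-valued fields with
// placeholder defaults. FormatMajorVersion is left at 0 when unset, which
// this sketch treats as "let the library pick".
func parsePebbleConfig(raw []byte) (PebbleConfig, error) {
	var cfg PebbleConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return PebbleConfig{}, fmt.Errorf("decoding pebble config: %w", err)
	}
	if cfg.Path == "" {
		cfg.Path = "pebbleds"
	}
	if cfg.CacheSize == 0 {
		cfg.CacheSize = 8 << 20 // placeholder default: 8 MiB
	}
	if cfg.MemTableSize == 0 {
		cfg.MemTableSize = 4 << 20 // placeholder default: 4 MiB
	}
	return cfg, nil
}

func main() {
	cfg, err := parsePebbleConfig([]byte(`{"cacheSize": 67108864}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Treating a zero value as "unset" keeps the config file short, at the cost of not being able to express an explicit zero; pointer fields would be one way around that.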

gammazero commented 2 months ago

@lidel I want to stop using ipfs-ds-convert. There should instead be a way to export all data from a kubo instance and import it into another. This would free us from ever having to write a conversion for another datastore.

Maybe it could work something like this:


oldipfs export --repo-dir=~/.ipfs_prev --json --stdout | ipfs import --json

lidel commented 2 months ago

@gammazero replacing ipfs-ds-convert with ipfs repo export|import sounds very sensible, will be useful for people running custom datastores as well.

FYSA filed https://github.com/ipfs/ipfs-ds-convert/issues/50 to track sunsetting ipfs-ds-convert (I'll do it shortly, taking it off our plate)

hsanjuan commented 3 weeks ago

@hsanjuan Here is the subset of pebble options that I decided to make configurable.

This decision is based on what CockroachDB configures as pebble defaults, and based on what pebble tuning parameters were useful for optimizing IPNI's use of pebble.

I had a quick look. Pebble defaults are quite conservative but should be ok for home use. I had issues with FormatMajorVersion errors during pebble upgrades: https://github.com/ipfs-cluster/ipfs-cluster/pull/2019/files . When not explicit, the current default is used. Sometimes ratcheting the database after an upgrade fails if you jumped too many pebble releases in-between.

lidel commented 2 days ago

Added FormatMajorVersion to the TODO at the top of the issue + reopening to track next steps.