NixIPFS / notes

ideas and planning on how to use IPFS together with Nix/OS

More performant alternative to achieve distributed binary caches #2

Open rht opened 7 years ago

rht commented 7 years ago

To achieve the same end of distributing the NixOS binary caches, I'm suggesting, for now, using a good ol' battle-tested infrastructure that we all already know instead: torrent + Mainline DHT, or a secured server that lists the torrent files. This makes distribution a self-contained task to tackle, decoupled from the task of deduplicating the content of NAR files to reduce transport bandwidth.

I don't think I have to justify this choice by listing the content/datasets already being distributed with torrent, but for completeness' sake:

300 TB of data still in a http/ftp server:

Until the perf shortcomings of ipfs are fixed (tracked independently in https://github.com/rht/sfpi-benchmark; it doesn't matter that the benchmark is not endorsed by a specific party), and given that nobody knows how or when ipld, filestore, and cluster are going to be integrated, I think it'd be more prudent to use torrent to engineer production-grade (vs proof-of-concept) CDN infrastructure, where success can be quantified as a reduction in operational cost or an increase in scalability on the nixops side. Meanwhile, several emerging distributed p2p content distribution / computational systems can still be tested for research.

I hope I am not (unintentionally) criticising a specific closed-group-for-profit party once more. I have checked the rules of our host, i.e. github, in http://todogroup.org/opencodeofconduct/ and found them sufficiently liberal and understanding (whether the web should be decentralized is beside the point), in particular the line

Our open source community prioritizes marginalized people’s safety over privileged people’s comfort

(marginalized people might refer to scholars/researchers without a strong foothold)

mguentner commented 7 years ago

@rht, thanks for starting this thread. I fully agree with you, no worries :smiley: IPFS has a lot of shortcomings and my goal was to test it against real data to see whether "it's already there". It works, but I ran into a lot of issues ranging from slow transfer speeds and unsolved routing issues to weird rules on how you have to add content. You documented most of it in the sfpi-benchmark repo, thanks for that :+1: . I just hope that the project gets everything sorted out before there is too much technical debt...

I actually wanted to sum it up within the next few days to be transparent about what IPFS can currently deliver, and also to suggest torrent as an alternative for the NAR distribution while monitoring the IPFS development.

The scripts that build the current IPFS repository are very modular, have a look: https://github.com/NixIPFS/nixipfs-scripts/blob/master/nixipfs/release_nixos#L66 Instead of this line you can insert a method (-> libtorrent) that creates a torrent. From there, the torrent needs to be added to a torrent application, and the other nodes need to be notified (same goes for torrent: I think we need a few permanent seeders that form the core infrastructure). :arrow_right: see https://github.com/NixIPFS/infrastructure/issues/2
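To make the "insert a method that creates a torrent" step a bit more concrete, here is a rough sketch of what such a method could look like. Note that it does not use libtorrent: it builds a single-file v1 `.torrent` with the Python standard library only, so the moving parts (bencoding, SHA-1 piece hashes, info hash) are visible. The names `bencode` and `make_torrent`, the default piece length, and the optional tracker URL are all assumptions made for illustration and are not part of nixipfs-scripts; a real integration would more likely call libtorrent.

```python
import hashlib
import os


def bencode(value):
    """Encode ints, bytes/str, lists and dicts in BitTorrent's bencode format."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def make_torrent(path, piece_length=256 * 1024, announce=None):
    """Hypothetical helper: return (torrent_bytes, info_hash_hex) for one file.

    Only single-file torrents are handled. With announce=None the torrent is
    trackerless, i.e. peers would have to be found via the DHT.
    """
    pieces = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(piece_length)
            if not chunk:
                break
            # Each piece is identified by its 20-byte SHA-1 digest.
            pieces += hashlib.sha1(chunk).digest()

    info = {
        "name": os.path.basename(path),
        "length": os.path.getsize(path),
        "piece length": piece_length,
        "pieces": pieces,
    }
    meta = {"info": info}
    if announce is not None:
        meta["announce"] = announce

    # The info hash (SHA-1 of the bencoded info dict) identifies the torrent.
    info_hash = hashlib.sha1(bencode(info)).hexdigest()
    return bencode(meta), info_hash
```

Calling `make_torrent("some.nar")` would return the bytes to write to a `.torrent` file plus the info hash that could be published in a list of torrents or used in a magnet link. Adding the result to a torrent client and notifying the seeders is left out here on purpose.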

However, IPFS might be one of the few solutions that can achieve the goals pointed out in #1. With torrent, each build input will result in a single torrent (so a lot of torrents!), and it's not easy to use torrent as a per-file / per-directory caching layer like IPFS is promising to do. Also, it might be acceptable to run IPFS on build nodes used for CI, in contrast to your everyday notebook.