ipfs / notes

IPFS Collaborative Notebook for Research
MIT License

Feedback from package managers from the Reproducible Builds Summit #366

Open momack2 opened 5 years ago

momack2 commented 5 years ago

Relevant for https://github.com/ipfs/project/issues/15

@hsanjuan and @warpfork attended the Reproducible Builds Summit and fielded lots of interest in adding package binaries to IPFS directly for build pipelines. There was a lot of excitement about what IPFS can unlock once a few more rough edges are ironed out - some of the needs and feedback are below. We could also improve our explanations of Merkle DAGs, the DHT, the HTTP Gateway, and architecture/network diagrams to make IPFS friendlier to new users. This writeup is courtesy of @hsanjuan's event summary (so all "I"s are actually Hector).

Feedback and requests

IPNS publishing

A package manager will need to publish root CIDs for the current version of the mirror (frequently updated). IPNS-over-pubsub is an experimental feature (also, some experimental features are enabled in the config, others on the command line, which is confusing). It's an awesome workaround that has made things work for a while now, but it's still experimental. We should try to declare these things stable and enable them by default on a better cadence. QUIC and other efforts will play a hand in improving this, and the short-term outlook for getting IPNS working properly is good; we just need to keep in mind that it's essential. It might be useful to follow up with any package managers that have tried out IPFS so far once this is more stable, so they can take another look.
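For reference, a sketch of the relevant commands (flag and config key names as they were in go-ipfs 0.4.x; experimental knobs move around, so check `ipfs daemon --help` - `<root-cid>` is a placeholder):

```shell
# IPNS-over-pubsub is enabled on the command line:
ipfs daemon --enable-namesys-pubsub

# ...while other experimental features (e.g. QUIC) live in the config:
ipfs config --json Experimental.QUIC true

# Publish the mirror's current root CID under the node's IPNS key:
ipfs name publish /ipfs/<root-cid>
```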

Improve docs about adding stuff to ipfs via API

It is hard to work with the IPFS HTTP API to add and patch things. It is not documented well enough (particularly how to create payloads for /add). Errors (for example, due to malformed multipart bodies) come back as a 500 without any useful error message. I was told that submitting the same binary payload resulted in different hashes (Connection: close missing from the request, perhaps? Only some devs know about this requirement - we need to follow up to see whether adding it fixed things or it was something else).

Importers Improvements

We have a tar importer which is still part of the go-ipfs code and hardcodes rabin chunking. It's a nice proof of concept, but it does not offer progress reporting while adding, etc. Other than that, we don't support intelligently adding any other formats: jars, wheels, and other zip-based things. There were folks at the summit interested in working on that! Being able to import multiple package formats unlocks major deduplication possibilities when storing packages on IPFS, which is a very attractive feature for package managers (some package manager repositories are >150TB, though that may go down a lot if deduplicated properly). We should enable contributors to implement importers (perhaps as plugins?) and load them. At the least, we should probably extract "go-unixfs/importers" from go-unixfs (I think it's there because of deps) and the "go-ipfs/tar" thing (the "tar" importer should get first-class support, probably integrated into /add rather than as a separate endpoint).

Current projects

Request for better explanation around block-size and chunking

Stebalien commented 5 years ago

Great feedback!

I was told that submitting the same binary payload resulted in different hashes (Connection: close missing from the request, perhaps? Only some devs know about this requirement - we need to follow up to see whether adding it fixed things or it was something else).

This has been fixed in master. The underlying issue was https://github.com/golang/go/issues/15527.

hsanjuan commented 5 years ago

This has been fixed in master.

By setting Connection: close in the response and hoping the client does the right thing, right?

I think problems went away when they added it to the request. Chances are some clients don't handle this automatically.

Stebalien commented 5 years ago

There were two fixes: setting Connection: close in either the request or the response. However, neither of these relies on the client doing the right thing. Instead, once Connection: close is set on the server, Go will do the right thing and won't prematurely throw away the rest of the body.

The downside is that each call to add will require a new connection.