ipfs / notes

IPFS Collaborative Notebook for Research
MIT License

IPFS HTTP seeding #327

Open wscott opened 10 years ago

wscott commented 10 years ago

It would be very nice to provide a braindead-simple way to publish files to IPFS without having to run an IPFS node on your own machine.

It seems like you could support publishing objects to a normal HTTP file server, and then the IPFS client could fetch them from there if the DHT lookup for that object fails.

Something like this:

$ cp -r ~/mystuff ~/Dropbox/Public/mystuff
$ ipfs-http-seed-setup ~/Dropbox/Public/mystuff
ZyPo5pBn5nCg3bzZ16lJyAZKc1Y

The script creates a DAG for the data, writing the objects to separate files in a .ipfs subdirectory. The blob (or block? the paper is inconsistent) objects could be hardlinked to the actual files. The script then returns the hash of the top-level commit object in the tree. It could also work incrementally, updating .ipfs when the files change.
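A minimal sketch of what such a script might do, in Python. This is not how IPFS actually builds its Merkle DAG (real IPFS uses multihashes and protobuf-encoded objects); `seed_dir` is a hypothetical stand-in using plain SHA-256, flat files only, and the hardlink trick described above:

```python
import hashlib
import os

def seed_dir(src_dir: str) -> str:
    """Hypothetical ipfs-http-seed-setup: store each file's content in a
    .ipfs/ subdirectory keyed by its hash (hardlinked to the original),
    then store a directory listing the same way and return its hash as
    the top-level object."""
    obj_dir = os.path.join(src_dir, ".ipfs")
    os.makedirs(obj_dir, exist_ok=True)
    entries = []
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        if name == ".ipfs" or not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            data = f.read()
        digest = hashlib.sha256(data).hexdigest()
        obj_path = os.path.join(obj_dir, digest)
        if not os.path.exists(obj_path):
            os.link(path, obj_path)  # hardlink the block to the actual file
        entries.append(f"{digest} {name}")
    # The "directory object" is just the listing, also stored by its hash.
    listing = "\n".join(entries).encode()
    root = hashlib.sha256(listing).hexdigest()
    with open(os.path.join(obj_dir, root), "wb") as f:
        f.write(listing)
    return root
```

Incremental updates fall out of the `os.path.exists` check: unchanged files hash to blocks that already exist, so only new or modified content produces new objects.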

Next the user publishes this information in DNS

ipfs.example.com TXT "ipfs-http=https://dl.dropboxusercontent.com/u/4893/mystuff ZyPo5pBn5nCg3bzZ16lJyAZKc1Y"

So then:

/ipns/ipfs.example.com will link to /ipfs/ZyPo5pBn5nCg3bzZ16lJyAZKc1Y

but the client has a backdoor resolution for missing objects, where it can try fetching them over HTTP. It sort of becomes an automatic CDN. A nice extension would be to separate the publishing of the metadata from the actual content. Then you could export someone else's data into IPFS.

jbenet commented 10 years ago

Yeah totally, this would be very useful. It would be even sweeter to have a FUSE-mounted fs that did it automatically and periodically (publishing on every write would be bad performance-wise -- though we can try it for local stuff, auto-squashing commits until some periodic interval to publish to the world).

Btw, DNS doesn't need to be updated on every hit when using ipns:

Initial setup:

(i should have a markdown version of the paper here, so I can link to it-- silly pdf...)

Then on every write to ~/ipfsbox/<path>:

Then, on every re-publish (periodic timeout, manually issued, etc):
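One way to see why DNS stays static: the TXT record can point at a stable name key, and only the record behind that key changes on re-publish. A hypothetical zone entry (reusing the example hash from above as a stand-in for a stable name key -- the real record format is whatever the client ends up speaking):

```
; one-time zone entry: the name key is stable, so re-publishing
; new content under it never requires touching DNS
ipfs.example.com.  300  IN  TXT  "ipfs=/ipns/ZyPo5pBn5nCg3bzZ16lJyAZKc1Y"
```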


Yep, CDN is one of the main use cases for ipfs. And yes, for HTTP fallback, i plan to be running a service at http://ipfs.io that resolves all paths + serves objects at http://ipfs.io/<path>. We can set this up as an anycast system, letting people host their own servers to help with the load. (anyone can always verify the hashes, so np)

verokarhu commented 10 years ago

> Yep, CDN is one of the main use cases for ipfs. And yes, for HTTP fallback, i plan to be running a service at http://ipfs.io that resolves all paths + serves objects at http://ipfs.io/. We can set this up as an anycast system, letting people host their own servers to help with the load. (anyone can always verify the hashes, so np)

Correct me if I'm wrong, but for a client using http://ipfs.io/, they would first have to receive the entire block before being able to verify the hash?

That sounds problematic, since if hostile ipfs nodes send the client garbage then it has to start the process all over again with some other node. Perhaps ipfs.io should anycast to a known list of good nodes instead?

jbenet commented 10 years ago

Yeah, for these gateways we'd use known good nodes. The anycasted nodes will be specified in DNS, so it won't be a very fluid process. If those gateways are malicious, they can be detected and removed quickly.

Also worth noting that this problem depends on block size. (Large files are split into chunks, as in LBFS, etc.)
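The block-size point can be made concrete: with chunked files, each chunk carries its own hash, so a hostile source is detected after one chunk rather than after the whole file. A sketch under the assumption that the client already holds the expected per-chunk hash list (`verify_chunks` and its arguments are illustrative, not IPFS APIs):

```python
import hashlib

def verify_chunks(stream, chunk_hashes) -> bytes:
    """Verify a file chunk-by-chunk as it arrives. stream yields chunks;
    chunk_hashes is the expected per-chunk hash list. A corrupt chunk
    aborts immediately, so only that chunk's bandwidth is wasted and the
    client can retry it from another node."""
    verified = []
    for i, chunk in enumerate(stream):
        if hashlib.sha256(chunk).hexdigest() != chunk_hashes[i]:
            raise ValueError(f"chunk {i} failed verification; retry elsewhere")
        verified.append(chunk)
    return b"".join(verified)
```

Smaller blocks mean earlier detection and cheaper retries, at the cost of more hashes to ship; that trade-off is what makes the chunk size a tuning knob.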


daviddias commented 5 years ago

Related recent discussion on the IPLD project - https://github.com/ipld/ipld/issues/57