oduwsdl / ipwb

InterPlanetary Wayback: A distributed and persistent archive replay system using IPFS
MIT License
615 stars, 39 forks

Consider automatic parameterized archive replication #60

Open machawk1 opened 7 years ago

machawk1 commented 7 years ago

Thinking "aloud" here.

ipfs add, invoked via ipwb through the ipfsapi module, does not replicate content onto other machines. It simply hashes the content and adds it locally, so that it can be replicated if the hash is later fetched from another machine.

Could we put forth a "push" model that would let users easily replicate the contents of their local ipfs instance via ipwb? A command like ipwb replicate myIndex.cdxj would push the index into IPFS and give other users an IPFS hash for a bulk fetch. For example, ipfs replicate <hash> would fetch the CDXJ, begin pulling the WARC parts, and merge the newly acquired holdings into the existing CDXJ on the second user's machine.
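To make the "aggregate" step concrete, here is a minimal sketch of merging a fetched CDXJ into a local one. It is not ipwb code: merge_cdxj is a hypothetical helper, and it assumes each CDXJ record is a single text line, so that deduplicating and sorting the lines is enough to combine two indexes. Fetching the remote CDXJ from IPFS and pulling the WARC parts are left out.

```python
def merge_cdxj(local_lines, remote_lines):
    """Combine two CDXJ indexes given as iterables of text lines.

    Hypothetical helper for illustration only. Assumes one record per
    line, so exact duplicate lines can be dropped and the result kept
    in plain lexicographic order.
    """
    merged = set()
    for line in list(local_lines) + list(remote_lines):
        line = line.strip()
        if line:  # ignore blank lines
            merged.add(line)
    return sorted(merged)
```

Under these assumptions, the second user's replicate step would write the returned lines back out as the new local index after the referenced content has been fetched.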

Parameterization might allow the second user's replicate command to specify which WARCs to pull locally, based on size, encryption, restriction to certain domains/TLDs <cough @anjackson>, etc.
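As a rough illustration of the domain/TLD parameter, the sketch below filters CDXJ lines before anything is pulled. filter_by_tld is a hypothetical name, not an existing ipwb function, and the code assumes each record starts with a SURT-style key such as uk,ac,example)/path, where the TLD is the first comma-separated component. Lines starting with "!" are treated as metadata and kept unchanged, which is also an assumption.

```python
def filter_by_tld(cdxj_lines, allowed_tlds):
    """Keep only CDXJ records whose SURT key begins with an allowed TLD.

    Hypothetical helper for illustration only. Assumes keys of the form
    "tld,domain,...)/path" followed by a space and the rest of the record.
    """
    allowed = {tld.lower().lstrip(".") for tld in allowed_tlds}
    kept = []
    for line in cdxj_lines:
        if line.startswith("!"):
            # Assumed metadata line: pass it through untouched.
            kept.append(line)
            continue
        surt_key = line.split(" ", 1)[0]   # e.g. "uk,ac,example)/path"
        host = surt_key.split(")", 1)[0]   # e.g. "uk,ac,example"
        tld = host.split(",", 1)[0]        # e.g. "uk"
        if tld.lower() in allowed:
            kept.append(line)
    return kept
```

Filters on size or encryption would need that information to be present in each record's JSON block, which this sketch does not assume.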

Thoughts, @ibnesayeed?

machawk1 commented 7 years ago

Related #61

ibnesayeed commented 7 years ago

Here are some relevant references.

machawk1 commented 7 years ago

Also https://github.com/ipfs/ipfs-cluster