Open Gozala opened 1 year ago
I'm looking for something that allows updating or pushing more pieces to a larger CAR file so I don't have to push entire replicas after every update.
What's the motivation for this? Specifically, why do you need all of the pieces to be in the same CAR?
Mutating CARs is not something we're considering. However, we do have a few things that may address your needs differently.
- `store/add` now has an optional `origin` field that you can point to a previous shard of the DAG. So basically a CAR with preceding pieces.
- The `upload/add` operation allows you to publish a DAG root with optional `shards` pointing to all the CARs its blocks are in.

With the above two I suspect you may have all the things you need, but please follow up so we can make sure.
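
To make the two operations above a bit more concrete, here is a rough sketch of the data they carry. The `origin`, `root` and `shards` names come from this discussion; the `link` field, the `CID` alias and both helper functions are assumptions made for illustration, not the actual client API.

```typescript
// Illustrative only: CIDs are plain strings here, not real CID objects.
type CID = string;

// Assumed shape of what `store/add` is told about one CAR shard.
interface StoreAdd {
  link: CID;    // CID of the CAR being stored (field name is an assumption)
  origin?: CID; // optional CID of the preceding shard
}

// Assumed shape of what `upload/add` is told about a finished DAG.
interface UploadAdd {
  root: CID;      // root of the DAG
  shards?: CID[]; // CARs that contain the DAG's blocks
}

// Hypothetical helper: each shard names the previous one as its origin.
function chainShards(shardCids: CID[]): StoreAdd[] {
  return shardCids.map((link, i) =>
    i === 0 ? { link } : { link, origin: shardCids[i - 1] }
  );
}

// Hypothetical helper: list every stored shard under the DAG root.
function describeUpload(root: CID, stored: StoreAdd[]): UploadAdd {
  return { root, shards: stored.map((s) => s.link) };
}
```

Since `origin` and `shards` are both optional, a single-CAR upload would just be one `StoreAdd` without `origin`, followed by an `UploadAdd` that lists that one shard.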
It would be nice to be able to request just the shards as well, but that's probably not going to happen, which isn't a problem for now since I can use ipfs with web3.storage. At that point it's almost a jank version of pinning and pulling IPLD DAGs.
CARs have CIDs and are stored as is in our system. Technically speaking, it should be no problem at all to serve those CARs by their CID from our gateway; that said, I'm not entirely sure if it works today. If it doesn't, please create an issue for that and I'm sure we'll be able to accommodate.
> What's the motivation for this? Specifically, why do you need all of the pieces to be in the same CAR?

You're right, they don't need to be in the same CAR. I just need a CAR library that allows adding incomplete DAGs, which any CAR library should be able to do since completeness is not required by the spec. When using the js-car library this was an issue a few months ago.
That is the main piece needed and then everything could be pulled from ipfs which should work well. :+1:
I didn't know what 'CAR sharding' was, but it sounded like something I might have been able to use.
> You're right, they don't need to be in the same CAR. I just need a CAR library that allows adding incomplete DAGs, which any CAR library should be able to do since completeness is not required by the spec. When using the js-car library this was an issue a few months ago.

Our client uses js-car, and more specifically `CARBufferWriter`, which we added so we could allocate a CAR of a certain size, pack it with blocks and send it to web3.storage.
This seems to work really well with our @ipld/unixfs, which we use to turn files and dirs into DAGs, because it spits out blocks as soon as they are ready. We put them into a preallocated CAR; once it's full, we send it off and continue with another CAR shard, and so on until no more blocks are left.
Each CAR packet links to the previous CAR packet, and once all are uploaded, the client sends `upload/add` with links to all the CARs.
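
The packing loop described above could be sketched roughly like this. It is a simulation under stated assumptions rather than the client's code: it does not use js-car, the `Block` and `Shard` types and `packIntoShards` are hypothetical, shard ids are placeholder strings instead of CAR CIDs, and it counts only block bytes, ignoring CAR header and framing overhead.

```typescript
interface Block {
  cid: string;       // placeholder for a real block CID
  bytes: Uint8Array; // encoded block data
}

interface Shard {
  id: string;      // placeholder for the CAR's CID
  origin?: string; // id of the previous shard, if any
  blocks: Block[];
}

// Fill a shard until the next block would exceed `capacity`, then start a
// new shard that links back to the previous one.
function packIntoShards(blocks: Block[], capacity: number): Shard[] {
  const shards: Shard[] = [];
  let current: Block[] = [];
  let used = 0;

  const flush = (): void => {
    if (current.length === 0) return;
    const shard: Shard = { id: `shard-${shards.length}`, blocks: current };
    if (shards.length > 0) shard.origin = shards[shards.length - 1].id;
    shards.push(shard);
    current = [];
    used = 0;
  };

  for (const block of blocks) {
    // A block larger than `capacity` still gets a shard of its own.
    if (used + block.bytes.length > capacity) flush();
    current.push(block);
    used += block.bytes.length;
  }
  flush(); // send off the last, possibly partial, shard
  return shards;
}
```

After the loop, the ids of the returned shards are what the final `upload/add` would list alongside the DAG root.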
> I didn't know what 'CAR sharding' was, but it sounded like something I might have been able to use.

We just call partial fs DAGs, which are currently encoded as CAR files, shards. We call them shards as opposed to just CARs because we may have different representations in the future, and semantically it's a shard of the DAG.
In terms of how this would fit opal/orbitdb, I have limited context, so not all of it may make sense.
- `store/add` that changeset, with the `origin` field if desired, but again that's not required.
- `upload/add` request with `root` pointing to the DAG root CID and `shards` pointing to all the "changesets" it deems relevant.

`upload/add` isn't required at all; it's just what the user will see in the upload list, and maybe it's irrelevant for your use case. All the CIDs inside uploaded CARs will still remain available.

Please note that opal/orbitdb could incorporate CAR CIDs in its data structure so it could get "changesets" in a single roundtrip, or it could completely ignore that and get each block by its CID regardless of which account / CAR it is in.
Also note that ☝️ means including only local changesets in the CAR, with an assumption that remote changes are stored by their authors. That said, it's certainly possible to include those changes to ensure that you have a copy even if the author deletes them.
Finally, if your protocol / data structure uses CAR CIDs, you could store those CARs in your account without re-uploading them if we have that CAR already, because we just add it to your account and bill you accordingly.
This sounds good :+1:
> Please note that opal/orbitdb could incorporate CAR CIDs in its data structure

I'll have to think about this one some more. I may not incorporate CAR cids in the base data structure but the replicator might try to do something like this.
> Also note that ☝️ means including only local changesets in the CAR, with an assumption that remote changes are stored by their authors. That said, it's certainly possible to include those changes to ensure that you have a copy even if the author deletes them.

Remote and local changes would be included; anything that has been added to the local replica should be available under the IPNS record for that peer>database.
I'll share anything related I make here. Hopefully won't be long...
> Remote and local changes would be included; anything that has been added to the local replica should be available under the IPNS record for that peer>database.

Well, even if you don't include remote changes in the CAR, your DAG will still link to them, so publishing the root to IPNS technically includes those changes. That said, those blocks may or may not be reachable if you don't save them in your account, so including them might be a better choice.
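
As a small illustration of that point, a replicator could check which blocks the published root links to but the account does not actually hold. This is only a sketch: the `Links` map and `missingFrom` are hypothetical, CIDs are plain strings, and a real implementation would have to decode blocks to discover their links.

```typescript
// cid -> cids it links to, for every block stored in the account's CARs
type Links = Map<string, string[]>;

// Walk the DAG from `root` and collect every linked CID that is not held.
function missingFrom(root: string, held: Links): string[] {
  const missing = new Set<string>();
  const seen = new Set<string>();
  const stack: string[] = [root];

  while (stack.length > 0) {
    const cid = stack.pop() as string;
    if (seen.has(cid)) continue;
    seen.add(cid);

    const links = held.get(cid);
    if (links === undefined) {
      // Linked from the DAG but not in any stored CAR, e.g. a remote change.
      missing.add(cid);
      continue;
    }
    stack.push(...links);
  }
  return Array.from(missing).sort();
}
```

Anything this returns is a block whose availability depends on someone else keeping it around, which is the case for adding those blocks to your own CARs.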
Right, I meant to say that they would [need to] be included in the CAR for that reason.
Creating an issue based on an offline interaction with Daniel, who is trying to build an OrbitDB-like sync protocol on top of web3.storage.
Quoting key points so we can continue discussion in the open