storacha-network / w3up

⁂ w3up protocol implementation
https://github.com/storacha-network/specs

Redesign upload <-> shard relations #1249

Open Gozala opened 7 months ago

Gozala commented 7 months ago

In the initial API design an upload was meant to be just a pointer to a content root, and shards were meant to be metadata about dependencies. I was not thinking of them as "this is where the content blocks are", but rather as causal dependencies: this content depends on those things.

We did, however, end up using them to describe where the content blocks are, and at this point linking to other shards is likely going to be problematic. Furthermore, we later discovered a need to mutate shards: if you create two uploads with the same root and different shards, they get merged and the shards become the union of the shards across them.

While our usage has evolved, our DB design has not: we store an upload as a single record, and when we receive an upload with the same root we update the existing record with the new shards.
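A rough sketch of the record shape implied above (the field names are assumptions for illustration, not the actual table definition):

type UploadRecord {
  space: SpaceDID
  root: Link     // content root of the upload
  shards: Link[] // union of shards across every upload/add for this root
}
// upload/add for an existing root mutates this single record in place,
// so the record grows with the number of shards.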

Problem

The mismatch between the API design, its usage, and the implementation has major drawbacks, which I list below. We can address them by evolving the design to match our usage and adjusting the implementation accordingly.

  1. Storing shards in the same upload record imposes a limit on the number of shards you can have. This is not just hypothetical: we have seen this issue come up in production, preventing uploads from happening.
  2. There is no good way to delete uploads. Our CLI (and soon the client) simply deletes all the shards for the upload and hopes that no other upload references them (see the sketch after this list). If other uploads do refer to those shards, they will get corrupted. Worse yet, our DB schema prevents us from even telling whether there are other uploads that would be affected.
  3. Uploads can only ever add shards, never remove them. This does not seem like a big problem on its own, but it does prevent us from modeling upload <-> shard relations that can be updated, which would be required to address point 2.
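Roughly what the delete flow described in point 2 looks like today (pseudocode based on the behaviour above, not the actual CLI source):

// for a given content root:
//   1. look up the upload record and its `shards` list
//   2. invoke store/remove for every shard, even though another upload in the
//      same space may still reference it
//   3. invoke upload/remove for the root
// A safe version would need to know which other uploads reference each shard,
// which the current single-record schema cannot answer.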

Proposal

I think we should consider modeling upload <-> shard relations explicitly, as opposed to the shard-set-merging workaround we use today. To do so I propose the following.

  1. Optional: Introduce a separate cause: Link[] field to serve the originally intended purpose (I'll elaborate below on why we would even want this).
  2. Introduce explicit operations for linking / unlinking shards, e.g. through upload/shard/add and upload/shard/remove capabilities (see the sketch after this list). We could still keep the optional shards field on upload/add for backwards compatibility and retain the current "shards merge" behavior; it would simply exercise upload/shard/add behind the scenes.
  3. Update our DB schema so that we have a separate table for tracking the upload <-> shard many-to-many relation.
  4. Make upload/remove retract upload <-> shard relations.
  5. Update our upload/remove, or perhaps add another upload/delete capability, to additionally remove shards that have no other relations.
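A hedged sketch of what the capabilities in point 2 could look like, in the same style as the relation/* sketch further down; the nb fields are assumptions, not a settled design:

type UploadShardAdd {
  can: "upload/shard/add"
  with: SpaceDID
  nb: {
    root: Link     // content root of the upload the shards are linked to
    shards: Link[] // CAR shards to link to that upload
  }
}

type UploadShardRemove {
  can: "upload/shard/remove"
  with: SpaceDID
  nb: {
    root: Link
    shards: Link[] // CAR shards to unlink; the shards themselves are not deleted
  }
}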

Cause

The reason one may want to have a cause field is to model all kinds of things like CRDTs or even simpler relations. Here is a very practical use case that only involves our APIs: if I want to have two uploads referencing the exact same content as separate entries, with different links to shards, I have no way to do it.

The cause field could be used to make one upload different from the other and capture the cause for doing so.
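For illustration, one possible shape, assuming cause is added to upload/add alongside the existing fields (a sketch, not a settled design):

type UploadAdd {
  can: "upload/add"
  with: SpaceDID
  nb: {
    root: Link     // content root
    shards: Link[] // optional, kept for backwards compatibility
    cause: Link[]  // optional: links to whatever caused this upload, making two
                   // uploads with the same root distinguishable entries
  }
}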

If you are thinking "but why would anyone want two different uploads with the same root yet different shards?", consider the following. First of all, "upload" is the only primitive we offer to describe a DAG (perhaps we should consider introducing more), which means it is the only way to model PARTIAL DAGs. If an app wants to share access to part of a DAG, it currently can't without creating some other root and doing additional bookkeeping on its own.

Another take on this: imagine I have some DAG that I have sharded one way, and later I decide to shard it another way. Having a single upload containing shards from both approaches won't be great, because the assembler will end up with more shards than it needs and will have to discard some. More generally, those are two different views of the DAG, and unifying them into one is just a bad idea. Partial DAGs are similarly just different (partial) views of the DAG.

P.S. I could be wrong to suggest catching all the above cases with cause, and perhaps some could use a different attribute for it, but even so those could be introduced later.

alanshaw commented 7 months ago

In general I’m kinda down with this, but I don’t think anyone has (yet) sharded differently and then wanted to record another upload for the same root with a different shard set...and I would wager that it's not a common use case.

The issue of not being able to remove a shard can be easily worked around by deleting the upload and then adding a new one.

We have run into this exactly once in production, and it was accidental, i.e. not intended. If we had actual users wanting to register more shards than we can currently accept I'd be inclined to prioritise this more, but until then I'm hesitant.

I think it is reasonable to have a limit on the number of shards; we currently allow thousands. I'm starting to think this is a userland problem that should be recorded in a DAG (saved to a user space) rather than us storing this information for users at all!

vasco-santos commented 7 months ago

I would also push for leaving this as state the user should manage, instead of w3up incurring the costs and trying to be versatile for all use cases. I'm also not particularly a fan of having things without limits, which means I'm actually happy with the row limits that keep an upload from having unlimited shards out of the box.

Regarding removes: as I mentioned in the linked PR, while feature parity with the CLI was the goal there, I also raised that, honestly, I think we should consider not providing a high-level function in the client. It is not an atomic op, it may leave partial mutations on failure, and, as pointed out, a shard that is part of multiple uploads may also be an issue. In other words, we can get rid of the complexity described in this issue if we make all mutations granular, as user operations, instead of high-level functions (the thing that is most important to make super easy is an upload).

travis commented 7 months ago

Generally in support of this plan!

The most important thing it does, imho, is remove the arbitrary maximum upload size limitation the current design imposes by limiting the number of shards. This could also be mitigated by creating larger shards - afaik there's a maximum size those can be, but I believe the current max number of shards times the current max shard size gives a maximum upload size that is quite large and probably a reasonable limitation for now.
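For a sense of scale, purely illustrative arithmetic (neither number is the actual service limit):

// max upload size ≈ max shards per upload record × max shard size
// e.g. 4,000 shards × 4 GiB per shard ≈ 16 TiB per upload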

I am concerned that we have some code that relies on being able to get all of the shards in one query, and I'm a little concerned about how this change would affect that code - this is an unknown-unknown at the moment, so imho we should investigate this and convince ourselves it's not a problem before going too deep on this implementation.

Gozala commented 7 months ago

I think the larger underlying problem is that there is no simple way for a user to track which shards they can remove and which they probably shouldn't. The only way to do this would be to go through all of their uploads, make sure the shard isn't used by anything else, and only then delete it.

On a separate note, we do need to know the shards for a content CID in order to assemble the corresponding DAG; right now that assumes clients correctly reported the shards to us.

There is also an overlap between content claims and shards here, so it may make sense to consolidate.

At the end of the day it boils down to the fact that there is an m:n relation between roots and shards, and there are two separate use cases.

  1. Resolve shards from the root to assemble the DAG.
  2. Resolve roots for a shard to assess whether it is safe to delete that shard.

Our current setup

  1. Kind of works for the 1st use case, and is likely to be superseded by content claims sooner or later.
  2. Works poorly for the 2nd use case and is prone to corruption (see the relation-table sketch below).
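One possible shape for the separate relation table from proposal point 3 (field names are illustrative):

type UploadShardRelation {
  space: SpaceDID
  root: Link  // upload content root
  shard: Link // CAR shard CID
}
// Use case 1: query by (space, root)  -> every shard needed to assemble the DAG.
// Use case 2: query by (space, shard) -> every root still referencing the shard;
//             a shard is only safe to delete once this set is empty.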

I'd love to push the second use case to userland, however in that case we probably need to provide some API for mutable sets. In fact upload/shard/add and upload/shard/remove were kind of that, but perhaps we should just create a more primitive API for managing sets, e.g.


type RelationUpdate {
  can: "relation/update"
  with: SpaceDID
  nb: {
    path: string, // basically a hierarchical namespace to allow delegations
    node: Link, // could be part of the path but in practice will require parsing so why not just have it separate
    add: Link[] // relations to other nodes being added
    remove: Link[] // relations to other nodes being removed
  }
} 
// Updates the CBOR CID set block keyed by `${node}@${path}`:
// Decodes the existing block as a CBOR array and treats it as a set
// Removes all the links in `nb.remove` from the set
// Adds all the links in `nb.add` to the set
// Sorts the remaining links alphabetically and encodes them as a CBOR block
// Writes the new CBOR block to `${node}@${path}`
// Returns the CID of the new CBOR block

type RelationGet {
  can: "relation/get"
  with: SpaceDID
  nb: {
    path: string
    node: Link
  }
}
// Returns CID of the block at `${node}@${path}`.
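A hypothetical invocation showing how this generic set API could cover the upload <-> shard relation in userland (the path and CIDs below are placeholders):

{
  can: "relation/update"
  with: "did:key:z6Mk..."       // space DID (placeholder)
  nb: {
    path: "upload/shards"       // namespace the app picks for this relation
    node: "bafy...contentRoot"  // the upload's content root (placeholder CID)
    add: ["bag...shard-1", "bag...shard-2"] // shard CIDs being linked to the root
    remove: []
  }
}
// The result is the CID of the new CBOR set block keyed by
// `${node}@upload/shards`, which relation/get can later resolve.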