ipld / specs

Content-addressed, authenticated, immutable data structures

Simplified Graphsync protocol #92

Closed: vmx closed this issue 5 years ago

vmx commented 5 years ago

I thought we agreed on making the Graphsync protocol as simple as possible, with a way to extend it for future needs.

The current proposals that are floating around are not what I had in mind. For me, a "simple protocol" means you send a selector and you get several blocks back.

The request would be similar to what we have (I'd remove the priority, but that's a minor thing). But the response would just be:

message Response {
  int32 id = 1;     // the request id
  int32 status = 2; // a status code.
  enum Kind {
    Block = 0;
  }
  Kind kind = 3; // What kind of message it is
  bytes data = 4;
}

If the kind is Block, the data field would be a Block containing the CID and the actual data. I think that's all we would need for a first version of the protocol.

It's future-proof, as we could just add another enum value. For example, if we want to return multiple responses with several blocks (which is part of the current proposal), just put all that data into the data field and add a new kind. You only need to make sure that the id is set to one of the requests that is part of the data field.
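To make the extension idea concrete, here is a minimal sketch of how a receiver might dispatch on the kind field. It is only an illustration under assumptions: the `HANDLERS` table, `handle_block`, and `dispatch` are hypothetical names, not part of any proposal or implementation, and the block payload is left undecoded because its encoding is not settled here.

```python
from dataclasses import dataclass

KIND_BLOCK = 0  # the only kind in the proposed first version


@dataclass
class Response:
    id: int      # the request id
    status: int  # a status code
    kind: int    # raw enum value from the wire; may be unknown to this peer
    data: bytes  # payload whose meaning depends on kind


def handle_block(resp: Response) -> str:
    # Hypothetical handler: a real one would decode the CID and block bytes.
    return f"block for request {resp.id} ({len(resp.data)} bytes)"


# Supporting a new kind later means registering one more handler here.
HANDLERS = {KIND_BLOCK: handle_block}


def dispatch(resp: Response) -> str:
    handler = HANDLERS.get(resp.kind)
    if handler is None:
        # A peer that predates a newer kind can skip it instead of failing.
        return f"ignored unknown kind {resp.kind}"
    return handler(resp)
```

With this shape, a later "multiple blocks" kind would add one entry to the table and leave the existing Block path untouched.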

hannahhoward commented 5 years ago

This is similar to what I have in code right now (except I still have the extra block from the original proposal and no kind value)

Even though I am the implementer of this in Go, I mostly have no dog in this fight. I mainly want to settle on something that satisfies:

  1. @Stebalien's desire for block de-duplication, assuming that is still important.
  2. @warpfork's need for enough information to do the verification.

Beyond that, I support whatever folks want and can adjust my implementation as necessary.

For reference, here is what my actual protobuf code looks like ATM:

message Message {

  message Request {
    int32 id = 1;       // unique id set on the requester side
    bytes root = 2;     // ipld root node for selector
    bytes selector = 3; // ipld selector to retrieve
    bytes extra = 4;    // aux information. useful for other protocols
    int32 priority = 5; // the priority (normalized). default to 1
    bool  cancel = 6;   // whether this cancels a request
  }

  message Response {
    int32 id = 1;     // the request id
    int32 status = 2; // a status code.
    bytes data = 3; // core response data
    bytes extra = 4; // additional data
  }

  // the actual data included in this message
  repeated Request reqlist = 1 [(gogoproto.nullable) = false];
  repeated Response reslist = 2 [(gogoproto.nullable) = false];
}

whyrusleeping commented 5 years ago

@hannahhoward That looks good, though I would say that the 'root' and 'selector' should probably be just one field, and you will want a 'cid prefix' field in the response, so that we know how to hash and validate the data coming back.
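As a rough illustration of why the prefix matters: the receiver needs to know which hash function and digest length to apply before it can compare the returned bytes against the link it followed. The sketch below is an assumption-laden simplification, not the real CID library API. `Prefix`, `digest_for`, and `validate` are hypothetical names, and only two hash functions are handled (multihash codes 0x12 for sha2-256 and 0x13 for sha2-512).

```python
import hashlib
from dataclasses import dataclass

# Multihash function codes for the two hashes this sketch supports.
HASHERS = {0x12: hashlib.sha256, 0x13: hashlib.sha512}


@dataclass(frozen=True)
class Prefix:
    version: int    # CID version
    codec: int      # content type code, e.g. 0x55 for raw bytes
    mh_type: int    # multihash function code
    mh_length: int  # digest length in bytes


def digest_for(prefix: Prefix, data: bytes) -> bytes:
    """Hash data the way the prefix says the sender did."""
    hasher = HASHERS.get(prefix.mh_type)
    if hasher is None:
        raise ValueError(f"unsupported hash function code {prefix.mh_type:#x}")
    return hasher(data).digest()[: prefix.mh_length]


def validate(prefix: Prefix, data: bytes, expected_digest: bytes) -> bool:
    """Return True if data hashes to the digest taken from the followed link."""
    return digest_for(prefix, data) == expected_digest
```

Without the prefix in the response, the receiver would have to guess the hash function or carry that information along some other way.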

vmx commented 5 years ago

The reason why I would postpone the "multiple requests in one message" is to keep things simple.

My hope is that you wouldn't actually need the request and response ID when coding this; it would be abstracted away by using libp2p (though I don't know if libp2p supports something like this, nor if libp2p is meant to abstract such things away).

The code for handling responses (the acceptor of responses, so to speak) gains a lot of complexity if a response may contain the data of several requests. That might make sense as an optimization, but at the current stage of things I'm not even sure this optimization is practically worth it.
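A small sketch of the extra bookkeeping this implies, under assumptions: once one message may carry data for several requests, the receiver has to route each entry to the right in-flight request and decide what to do with ids it does not know. The `route` function and its error behavior are hypothetical and only meant to show the shape of the problem.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Response:
    id: int      # the request id
    status: int  # a status code
    data: bytes  # core response data


def route(responses, open_requests):
    """Group response payloads by request id, rejecting unknown ids."""
    routed = defaultdict(list)
    for resp in responses:
        if resp.id not in open_requests:
            # With one request per stream, this check would not be needed.
            raise KeyError(f"response for unknown request {resp.id}")
        routed[resp.id].append(resp.data)
    return dict(routed)
```

In the one-request-per-response case none of this routing exists: every payload belongs to the single request the receiver is waiting on.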

momack2 commented 5 years ago

My understanding from our conversation last week was that the wire protocol will use one field for the root and selector, and then on the selectors side there will be two messages where one breaks apart the root and path and then calls the recursive function to walk through the path with the root as a separate input. Is that vaguely correct @warpfork?

Seems like the wire protocol can support multiple responses in one message even if the validation step inside selectors only supports responses for the same request id in v1, no? That allows us to add in support without changing the interface/wire protocol later.

vmx commented 5 years ago

> That allows us to add in support without changing the interface/wire protocol later.

We can also add support for multiple responses in the same message later on with the proposal outlined in this issue. We don't lose anything, but we gain simplicity for the case we need right now (as @hannahhoward mentions, that's even what she has already, so why spend more time on something that we don't need in the near future?).

vmx commented 5 years ago

The distilled version of my vision can be read at https://github.com/ipld/specs/issues/101. Hence closing this issue.