vacp2p / mvds

Minimal Viable Data Sync Implementation
https://specs.vac.dev/specs/mvds.html
MIT License
13 stars 3 forks source link

Updating the numeric ids of fields in Payload protobuffer causes status-go to incorrectly determine if a message is a valid datasync message or not #89

Closed richard-ramos closed 3 years ago

richard-ramos commented 3 years ago

This commit is causing status-go to fail at correctly decoding a message

Notice that both ApplicationMetadataMessage and Payload first two fields have the same data type bytes and field id numbers now:

// status-go/protocol/protobuf/application_metadata_message.proto
message ApplicationMetadataMessage {
  bytes signature = 1;
  bytes payload = 2;
  Type type = 3;
  ....
}

// mvds/protobuf/sync.proto
message Payload {
  repeated bytes acks = 1;
  repeated bytes offers = 2;
  repeated bytes requests = 3;
  repeated Message messages = 4;
}

In status-go, the code will attempt to determine if this bytearray is a datasync message or not in here . The bytearray can be either an ApplicationMetadataMessage or a Payload. If the bytearray is an ApplicationMetadataMessage the decoding will fail due to the similarity between both protobuffers. For example, with the following marshalled ApplicationMetadataMessage:

[10 65 123 150 176 120 3 88 96 220 49 0 175 237 144 156 230 192 216 19 36 2 210 146 63 210 36 153 97 141 181 161 155 55 68 12 81 36 117 213 4 195 28 152 5 94 71 0 122 97 51 137 177 105 42 96 75 147 123 1 177 255 13 48 105 22 0 18 42 8 137 171 216 139 160 147 35 16 178 169 158 133 255 44 26 6 97 98 99 49 50 51 50 13 116 101 115 116 105 110 103 45 97 100 97 109 98 56 2 64 1 24 1]

unwrap will not fail, succesfully unmarshalling a protobuffer containing invalid data:

acks:"{\x96\xb0x\x03X`\xdc1\x00\xaf퐜\xe6\xc0\xd8\x13$\x02Ғ?\xd2$\x99a\x8d\xb5\xa1\x9b7D\x0cQ$u\xd5\x04\xc3\x1c\x98\x05^G\x00za3\x89\xb1i*`K\x93{\x01\xb1\xff\r0i\x16\x00"  offers:"\x08\x89\xab؋\xa0\x93#\x10\xb2\xa9\x9e\x85\xff,\x1a\x06abc1232\rtesting-adamb8\x02@\x01"  3:1

Then, when IsValid() is executed, it will return true, mainly because the following code assumes that an unmarshalled Payload containing data is valid. This is an incorrect assumption to make since it is not possible to correctly distinguish the field types that compose a protobuffer while it is being unmarshalled. I'm not sure how to fix this, though https://github.com/vacp2p/mvds/blob/d6b27d3843ffae66806cc969596af342c6e7a839/protobuf/payload.go#L5-L7

It is not possible then to use this version of mvds in status-go for the time being unless we either change the id numbers in mvds, or change the logic in status-go to determine if some byte array is a datasync message or a normal message.

oskarth commented 3 years ago

Oh man, that's a long time ago. I don't have a strong opinion on this, pragmatically this is up to status-go as that's the only implementation using it. It isn't clear to me if the latest version is used there or not?

As for solution, whatever is simplest to get things to work in status-go seems fine by me. We can update the specs if that's easier.

cc @staheri14 as new editor for MVDS FYI

cammellos commented 3 years ago

@richard-ramos unfortunately there's no easy way to distinguish two protobuf messages as the protocol is not self-describing and information about the type of message is not transmitted over the wire, so in that way is ambiguous, you need to know which protobuf you are receiving, or you would have to fingerprint the message based on the fields, as we do. I think the commit above will need to be reverted, as status-go is using the old numbering, which allow distinguishing between the two messages, and in any case it would be a pain to make it backward compatible between clients and improve the fingerprinting, to my knowledge there's no production code using the new numbering, so there's should not be any impact.