ceramicnetwork / js-ceramic

Typescript implementation of the Ceramic protocol
http://ceramic.network
Other
414 stars 127 forks source link

fix: unique metadata is not properly typed as a Uint8Array #3141

Closed dbcfd closed 8 months ago

dbcfd commented 8 months ago

multiformats will fail encoding unless this is specifically a Uint8array

PaulLeCam commented 8 months ago

Seems like this change fixes a symptom rather than the cause of the issue. Why isn't the cloned value a Uint8Array in the first place please?

dbcfd commented 8 months ago

Seems like this change fixes a symptom rather than the cause of the issue. Why isn't the cloned value a Uint8Array in the first place please?

Arrays coming through http client won't be a Uint8array. We would need to adjust the parser to force this to be a uint8array, but that would leak this logic into the http server, which doesn't need to care that it is a uint8array.

PaulLeCam commented 8 months ago

Arrays coming through http client won't be a Uint8array. We would need to adjust the parser to force this to be a uint8array, but that would leak this logic into the http server, which doesn't need to care that it is a uint8array.

That's why we encode the Uint8Arrays to base64 in the client, no? I don't think in any case the Uint8Arrays should be encoded as regular Arrays in the APIs, that would make us lose the information that it's binary data.

stbrody commented 8 months ago

We would need to adjust the parser to force this to be a uint8array, but that would leak this logic into the http server, which doesn't need to care that it is a uint8array.

Isn't the parser the exact right place to be doing type conversions and enforcement? Generally good practice is do all your parsing, validation, and type conversation at the edge of the system, then have something strongly typed and of the structure you want everywhere else inside the bounds of the system.

By "parser" here I'm talking about the codeco codec we use to explicitly parse messages once they come in to the http endpoint. That seems like the right place for this.

dbcfd commented 8 months ago

@PaulLeCam @stbrody It's possible it's being decoded correctly at the http layer, but then "lost" when we put it in stream state.

export interface StreamMetadata { controllers: Array<string> model?: StreamID context?: StreamID family?: string // deprecated schema?: string // deprecated tags?: Array<string> // deprecated forbidControllerChange?: boolean // deprecated, only used by TileDocument [index: string]: any // allow arbitrary properties }

Unique is in the arbitrary properties, so does not retain information that it is a Uint8array. Should unique be a field in StreamMetadata?

stbrody commented 8 months ago

Unique is in the arbitrary properties, so does not retain information that it is a Uint8array. Should unique be a field in StreamMetadata?

Yeah I'd be down with adding unique to StreamMetadata. Honestly I kind of feel like StreamMetadata should not allow arbitrary additional properties, each new field added to metadata should be added to the type explicitly. Tagging in @ukstv in case he disagrees on that

dbcfd commented 8 months ago

Unique is in the arbitrary properties, so does not retain information that it is a Uint8array. Should unique be a field in StreamMetadata?

Yeah I'd be down with adding unique to StreamMetadata. Honestly I kind of feel like StreamMetadata should not allow arbitrary additional properties, each new field added to metadata should be added to the type explicitly. Tagging in @ukstv in case he disagrees on that

Although we can do this, it doesn't fix the problem, since unique can be Uint8Array | string in GenesisHeader. But if we change that, we may still find unique as a Uint8Array since GenesisHeader extends CommitHeader which has [index: string]: any // allow support for future changes.

Additionally, the SET relation functionality specifies a unique with is an array of string joined by | and converted as a utf-8 to a Uint8array, whereas other places convert it to/from base64.

TLDR: This is the easiest fix, whereas other fixes cascade with impacts, possibly on the statestore.

stbrody commented 8 months ago

@dbcfd can you share what made you open this PR in the first place? Are you seeing some behavior in the wild? Is there a repro of the issue you can share?

dbcfd commented 8 months ago

@dbcfd can you share what made you open this PR in the first place? Are you seeing some behavior in the wild? Is there a repro of the issue you can share?

https://github.com/3box/ceramic-http-client-rs/pull/9 includes the ability to create single/deterministic models and model instances. Attempting to do so will fail because of this issue.

dbcfd commented 8 months ago

fixed by https://github.com/ceramicnetwork/js-ceramic/pull/3149 (or possibly another PR)