Closed Stebalien closed 2 years ago
(whether if this is a P0 for ntwk v16 is pending analysis.
To be continued tmr.
An alternative is to just migrate labels to utf8.
I would entirely discard this option. First, from a product perspective it makes adding binary metadata tricky. Given that Go accepts non-UTF-8 strings, and it's possible that those exist in state, we don't have an easy non-UTF-8 => UTF-8 migration path. We'd have to apply a transform which would manipulate the user-provided value, and that is definitely a no-go.
I think I've read all the past discussion across all links. I have one basic question. A byte array is a looser type than a string. Any string should be representable as a byte array. I'm assuming the bulk of the migration is to re-encode this type from a CBOR string (tags 2 and 3) to a byte slice (different tag). What exactly makes this migration difficult, and what's the reason to consider stopping the market temporarily?
@raulk exactly like you said! and that way we will not need to stop the market temporarily
@jennijuju Can someone please describe "the workable solution"?
@raulk
What exactly makes this migration difficult, and what's the reason to consider stopping the market temporarily?
The migration is trivial. The issue is that naively changing the Label
field in the DealProposal
from a string to bytes changes the serialization of the DealProposal
. The current flow is as follows (some aspects of this flow aren't protocol-specific, but might as well be because they reflect the reality of dealmaking on Filecoin today):
1) Client creates a DealProposal
, signs it, wraps the proposal and signature into a ClientDealProposal
, and sends that to the SP
2) SP deserializes the ClientDealProposal
, validates the signature, adds it to a batch, and eventually PublishStorageDeals
(sending the serialized ClientDealProposal
to the chain)
3) Validators of the Filecoin blockchain:
ClientDealProposal
into a DealProposal
and a signatureDealProposal
DealProposal
The issue is that all of the de/serializaton steps fail if they were done on opposite sides of the switch from string to bytes. When should clients start using bytes instead of strings? When should SPs start expecting clients to be using bytes instead of strings? What happens to all messages in the mempool that signed strings instead of bytes after the network upgrades?
Can someone please describe "the workable solution"?
The solution proposed by @Stebalien is to use a Union type, something like
struct Label {
label_string string
label_bytes []byte
}
We'd assert that exactly one of label_string
and label_bytes
is set, and de/serialize this object as though it were just one field -- "If label_string is set, serialize it as a string, if not, serialize it as bytes" (and the reverse for decoding). This is of course not the default CBOR encoding of such a struct, but that's fine.
Additionaly, we can start asserting that if label_string
is set, that it be valid UTF-8. Other users can simply set label_bytes
instead. The migration can be to copy all existing Label
s into label_string
iff they are valid UTF-8, and into label_bytes
otherwise.
This solves most of the problems described above. A large refactor may still be needed for client implementations if they don't abstract over possible DealProposal
, but at least there's no messy period over the upgrade where large swathes of PublishStorageDeals
messages are failing on-chain and large numbers of clients are seeing SPs reject their deal proposals as invalid.
Thanks for the detailed proposal. This would work if the union type is entirely transparent, that is, if it's not tagged in itself, and instead we use the CBOR tag in the encoded tuple position to discriminate which type is present. Is that what's intended? Does our DAG-CBOR library support such a serde pattern? Does Rust's serde (which we need to cooperate with)?
@Stebalien would be the one to confirm, but I'm guessing it's supported in both places, since we have such types already in the HAMT used by Filecoin (see, for example, here).
Closed by FIP 0027
See https://github.com/filecoin-project/FIPs/issues/187 and https://github.com/filecoin-project/FIPs/blob/master/FIPS/fip-0027.md.
At the moment, we reject non-utf8 labels. But we need to make a decision before we ship.