Emurgo / cardano-serialization-lib

This is a library, written in Rust, for serialization & deserialization of data structures used in Cardano's Haskell implementation of Alonzo along with useful utility functions.

Datum hash changed after serialization roundtrip #313

Closed brunjlar closed 2 years ago

brunjlar commented 2 years ago

I have the following transaction, created with the Cardano-API:

ShelleyTx ShelleyBasedEraAlonzo (ValidatedTx {body = TxBodyConstr TxBodyRaw {_inputs = fromList [TxIn (TxId {_unTxId = SafeHash "259cf705d9a347bea54d3fc1853aea5ea0f6e48e6ca89f809585598a0d606d08"}) 1,TxIn (TxId {_unTxId = SafeHash "8c8572a808f452f1c17217670c2e3bfa42e6fa6a45145211f86aeeb1c298a374"}) 1,TxIn (TxId {_unTxId = SafeHash "a9d502de07fb2d8e99e1948a17a4909289dc0ec5fd49cac219f449612f9b8474"}) 0], _collateral = fromList [], _outputs = StrictSeq {fromStrict = fromList [(Addr Testnet (KeyHashObj (KeyHash "e1cbb80db89e292269aeb93ec15eb963dda5176b66949fe1c2a6a38d")) (StakeRefBase (KeyHashObj (KeyHash "1b930e9f7add78a174a21000e989ff551366dcd127028cb2aa39f616"))),Value 884852843 (fromList []),SNothing),(Addr Testnet (KeyHashObj (KeyHash "e1cbb80db89e292269aeb93ec15eb963dda5176b66949fe1c2a6a38d")) (StakeRefBase (KeyHashObj (KeyHash "1b930e9f7add78a174a21000e989ff551366dcd127028cb2aa39f616"))),Value 1930992 (fromList [(PolicyID {policyID = ScriptHash "0ce17ede312dcd14bdca75cf3e46b96f5dc6e24ebf7aa41e8b2e685e"},fromList [("TTT",123456)]),(PolicyID {policyID = ScriptHash "2e09608c2f9f942478ff7743b98ca1268fa2caa766b42325b072bcd7"},fromList [("Gold",999999900)]),(PolicyID {policyID = ScriptHash "42d75b0432a8a9d99ca395e054690be425684f7fef58c35e005a5b78"},fromList [("Test",123456789)]),(PolicyID {policyID = ScriptHash "b32f5693ea55500f7bd41045b52f08d2ae7f2fcb2d153584ab576db2"},fromList [("xxxxxx",999)])]),SNothing),(Addr Testnet (ScriptHashObj (ScriptHash "a653f77bfca4acf94cacfdd6d59d6965f57f3e2cc91136830b6ea8f5")) StakeRefNull,Value 1689618 (fromList [(PolicyID {policyID = ScriptHash "2e09608c2f9f942478ff7743b98ca1268fa2caa766b42325b072bcd7"},fromList [("Gold",100)])]),SJust (SafeHash "fe87814cd2b227260cd2c3d5c170fd7643f33a3ea90e64b119d70fbaee4f8d35"))]}, _certs = StrictSeq {fromStrict = fromList []}, _wdrls = Wdrl {unWdrl = fromList []}, _txfee = Coin 201537, _vldt = ValidityInterval {invalidBefore = SNothing, invalidHereafter = SNothing}, _update = SNothing, 
_reqSignerHashes = fromList [KeyHash "e1cbb80db89e292269aeb93ec15eb963dda5176b66949fe1c2a6a38d"], _mint = Value 0 (fromList []), _scriptIntegrityHash = SJust (SafeHash "6d0359d7143d033d222250b8be376dec73a9515aa2fde7ca452661aad6e37064"), _adHash = SNothing, _txnetworkid = SNothing}, wits = TxWitnessRaw {_txwitsVKey = fromList [WitVKey' {wvkKey' = VKey (VerKeyEd25519DSIGN "0717bc56ed4897c3dde0690e3d9ce61e28a55f520fde454f6b5b61305b193605"), wvkSig' = SignedDSIGN (SigEd25519DSIGN "3140493032ae5fe3c4eaa912e6f9e4ec1f1f18fa1f4df30db6221d0ed02a71f120667295f208cc714c56128713afb48721015ce8856580315594b79052e89202"), wvkKeyHash = KeyHash "e99ac432ee241d8d7311b17be68743a19dc7a6c33f8a80e44b664c4f", wvkBytes = "\130X \a\ETB\188V\237H\151\195\221\224i\SO=\156\230\RS(\165_R\SI\222EOk[a0[\EM6\ENQX@1@I02\174_\227\196\234\169\DC2\230\249\228\236\US\US\CAN\250\USM\243\r\182\"\GS\SO\208*q\241 fr\149\242\b\204qLV\DC2\135\DC3\175\180\135!\SOH\\\232\133e\128\&1U\148\183\144R\232\146\STX"}], _txwitsBoot = fromList [], _txscripts = fromList [], _txdats = TxDatsRaw (fromList [(SafeHash "fe87814cd2b227260cd2c3d5c170fd7643f33a3ea90e64b119d70fbaee4f8d35",DataConstr Constr 0 [B "\225\203\184\r\184\158)\"i\174\185>\193^\185c\221\165\ETBkf\148\159\225\194\166\163\141",Map [(B "",Map [(B "",I 100)])]])]), _txrdmrs = RedeemersRaw (fromList [])}, isValid = IsValid True, auxiliaryData = SNothing})

This transaction has one output to a script address with a datum hash of fe87814cd2b227260cd2c3d5c170fd7643f33a3ea90e64b119d70fbaee4f8d35.

The corresponding datum is included in the txdats field with this very hash.

Using the Cardano-API, serializing this to CBOR gives: 84a60083825820259cf705d9a347bea54d3fc1853aea5ea0f6e48e6ca89f809585598a0d606d08018258208c8572a808f452f1c17217670c2e3bfa42e6fa6a45145211f86aeeb1c298a37401825820a9d502de07fb2d8e99e1948a17a4909289dc0ec5fd49cac219f449612f9b8474000d80018382583900e1cbb80db89e292269aeb93ec15eb963dda5176b66949fe1c2a6a38d1b930e9f7add78a174a21000e989ff551366dcd127028cb2aa39f6161a34bdc86b82583900e1cbb80db89e292269aeb93ec15eb963dda5176b66949fe1c2a6a38d1b930e9f7add78a174a21000e989ff551366dcd127028cb2aa39f616821a001d76f0a4581c0ce17ede312dcd14bdca75cf3e46b96f5dc6e24ebf7aa41e8b2e685ea1435454541a0001e240581c2e09608c2f9f942478ff7743b98ca1268fa2caa766b42325b072bcd7a144476f6c641a3b9ac99c581c42d75b0432a8a9d99ca395e054690be425684f7fef58c35e005a5b78a144546573741a075bcd15581cb32f5693ea55500f7bd41045b52f08d2ae7f2fcb2d153584ab576db2a1467878787878781903e783581d70a653f77bfca4acf94cacfdd6d59d6965f57f3e2cc91136830b6ea8f5821a0019c812a1581c2e09608c2f9f942478ff7743b98ca1268fa2caa766b42325b072bcd7a144476f6c6418645820fe87814cd2b227260cd2c3d5c170fd7643f33a3ea90e64b119d70fbaee4f8d35021a000313410e81581ce1cbb80db89e292269aeb93ec15eb963dda5176b66949fe1c2a6a38d0b58206d0359d7143d033d222250b8be376dec73a9515aa2fde7ca452661aad6e37064a200818258200717bc56ed4897c3dde0690e3d9ce61e28a55f520fde454f6b5b61305b19360558403140493032ae5fe3c4eaa912e6f9e4ec1f1f18fa1f4df30db6221d0ed02a71f120667295f208cc714c56128713afb48721015ce8856580315594b79052e892020481d8799f581ce1cbb80db89e292269aeb93ec15eb963dda5176b66949fe1c2a6a38da140a1401864fff5f6.

When I deserialize this with Transaction.from_bytes and reserialize it with to_bytes, I get this different CBOR: 84a60083825820259cf705d9a347bea54d3fc1853aea5ea0f6e48e6ca89f809585598a0d606d08018258208c8572a808f452f1c17217670c2e3bfa42e6fa6a45145211f86aeeb1c298a37401825820a9d502de07fb2d8e99e1948a17a4909289dc0ec5fd49cac219f449612f9b847400018382583900e1cbb80db89e292269aeb93ec15eb963dda5176b66949fe1c2a6a38d1b930e9f7add78a174a21000e989ff551366dcd127028cb2aa39f6161a34bdc86b82583900e1cbb80db89e292269aeb93ec15eb963dda5176b66949fe1c2a6a38d1b930e9f7add78a174a21000e989ff551366dcd127028cb2aa39f616821a001d76f0a4581c0ce17ede312dcd14bdca75cf3e46b96f5dc6e24ebf7aa41e8b2e685ea1435454541a0001e240581c2e09608c2f9f942478ff7743b98ca1268fa2caa766b42325b072bcd7a144476f6c641a3b9ac99c581c42d75b0432a8a9d99ca395e054690be425684f7fef58c35e005a5b78a144546573741a075bcd15581cb32f5693ea55500f7bd41045b52f08d2ae7f2fcb2d153584ab576db2a1467878787878781903e783581d70a653f77bfca4acf94cacfdd6d59d6965f57f3e2cc91136830b6ea8f5821a0019c812a1581c2e09608c2f9f942478ff7743b98ca1268fa2caa766b42325b072bcd7a144476f6c6418645820fe87814cd2b227260cd2c3d5c170fd7643f33a3ea90e64b119d70fbaee4f8d35021a000313410b58206d0359d7143d033d222250b8be376dec73a9515aa2fde7ca452661aad6e370640d800e81581ce1cbb80db89e292269aeb93ec15eb963dda5176b66949fe1c2a6a38da200818258200717bc56ed4897c3dde0690e3d9ce61e28a55f520fde454f6b5b61305b19360558403140493032ae5fe3c4eaa912e6f9e4ec1f1f18fa1f4df30db6221d0ed02a71f120667295f208cc714c56128713afb48721015ce8856580315594b79052e892020481d87982581ce1cbb80db89e292269aeb93ec15eb963dda5176b66949fe1c2a6a38da140a1401864f5f6.

And when I use the Cardano-API to deserialize that CBOR, I get the following transaction:

ShelleyTx ShelleyBasedEraAlonzo (ValidatedTx {body = TxBodyConstr TxBodyRaw {_inputs = fromList [TxIn (TxId {_unTxId = SafeHash "259cf705d9a347bea54d3fc1853aea5ea0f6e48e6ca89f809585598a0d606d08"}) 1,TxIn (TxId {_unTxId = SafeHash "8c8572a808f452f1c17217670c2e3bfa42e6fa6a45145211f86aeeb1c298a374"}) 1,TxIn (TxId {_unTxId = SafeHash "a9d502de07fb2d8e99e1948a17a4909289dc0ec5fd49cac219f449612f9b8474"}) 0], _collateral = fromList [], _outputs = StrictSeq {fromStrict = fromList [(Addr Testnet (KeyHashObj (KeyHash "e1cbb80db89e292269aeb93ec15eb963dda5176b66949fe1c2a6a38d")) (StakeRefBase (KeyHashObj (KeyHash "1b930e9f7add78a174a21000e989ff551366dcd127028cb2aa39f616"))),Value 884852843 (fromList []),SNothing),(Addr Testnet (KeyHashObj (KeyHash "e1cbb80db89e292269aeb93ec15eb963dda5176b66949fe1c2a6a38d")) (StakeRefBase (KeyHashObj (KeyHash "1b930e9f7add78a174a21000e989ff551366dcd127028cb2aa39f616"))),Value 1930992 (fromList [(PolicyID {policyID = ScriptHash "0ce17ede312dcd14bdca75cf3e46b96f5dc6e24ebf7aa41e8b2e685e"},fromList [("TTT",123456)]),(PolicyID {policyID = ScriptHash "2e09608c2f9f942478ff7743b98ca1268fa2caa766b42325b072bcd7"},fromList [("Gold",999999900)]),(PolicyID {policyID = ScriptHash "42d75b0432a8a9d99ca395e054690be425684f7fef58c35e005a5b78"},fromList [("Test",123456789)]),(PolicyID {policyID = ScriptHash "b32f5693ea55500f7bd41045b52f08d2ae7f2fcb2d153584ab576db2"},fromList [("xxxxxx",999)])]),SNothing),(Addr Testnet (ScriptHashObj (ScriptHash "a653f77bfca4acf94cacfdd6d59d6965f57f3e2cc91136830b6ea8f5")) StakeRefNull,Value 1689618 (fromList [(PolicyID {policyID = ScriptHash "2e09608c2f9f942478ff7743b98ca1268fa2caa766b42325b072bcd7"},fromList [("Gold",100)])]),SJust (SafeHash "fe87814cd2b227260cd2c3d5c170fd7643f33a3ea90e64b119d70fbaee4f8d35"))]}, _certs = StrictSeq {fromStrict = fromList []}, _wdrls = Wdrl {unWdrl = fromList []}, _txfee = Coin 201537, _vldt = ValidityInterval {invalidBefore = SNothing, invalidHereafter = SNothing}, _update = SNothing, 
_reqSignerHashes = fromList [KeyHash "e1cbb80db89e292269aeb93ec15eb963dda5176b66949fe1c2a6a38d"], _mint = Value 0 (fromList []), _scriptIntegrityHash = SJust (SafeHash "6d0359d7143d033d222250b8be376dec73a9515aa2fde7ca452661aad6e37064"), _adHash = SNothing, _txnetworkid = SNothing}, wits = TxWitnessRaw {_txwitsVKey = fromList [WitVKey' {wvkKey' = VKey (VerKeyEd25519DSIGN "0717bc56ed4897c3dde0690e3d9ce61e28a55f520fde454f6b5b61305b193605"), wvkSig' = SignedDSIGN (SigEd25519DSIGN "3140493032ae5fe3c4eaa912e6f9e4ec1f1f18fa1f4df30db6221d0ed02a71f120667295f208cc714c56128713afb48721015ce8856580315594b79052e89202"), wvkKeyHash = KeyHash "e99ac432ee241d8d7311b17be68743a19dc7a6c33f8a80e44b664c4f", wvkBytes = "\130X \a\ETB\188V\237H\151\195\221\224i\SO=\156\230\RS(\165_R\SI\222EOk[a0[\EM6\ENQX@1@I02\174_\227\196\234\169\DC2\230\249\228\236\US\US\CAN\250\USM\243\r\182\"\GS\SO\208*q\241 fr\149\242\b\204qLV\DC2\135\DC3\175\180\135!\SOH\\\232\133e\128\&1U\148\183\144R\232\146\STX"}], _txwitsBoot = fromList [], _txscripts = fromList [], _txdats = TxDatsRaw (fromList [(SafeHash "1ff695c54bd64ced4b681756423db67ff357b2f080a9854c7013a5c4960a8cb2",DataConstr Constr 0 [B "\225\203\184\r\184\158)\"i\174\185>\193^\185c\221\165\ETBkf\148\159\225\194\166\163\141",Map [(B "",Map [(B "",I 100)])]])]), _txrdmrs = RedeemersRaw (fromList [])}, isValid = IsValid True, auxiliaryData = SNothing})

The datum hash mentioned in the script output is as before (as it should be), but the hash in the txdats field has changed to 1ff695c54bd64ced4b681756423db67ff357b2f080a9854c7013a5c4960a8cb2. This seems to be a bug, and it causes problems down the line when trying to sign and submit this transaction.

rooooooooob commented 2 years ago

In case anyone else is looking at this in the meantime I just want to post the CBOR differences I noticed:

Cardano-API:

cardano-serialization-lib:

We encoded ours using the canonical CBOR form that's used for hashing.
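For readers following along, the datum difference can be read off the two hex dumps in the original report. In the witness set (key 04), the Cardano-API output frames the constructor's field list as an indefinite-length array (9f ... ff), while the re-serialized output uses a definite-length array of two items (82 ...). Below is a minimal Python sketch of that comparison. It assumes the datum hash is Blake2b-256 over the datum's exact CBOR bytes, which this thread does not spell out, and the variable and function names are made up for illustration.

```python
import hashlib

# Datum bytes copied from the two hex dumps in this issue (witness-set key 04).
KEY_HASH = "e1cbb80db89e292269aeb93ec15eb963dda5176b66949fe1c2a6a38d"
FIELDS = "581c" + KEY_HASH + "a140a1401864"  # bytes(28), then {h'': {h'': 100}}

# Cardano-API: tag 121, indefinite-length array (0x9f ... 0xff).
indefinite = bytes.fromhex("d8799f" + FIELDS + "ff")
# cardano-serialization-lib roundtrip: tag 121, definite array of 2 (0x82).
definite = bytes.fromhex("d87982" + FIELDS)


def blake2b_256(data: bytes) -> str:
    """Hex digest of Blake2b with a 32-byte output (assumed hash function)."""
    return hashlib.blake2b(data, digest_size=32).hexdigest()


if __name__ == "__main__":
    print("indefinite:", blake2b_256(indefinite))
    print("definite:  ", blake2b_256(definite))
```

Both byte strings describe the same Plutus data, but the bytes differ, so the digests differ. Under the assumption above, the two digests would be expected to correspond to the fe87... and 1ff6... hashes reported in this issue.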

SebastienGllmt commented 2 years ago

This is unfortunately expected. Because of the way Cardano & CBOR work, roundtrip is never guaranteed to work, and the advice of IOHK has always been to just never do roundtrip anywhere. If a certain use case really requires roundtrip support, your only option is to create a CIP to define some canonical encoding, such as the one for hardware wallets, and then hope projects support it.

That doesn't mean we shouldn't do our best to try and make things just magically work, but you shouldn't rely on it if possible.

rooooooooob commented 2 years ago

Isn't CBOR that is meant to be hashed (in the context of Cardano) supposed to follow the general canonical CBOR format anyway? That is the one with definite lengths and (I think) lexicographic byte ordering of map keys. If a different hash is being generated when this goes back into the Cardano API, then it sounds like there might be a bug there (not canonicalizing before hashing?), unless I'm misunderstanding things or thinking about a different hash. Looking at the CBOR, the differences I saw seemed to show we were encoding in canonical form.
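As an illustration of the key-ordering rule mentioned here: the canonical form in RFC 7049 sorts map keys by their encoded bytes, shorter encodings first and then bytewise. The small Python sketch below applies that rule to the six transaction-body keys in the order they appear in the Cardano-API hex dump above (00, 0d, 01, 02, 0e, 0b). The helper name is hypothetical.

```python
def canonical_key_order(encoded_keys):
    """Sort CBOR-encoded map keys: shorter encodings first, then bytewise."""
    return sorted(encoded_keys, key=lambda k: (len(k), k))


# Body map keys in the order the Cardano-API emitted them in this issue.
cardano_api_order = [bytes([k]) for k in (0x00, 0x0D, 0x01, 0x02, 0x0E, 0x0B)]

if __name__ == "__main__":
    ordered = canonical_key_order(cardano_api_order)
    print([k.hex() for k in ordered])  # ['00', '01', '02', '0b', '0d', '0e']
```

The sorted order (00, 01, 02, 0b, 0d, 0e) is the order in which the keys appear in the re-serialized hex dump.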

brunjlar commented 2 years ago

Thank you for your replies, @rooooooooob & @SebastienGllmt!

The issue is not the roundtrip per se, @SebastienGllmt, I just deserialized the transaction back in the Cardano API to try and understand why I could not sign and submit the transaction successfully.

The issue is that starting from a valid (albeit unsigned) transaction, when I deserialize it using the cardano-serialization-lib, then take it apart (using body() etc.), sign it and reassemble it, I get an invalid transaction because of this hash mismatch.

It doesn't matter (to me) if the hash changes due to different CBOR-serialization, but I would expect it to stay consistent - so if the hash changes in the txdats-field, I would expect it to also change in the transaction output.

Can you think of any way to work around this?

brunjlar commented 2 years ago

I see that in the plutus repository, a "clever" ad-hoc way of CBOR-serialisation is used: https://github.com/input-output-hk/plutus/blob/c8c5183f7facd967d48fe07b3b14465b8dd48fe7/plutus-core/plutus-core/src/PlutusCore/Data.hs#L56

Does this - in your opinion - contradict the specification? If so, could you please point me to the contradicting specification? If not, the cardano-serialization-lib should implement the same serialization in my opinion, seeing as this is really important for hash consistency.

rooooooooob commented 2 years ago

@brunjlar What version are you using? I looked into this again and I think the root problem here is something we ran into before. I just didn't remember and put two and two together yesterday, since I was too busy focusing on the fact that the tx body is different too. I believe that part is indeed canonicalized in IOHK's code when computing the hash, using the same encoding we export to, so the txid stayed the same and that wasn't an issue. The same thing is not done for Plutus datums, however, it seems. We had issue #227, resolved in #228, which I think is what's causing your problem. The fix will be included in release 10.0.0, so try the 10.0.0-beta-8 release.

I'll post an update in this issue soon, once I find out whether the datum hash was supposed to be in canonical form or whether we have to either a) preserve the exact format seen when deserializing, in case we re-serialize, or b) conform to some arbitrary way the cardano-node does it.

Hopefully in the meantime trying that 10.0.0-beta helps.

rooooooooob commented 2 years ago

I see that in the plutus repository, a "clever" ad-hoc way of CBOR-serialisation is used

In case you're referring to the bulk of those comments, which involve either the compact tags (always worked, but became less error-prone with #250 / 10.0.0) or the 64-byte workaround cleverness (as of #193 / 8.1.0), we do support them.

brunjlar commented 2 years ago

Thank you very much, @rooooooooob. I upgraded to 10.0.0-beta.8, and that fixes the problem.

(I did have a workaround before, but that involved going back to Haskell and fixing the transaction there by correcting all the hashes.)

brunjlar commented 2 years ago

I tried it with another transaction, where there is a public key output with an attached datum, and the problem reappeared. Is it possible that even version 10.0.0-beta.8 does not handle the case of public key outputs with attached datum, only script outputs?

rooooooooob commented 2 years ago

I looked into it, and it looks like the correct answer to the above was a): there is no specific format and no canonicalization should be done, so all tools must respect whatever format they are given. I was just thinking it might be a canonicalization issue at first, as we had that happen before, where it turned out to be a bug in the canonicalization in cardano-node. #228 just made the Plutus list fall in line with cardano-cli, but that wasn't the root problem, just a workaround to make it consistent with Plutus lists created by cardano-cli. I'm working on a fix for this right now, and I'll hopefully update this later today with a PR.
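One way to "respect whatever format they are given" is to keep the bytes a value was decoded from and re-emit them verbatim unless the value is modified. The Python sketch below illustrates the idea only. It is not necessarily the approach taken in cardano-serialization-lib or in any fix PR, and `PreservedDatum`, `toy_decode`, and `toy_encode` are hypothetical names with a toy codec standing in for CBOR.

```python
from dataclasses import dataclass
from typing import Callable, Optional


def toy_decode(raw: bytes) -> int:
    """Toy codec standing in for CBOR: b"007" -> 7."""
    return int(raw.decode("ascii"))


def toy_encode(value: int) -> bytes:
    """Toy 'canonical' encoder: 7 -> b"7" (drops leading zeros)."""
    return str(value).encode("ascii")


@dataclass
class PreservedDatum:
    """A decoded value that remembers the exact bytes it came from."""
    value: object
    original_bytes: Optional[bytes] = None

    @classmethod
    def from_bytes(cls, raw: bytes, decode: Callable[[bytes], object]):
        # Keep the wire bytes alongside the decoded value.
        return cls(value=decode(raw), original_bytes=raw)

    def set_value(self, value: object) -> None:
        # Any modification invalidates the cached encoding.
        self.value = value
        self.original_bytes = None

    def to_bytes(self, encode: Callable[[object], bytes]) -> bytes:
        # Re-emit the original bytes when available; otherwise encode afresh.
        if self.original_bytes is not None:
            return self.original_bytes
        return encode(self.value)


if __name__ == "__main__":
    datum = PreservedDatum.from_bytes(b"007", toy_decode)
    print(datum.to_bytes(toy_encode))  # original bytes, not re-encoded b"7"
```

Because a hash is computed over the emitted bytes, an untouched value keeps its original hash in this design, while a modified value is encoded afresh.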

rooooooooob commented 2 years ago

See #317 for a potential fix PR