`tuple index 4: not enough bytes remaining: 354 < 384`

hawkaa commented 1 year ago

Hi @esaulpaugh and thanks again for an excellent library.

We're having some troubles using the library on this transaction: 0x996a15ed678a5276dbf446587a9bf53f90c3b961b7f711d58a5ec74d44b7d1c2

As seen in the parity traces number 2, the trace goes to 0xcd9dab5e666de980cecdc180cb31f296733e2587 with the following input data:

0x128acb08000000000000000000000000da3a20aad0c34fa742bd9813d45bbf67c787ae0b0000000000000000000000000000000000000000000000000000000000000000fffffffffffffffffffffffffffffffffffffffffdb9328a9ba5635a80846cdf000000000000000000000000fffd8963efd1fc6a506488495d951d5263988d2500000000000000000000000000000000000000000000000000000000000000a0000000000000000000000000000000000000000000000000000000000000016200163301ee63fb29f863f2333bd4466acb46cd8323e620cd9dab5e666de980cecdc180cb31f296733e258700000246cd75645a9ca57f7b9320004bb0641100000000000000beebe34d7e60da18da3a20aad0c34fa742bd9813d45bbf67c787ae0b0000000000000000c34333f9af9c04005800132f152689a72b0a484b49a37fc85a862006f4138f1b000000000000000a31acd25057ef0000000000e17c19c8ee8c58656d0035189628a3e880960bd07f47bdfeb91de21a4c9b431a00000000000000000dbaac192510fe00940106450dee7fd2fb8e39061434babcfc05599a6fb8202a9d2ba41aba912316d16742f259412b681898db00000007f5368cf7b9ce855960c0004bb0641100000000000000475b46c4aab2f8109f5b65fa97d1fd596e9d111d483f90f69cd8102a000000000004ac23d1c23b55042949007118742f53651043cbab06602299064d59deb60a745300000000000000004e3fe9b51a631e

This maps to the swap function of the UniswapV3Pool with the following ABI:

{
  "inputs": [
    {
      "internalType": "address",
      "name": "recipient",
      "type": "address"
    },
    {
      "internalType": "bool",
      "name": "zeroForOne",
      "type": "bool"
    },
    {
      "internalType": "int256",
      "name": "amountSpecified",
      "type": "int256"
    },
    {
      "internalType": "uint160",
      "name": "sqrtPriceLimitX96",
      "type": "uint160"
    },
    {
      "internalType": "bytes",
      "name": "data",
      "type": "bytes"
    }
  ],
  "name": "swap",
  "outputs": [
    {
      "internalType": "int256",
      "name": "amount0",
      "type": "int256"
    },
    {
      "internalType": "int256",
      "name": "amount1",
      "type": "int256"
    }
  ],
  "stateMutability": "nonpayable",
  "type": "function"
}

The transaction is successful and the corresponding Swap event is emitted.

However, trying to decode this data with headlong gives the following stack trace:

Caused by: java.lang.IllegalArgumentException: tuple index 4: not enough bytes remaining: 354 < 384
    at com.esaulpaugh.headlong.abi.TupleType.decodeException(TupleType.java:316)
    at com.esaulpaugh.headlong.abi.TupleType.decode(TupleType.java:246)
    at com.esaulpaugh.headlong.abi.TupleType.decode(TupleType.java:30)
    at com.esaulpaugh.headlong.abi.ABIType.decode(ABIType.java:176)
    at com.esaulpaugh.headlong.abi.Function.decodeCall(Function.java:231)

After some debugging it seems like the offset for the last bytes array is wrong. It is probably set incorrectly in the data? However, the transaction goes through, so I'm a little unsure whats the expected behaviour for this trace.

Let me know what you think!

Håkon

hawkaa commented 1 year ago

Yes. Forgot that we are on version 6.3.0 of the library. I upgraded to 9.1.1 and I have the issue on both versions.

esaulpaugh commented 1 year ago

This looks like the same padding issue in https://github.com/esaulpaugh/headlong/issues/53

If array padding is ignored, the result is:

[0xDA3A20aaD0C34fA742BD9813d45Bbf67c787aE0b, false, -704565077645023467284435745, 1461446703485210103287273052203988822378723970341, [0, 22, 51, 1, -18, 99, -5, 41, -8, 99, -14, 51, 59, -44, 70, 106, -53, 70, -51, -125, 35, -26, 32, -51, -99, -85, 94, 102, 109, -23, -128, -50, -51, -63, -128, -53, 49, -14, -106, 115, 62, 37, -121, 0, 0, 2, 70, -51, 117, 100, 90, -100, -91, 127, 123, -109, 32, 0, 75, -80, 100, 17, 0, 0, 0, 0, 0, 0, 0, -66, -21, -29, 77, 126, 96, -38, 24, -38, 58, 32, -86, -48, -61, 79, -89, 66, -67, -104, 19, -44, 91, -65, 103, -57, -121, -82, 11, 0, 0, 0, 0, 0, 0, 0, 0, -61, 67, 51, -7, -81, -100, 4, 0, 88, 0, 19, 47, 21, 38, -119, -89, 43, 10, 72, 75, 73, -93, 127, -56, 90, -122, 32, 6, -12, 19, -113, 27, 0, 0, 0, 0, 0, 0, 0, 10, 49, -84, -46, 80, 87, -17, 0, 0, 0, 0, 0, -31, 124, 25, -56, -18, -116, 88, 101, 109, 0, 53, 24, -106, 40, -93, -24, -128, -106, 11, -48, 127, 71, -67, -2, -71, 29, -30, 26, 76, -101, 67, 26, 0, 0, 0, 0, 0, 0, 0, 0, 13, -70, -84, 25, 37, 16, -2, 0, -108, 1, 6, 69, 13, -18, 127, -46, -5, -114, 57, 6, 20, 52, -70, -68, -4, 5, 89, -102, 111, -72, 32, 42, -99, 43, -92, 26, -70, -111, 35, 22, -47, 103, 66, -14, 89, 65, 43, 104, 24, -104, -37, 0, 0, 0, 7, -11, 54, -116, -9, -71, -50, -123, 89, 96, -64, 0, 75, -80, 100, 17, 0, 0, 0, 0, 0, 0, 0, 71, 91, 70, -60, -86, -78, -8, 16, -97, 91, 101, -6, -105, -47, -3, 89, 110, -99, 17, 29, 72, 63, -112, -10, -100, -40, 16, 42, 0, 0, 0, 0, 0, 4, -84, 35, -47, -62, 59, 85, 4, 41, 73, 0, 113, 24, 116, 47, 83, 101, 16, 67, -53, -85, 6, 96, 34, -103, 6, 77, 89, -34, -74, 10, 116, 83, 0, 0, 0, 0, 0, 0, 0, 0, 78, 63, -23, -75, 26, 99, 30]]

It may be that the input was encoded the old (wrong) way, according to Solidity v0.4.X and earlier i.e. pre-0.5).

One solution may be for headlong to provide a separate API which expects no array padding. But that is difficult or impossible to do without duplicating a lot of code. It would be up to the calling code to decide whether to invoke standard decoding or pre-Solidity 0.5 decoding.

What might be better is that it may be possible to write a utility method which will detect when padding is missing and output the data with the correct padding.

hawkaa commented 1 year ago

Hi @esaulpaugh ! Thank you, it is likely that the array padding is indeed the issue.

What might be better is that it may be possible to write a utility method which will detect when padding is missing and output the data with the correct padding.

I also think this could be a good idea. Do you think this will be straight forward? I guess you need the ABIType together with the bytes in order to perform this correction?

esaulpaugh commented 1 year ago

@hawkaa yes, the schema (type information) would be required. it would be difficult to do in the general case, but maybe it's possible if the data is passed as a byte[] (not a ByteBuffer) i.e. there is no ambiguity about where the input ends. I'll have to think about it.

esaulpaugh commented 1 year ago

I think the correct solution is to just for users to compile a version of headlong that expects no array padding.

This is the code that currently expects padding:

int mod = Integers.mod(len, UNIT_LENGTH_BYTES);
if(mod != 0) {
    byte[] padding = new byte[UNIT_LENGTH_BYTES - mod];
    bb.get(padding);
    for (byte b : padding) {
        if(b != Encoding.ZERO_BYTE) throw new IllegalArgumentException("malformed array: non-zero padding byte");
    }
}

hawkaa commented 1 year ago

Hi @esaulpaugh ! We're still struggling with the aformentioned issue. I know you said we'd have to compile our own solution, which would mean to maintain our own fork.

Before we go down that route, wanted to double check if you still think this is the right approach. Would be nice to explore options where we didn't need to maintain our own fork. I'm sure there must be others as well who experience this issue.

esaulpaugh commented 1 year ago

I have just had an idea for how to implement a flag on the level of the ABIType instance rather than the class or system.

Let me know if this approximates what you're looking for:

https://github.com/esaulpaugh/headlong/tree/legacy-flag

hawkaa commented 1 year ago

Hi @esaulpaugh and thanks a ton for looking into this issue again. We're interacting with your library only a few places places. The first set is in the creation of ABIObject, i.e. Function and Event, and the second set is decodeCall, decodeReturn, decodeTopics and decodeData.

If I read your code correctly, this would mean we pass in the flags when we create ABIObject via the fromJson function. Which to me makes perfect sense.

If we enable legacy mode on all ABIs, what would be the consequences you think? Will this library be able to decode everything as it used to, or will it be a whack-a-mole situation and we would have to know upfront whether we need legacy or not?

esaulpaugh commented 1 year ago

Setting the legacy array flag would be to assume that all byte arrays (including strings) are encoded without padding (i.e. apparently as in Solidity before Solidity v0.5). If the legacy array flag is not set then the decoder would expect that all byte arrays have the appropriate padding according to the current ABI specification.

I do not see any way to in all cases determine which convention is being used from the bytes alone (and if such an algorithm were to exist, it should be added to the official ABI specification). Therefore I believe that the only sensible thing to do is to enable the end user of headlong to set the appropriate flags based on the knowledge they have about their data.

One possible treatment is for headlong users to attempt decoding with the legacy flag unset, and if an error occurs, attempt decoding again with the legacy flag set (or vice versa). Even if this tactic may resolve many problematic cases without error, I cannot recommend it because I do not in any way believe that it could work for all cases. In particular, I believe that some encodings would be decodeable under multiple sets of flags without error and it would be up to the end user to determine, with limited information, which, if any, are correct.

As I have alluded to before, headlong is not in the business of applying heuristics, and so I leave that up to the end user if they so choose.

hawkaa commented 1 year ago

As I have alluded to before, headlong is not in the business of applying heuristics, and so I leave that up the end user if they so choose.

I 100% agree with this.

I'll see if we can flag these contracts somehow and set the flags appropriately. If not, I think what you suggest with trying to decode it first with strict mode, and then without padding, would improve our coverage. I agree that it might not work for all cases but I think it should be a good step in the right direction for us.

esaulpaugh commented 1 year ago

I have added a clarifying edit above:

In particular, I believe that some encodings would be decodeable under multiple sets of flags without error and it would be up to the end user to determine, with limited information, which, if any, are correct.

esaulpaugh commented 1 year ago

https://github.com/esaulpaugh/headlong/pull/59

hawkaa commented 1 year ago

@esaulpaugh I took the latest master branch for a spin and it looks great.

I'm doing FLAGS_NONE as before, and fall back to FLAGS_LEGACY_DECODE in case of an illegal argument which contains "not enough bytes remaining". For a certain day in Ethereum (December 25, 2022), I'm able to decode 3316350 traces as opposed to 3315086, which is not a lot but most likely the events we are missing.

Do you plan on releasing this soon?

esaulpaugh commented 1 year ago

Yes, soon. Unless there are any requests for last-minute changes to the flags API, I think the only thing left to decide is if I want to call this version 10.0.0.

hawkaa commented 1 year ago

I liked the new API. What we ended up with was really clean.

esaulpaugh commented 1 year ago

fixed https://github.com/esaulpaugh/headlong/releases/tag/v10.0.0

esaulpaugh / headlong

`tuple index 4: not enough bytes remaining: 354 < 384` #54