graphprotocol / graph-node

Graph Node indexes data from blockchains such as Ethereum and serves it over GraphQL
https://thegraph.com
Apache License 2.0
2.89k stars 962 forks source link

Bug: BigInt.toString produces incorrect decimal string for large uint256 numbers #4029

Open JonathanAmenechi opened 1 year ago

JonathanAmenechi commented 1 year ago

Do you want to request a feature or report a bug? Bug

What is the current behavior? Calling BigInt.toString in graph-ts execution context produces an incorrect results for large numbers

If the current behavior is a bug, please provide the steps to reproduce and if possible a minimal demo of the problem.

Given this hex: 0x344435891c9a299443ad1ff6f97829fcf3aff343795f7eb86730d1d0f968cb54, various popular js libraries successfully convert the hex to the following decimal string: 23640783215816018536649662874651199072550547213372823297679392888119521495892

In the graph-ts execution context, BigInt.toString should return the same decimal string. However BigInt.fromByteArray(ByteArray.fromHexString(hex)).toString() returns the following: 38353673751782187045679142956196807757628143702081891553022150933500130575412, which is incorrect.

azf20 commented 1 year ago

Hi @JonathanAmenechi thanks for the report - @evaporei one thing that occurs to me is whether the conversion from hex to byte array to bigint will work as illustrated?

github-actions[bot] commented 1 year ago

Looks like this issue has been open for 6 months with no activity. Is it still relevant? If not, please remember to close it.

maoueh commented 1 year ago

This should be moved to https://github.com/graphprotocol/graph-tooling repo instead since the bug is there.

azf20 commented 1 year ago

Does seem like that may be the case @maoueh - @leoyvens wondering if you have any ideas at a quick glance?

leoyvens commented 1 year ago

The issue is probably big endian vs little endian, and the result would be correct for BigInt.fromByteArray(ByteArray.fromHexString(hex).reverse()). We should have a nicer way of doing this such as a BigInt.fromHex which assumes the hex string is big endian.

maoueh commented 1 year ago

0x344435891c9a299443ad1ff6f97829fcf3aff343795f7eb86730d1d0f968cb54 seems to be encoded in big endian, I'm using our Go tool to decode the hex to decimal (which expects the hex to be big endian) and it works:

to_dec 0x344435891c9a299443ad1ff6f97829fcf3aff343795f7eb86730d1d0f968cb54
23640783215816018536649662874651199072550547213372823297679392888119521495892

Now, I tried adding a unit test in graph-ts code but I realized the big_int_to_string is actually an intrinsics implemented in Rust, so I need to add a unit test in graph-node to see how it looks like.

maoueh commented 1 year ago

Ok so actually @leoyvens is right that is an endianess problem, but the expected input side for fromHexString is little endian and not big endian like mentioned in his comment. Hexadecimal number are in the majority of library interpreted as big-endian hence why the difference.

const bytes = ByteArray.fromHexString("344435891c9a299443ad1ff6f97829fcf3aff343795f7eb86730d1d0f968cb54").reverse()
const result = BigInt.fromByteArray(changetype<ByteArray>(bytes));

assert(typeConversion.bigIntToString(result) == "33640783215816018536649662874651199072550547213372823297679392888119521495892");

Pass as a unit test on graph-runtime-test.

The fromHexString documentation should be enhanced to discuss this common pitfall where hex number are decoded they were little endian which is "unexpected" but changing the current behavior might prove too big of a breaking change.

outdoteth commented 1 year ago

if you arrived here trying to convert an Address to a BigInt then this is how you can do it:

BigInt.fromUnsignedBytes(
      changetype<ByteArray>(
        ByteArray.fromHexString(
          fooBarAddress.toHexString()
        ).reverse()
      )
    )
richtera commented 4 months ago

Question to me is when is a little endian read of from a Bytes or ByteArray useful when reading data from chain? I don't recall any data being little endian except possibly the serialization of the U256 internal data which is even chunked into u64's, hmmm. Internally the U256 type has read and write functions for big and little endian into slices. Maybe there should just be a BigInt.fromUnsignedBytesBE() as an option. I added a decode_number to do that.