witnet / witnet-rust

Open source Rust implementation of the Witnet decentralized oracle protocol, including full node and wallet backend 👁️🦀
https://docs.witnet.io
GNU General Public License v3.0
179 stars 56 forks source link

Operator parseJSONMap fails on JSON objects containing long integer values #2310

Open guidiaz opened 1 year ago

guidiaz commented 1 year ago

For instance: {"data": 122637770461000000000}

tmpolaczyk commented 1 year ago

Alright, so this issue seems to be because of CBOR limitations. When executing an operator like StringParseJsonMap, the data is converted first from JSON to CBOR, and then from CBOR to RadonType:

JSON -> CBOR -> RadonType

A RadonInteger has a limit of -2^127 to 2^127 - 1, but CBOR has a lower limit for the integer size, from -2^64 to 2^64 - 1. So the conversion results in an error:

Encode { from: "cbor::value::Value", to: "RadonMap" }

That issue can be fixed by skipping the intermediate CBOR conversion, but then a bigger problem arises: RadonTypes are serialized as CBOR when used in the network protocol. Therefore, as soon as a RadonInteger reaches 2^64, it will be an error to serialize it. For example, this is an error that is seen in the node when a data request uses the IntegerMultiply operator to convert an input of [1] into an integer greater than 2^64 (1 * 1000000000000000 * 100000):

Couldn't decode tally value from bytes: Failed to encode RadonInteger into Vec<u8>

(the error says tally value but this is the result of the aggregation phase)

In that case the data request resolves with 0 commits, because it is not possible to serialize the result. Sheikah doesn't detect that error when using "try data request", so I guess the toolkit will not notice this error as well.

So possible solutions for this general integer serialization issue:

I believe all of them are breaking changes.

guidiaz commented 1 year ago

On StringParseJsonArray, StringParseJsonMap and StringParseXmlMap, I would suggest to:

guidiaz commented 1 year ago

In that case the data request resolves with 0 commits, because it is not possible to serialize the result. Sheikah doesn't detect that error when using "try data request", so I guess the toolkit will not notice this error as well.

@tmpolaczyk Shouldn't the node commit something like an "aggregation overflow error" instead?

As a matter of fact, there's a specific enum value for this situation defined in the Witnet.sol library.

tmpolaczyk commented 1 year ago

@tmpolaczyk Shouldn't the node commit something like an "aggregation overflow error" instead?

As a matter of fact, there's a specific enum value for this situation defined in the Witnet.sol library.

Yeah, it should, but this error has never happened before so the current behavior is to not commit anything. Basically here instead of returning an err we should return an ok with serialization of a dedicated radon error like RadError::Encode:

https://github.com/witnet/witnet-rust/blob/2fa9a2fdae4501967d739d9884aaaef1882b02da/node/src/actors/chain_manager/mining.rs#L592

guidiaz commented 1 year ago

Would it make sense to rephrase the log error message to "Couldn't encode value from < whatever > into < RadonType >: {}" ?

tmpolaczyk commented 1 year ago

Would it make sense to rephrase the log error message to "Couldn't encode value from < whatever > to < RadonType >: {}" ?

Yes, that log is wrong.

tmpolaczyk commented 1 year ago

Opened #2312 to track the issue of nodes not commiting anything in case of encode error.

tmpolaczyk commented 1 year ago

The CBOR RFC seems to indicate that big integers can be supported by representing them as bytes or strings:

https://www.rfc-editor.org/rfc/rfc8949.html#name-bignums

But not sure if this solution works well for our case, because:

aesedepece commented 1 year ago
* Negative integers are supposed to be represented by strings, but not all integers are valid utf8 strings. The two CBOR libraries that we use (cbor and serde_cbor) do not allow non-utf8 strings. So we would need to use a different CBOR library that allows using non-utf8 strings.

Now I'm curious of which integer numbers cannot be formatted as strings :thinking:

tmpolaczyk commented 1 year ago

Now I'm curious of which integer numbers cannot be formatted as strings thinking

I guess most of them, for example -100000000000000000000 is encoded as

C3                       # tag(3)
   49                    # bytes(9)
      056BC75E2D630FFFFF # "\u0005k\xC7^-c\u000F\xFF\xFF"

And "\u0005k\xC7^-c\u000F\xFF\xFF" is not valid UTF8, so we wouldn't be able to parse it with the current libraries.