cosmos / cosmos-sdk

:chains: A Framework for Building High Value Public Blockchains :sparkles:
https://cosmos.network/
Apache License 2.0
6.2k stars 3.59k forks source link

Proto representation of types.Dec unintuitive and inconsistent #10863

Open webmaster128 opened 2 years ago

webmaster128 commented 2 years ago

Right now, a decimal of type types.Dec is encoded in the protobuf API as string(int(decimal * 10**18)) or ascii(string(int(decimal * 10**18))). While this is straight forward and efficient once you know how decimals work, it's very unintuitive if you look at the output the API gives you, such as

147768741-08da8a14-f49c-4d80-a192-3406ca231fde

or

Uint8Array(18) [
  49, 51, 48, 48, 48, 50, 50,
  48, 51, 54, 55, 49, 53, 52,
  57, 54, 55, 51
]
// 313330303033343138373830313631333938 (hex)

In both cases the example values as well as the non-Go proto definitions do not provide hints that those are encoded decimals.

Then those decimals are sometimes encoded as strings, sometimes as bytes. E.g.

// CommissionRates defines the initial commission rates to be used for creating
// a validator.
message CommissionRates {
  option (gogoproto.equal)            = true;
  option (gogoproto.goproto_stringer) = false;

  // rate is the commission rate charged to delegators, as a fraction.
  string rate = 1 [(gogoproto.customtype) = "github.com/cosmos/cosmos-sdk/types.Dec", (gogoproto.nullable) = false];
  // max_rate defines the maximum commission rate which validator can ever charge, as a fraction.
  string max_rate = 2 [
    (gogoproto.moretags)   = "yaml:\"max_rate\"",
    (gogoproto.customtype) = "github.com/cosmos/cosmos-sdk/types.Dec",
    (gogoproto.nullable)   = false
  ];
  // max_change_rate defines the maximum daily increase of the validator commission, as a fraction.
  string max_change_rate = 3 [
    (gogoproto.moretags)   = "yaml:\"max_change_rate\"",
    (gogoproto.customtype) = "github.com/cosmos/cosmos-sdk/types.Dec",
    (gogoproto.nullable)   = false
  ];
}

and

// QueryInflationResponse is the response type for the Query/Inflation RPC
// method.
message QueryInflationResponse {
  // inflation is the current minting inflation value.
  bytes inflation = 1 [(gogoproto.customtype) = "github.com/cosmos/cosmos-sdk/types.Dec", (gogoproto.nullable) = false];
}

This hits all users that interact with the proto interface directly and do not use a higher level wrapper. Given the amount of deeply nested fields that would need wrapping, the need for custom module code generation and the desire to use reflection, this is a severe burden to DevX that will affect a lot of newcomers.

I think there needs to be a better strategy how to encode types.Dec at the protobuf API level. One option is using string representation with decimal points. The other option that avoids string parsing is a

message Dec {
  string whole = 1; // unlimited range, like Coin's amount
  int64 fractional = 2; // 18 digits after the decimal dot
}

similar to how Timestamp is defined.

aaronc commented 2 years ago

Oh no 😱. So I think this should basically classified as a bug. It appears that the proto JSON is to actually use a decimal place (i.e. 0.10000000000), but the proto binary doesn't have a decimal place and does this weird encoding you're describing. This is both against the proto3 spec and unintuitive. I had never caught this previously because I was always looking at proto JSON which encodes things more or less as you'd expect. The reason the proto binary and JSON differ I believe has to do with the way gogo proto works and that's just simply wrong.

So I'm not really sure what to do. Fixing this on master would be state breaking and require migrations, and also break clients. At the same time, the current encoding is just simply wrong...

We are planning to switch decimal implementations over to something that is GDA compliant, so an alternative would be to consider this Dec to be a purely legacy types and newer proto packages should use the new type with a correct string representation (or some struct like you're suggesting).

bytes should never be used with Dec afaik and QueryInflationResponse just appears to be wrong.

The struct approach may be a good way to go, but I'm not really sure it's preferable over a string. An alternative could be an int64 mantissa with an int32 exponent which should provide plenty of precision for most use cases.

webmaster128 commented 2 years ago

Thank you for your understanding and support on that issue, Aaron!

So I'm not really sure what to do. Fixing this on master would be state breaking and require migrations, and also break clients. At the same time, the current encoding is just simply wrong...

Same here

We are planning to switch decimal implementations over to something that is GDA compliant, so an alternative would be to consider this Dec to be a purely legacy types and newer proto packages should use the new type with a correct string representation (or some struct like you're suggesting).

Sounds plausible

The struct approach may be a good way to go, but I'm not really sure it's preferable over a string.

The string is more beginner friendly for sure. Also in many cases devs don't bother about the difference between decimals and floats. And it is probably okay for a lot of use cases such as displaying a rounded inflation rate.

I was kindof leaning towards the struct to create a hard type break that hurts at compile time rather than silently changing string content from one encoding to another. But there are probably better migration strategies.

An alternative could be an int64 mantissa with an int32 exponent which should provide plenty of precision for most use cases.

Going from fixed point (18 decimals) to floating point would heavily increase the complexity of the type, especially if you want to do math between types of different exponent. Not sure if that is needed. The feedback we got in CosmWasm is that everybody is happy with 18 decimal points fixed but uint128 is not sufficient for some Ethereum-related use cases. A uint256 integer with a fixed decimal point at 18 seems to make everyone happy.

webmaster128 commented 2 years ago

In https://github.com/cosmos/cosmjs/pull/969/files you see function decodeCosmosSdkDecFromProto(, which takes any of the two possible proto representations and decodes it. This can easily be changed to accept a 3rd encoding. So decoding many different representations is not that hard once you realized it is a Decimal.

aaronc commented 2 years ago

For the new SDK decimal type, my thinking was to allow users to choose what precision they do calculations with and store as an arbitrary precision string.

In what use cases do people actually need a decimal value that can't fit into a uint128? That would be a very big number and I'm not sure what real world number would be that large and also require 18 decimal digits.

One hacky solution could be to parse any decimal strings that contain a . as an actual decimal and parse integers by dividing by 10^18. That would at least be backwards compatible, but possibly more confusing...

webmaster128 commented 2 years ago

For the new SDK decimal type, my thinking was to allow users to choose what precision they do calculations with and store as an arbitrary precision string.

Okay, so I guess the important question is if you ever want to support math with mixed exponents. E.g. if you do Decimal { amount 123, exponent: 3 } + Decimal { amount 4567, exponent: 6 }, what exponent will the result have. Do you truncate information at the back? Do you risk overflows in the front? I'd only support math between Decimals of the same exponent value.

In what use cases do people actually need a decimal value that can't fit into a uint128? That would be a very big number and I'm not sure what real world number would be that large and also require 18 decimal digits.

Defi stuff I understand very little about. See e.g. Decimal256 in TerraSwap. Other sane community members like @marbar3778 confirmed Decimal256 is something that is needed.

One hacky solution could be to parse any decimal strings that contain a . as an actual decimal and parse integers by dividing by 10^18. That would at least be backwards compatible, but possibly more confusing...

You are taking about client code consuming those things? Yeah it would be nice if the new string representation was guaranteed to have a decimal point, such that an integer cannot be the old representation and a decimal with zero fractional part at the same time.

aaronc commented 2 years ago

For the new SDK decimal type, my thinking was to allow users to choose what precision they do calculations with and store as an arbitrary precision string.

Okay, so I guess the important question is if you ever want to support math with mixed exponents. E.g. if you do Decimal { amount 123, exponent: 3 } + Decimal { amount 4567, exponent: 6 }, what exponent will the result have. Do you truncate information at the back? Do you risk overflows in the front? I'd only support math between Decimals of the same exponent value.

Hmm... I think we would just do the normal thing that GDA specifies and work with different exponents and specify a rounding precision and flags. We should require that inputs adhere to a precision limit (we do this in the regen-ledger credits module using a GDA implementation), but intermediate calculations may mix exponents depending on how the decimal library represents things.

minxylynx commented 1 year ago

You are taking about client code consuming those things? Yeah it would be nice if the new string representation was guaranteed to have a decimal point, such that an integer cannot be the old representation and a decimal with zero fractional part at the same time.

This is super important for clients that support backwards compatibility - being able to maintain the difference between the old type and the new type.

amaury1093 commented 1 year ago

@tac0turtle @aaronc Is it realistic for 0.48 to do a SDK wide state-machine-breaking migration to update all decimals to a string representation with decimal point?

MSalopek commented 2 months ago

We're facing the same issue on the cosmos-hub - affected versions: v0.47.x, v0.50.x.

In our case the behaviour could lead to overspends and similar types of issues.

Here is our scenario:

The Hub now uses the feemarket module that specifies gas prices using DecCoins. The API/REST responses seem to be converting to the appropriate string representation but the grpc calls do not.

# RPC
gaiad q feemarket gas-price uatom --node "<RPC>" -o json | jq .
{
  "price": {
    "denom": "uatom",
    "amount": "0.005000000000000000"
  }
}
# grpc
grpcurl -plaintext cosmos-grpc.polkachu.com:14990 feemarket.feemarket.v1.Query.GasPrices
{
  "prices": [
    {
      "denom": "uatom",
      "amount": "5000000000000000"
    }
  ]
}

This could lead to unforseen issues when estimating gas as developers are not aware that they should convert the amount field.

The same issue exists in all places where github.com/cosmos/cosmos-sdk/types.DecCoins is used in the proto defintion - e.g.:


// FeePool is the global fee pool for distribution.
message FeePool {
  repeated cosmos.base.v1beta1.DecCoin community_pool = 1 [
    (gogoproto.nullable)     = false,
    (amino.dont_omitempty)   = true,
    (gogoproto.castrepeated) = "github.com/cosmos/cosmos-sdk/types.DecCoins"
  ];
}

grpcurl -plaintext <GRPC_URL> cosmos.distribution.v1beta1.Query.CommunityPool
{
  "pool": [
    ...
    {
      "denom": "uatom",
      "amount": "5632244209811957774374296404803"
    }
  ]
}

In this case, the RPC returns:

gaiad q distribution community-pool --node <RPC> -o json | jq .
{
  "pool": [
    {
      "denom": "uatom",
      "amount": "5632244209811.957774374296404803"
    }
  ]
}

The suggestion to divide by 10^18 and handle should work for int64 numbers.

Is there a solution to this?

julienrbrt commented 2 months ago

ref: https://github.com/cosmos/cosmos-sdk/pull/20803

webmaster128 commented 2 months ago

What you describe @MSalopek is expected given the issue in this ticket. The protobuf storage and API store those values as integers which is caused by the underlying decimal implementation. Changing this would not only break APIs but also all state where the data type is used.

The only possible solution I am aware of to create a different decimal type with a strictly specified string representation and gradually migrate from the old to the new one.

The Amino JSON encoding of this is a separate issue (https://github.com/cosmos/cosmos-sdk/issues/18546 and friends). In Amino JSON the dot was used to be added but than the JSON representation changed somehow.

The suggestion to divide by 10^18 and handle should work for int64 numbers.

What I did in CosmJS is string mutation as you know this is 18 digit fixed point and can easily exceed the 64bit range.

kocubinski commented 2 months ago

I did some digging, this difference in JSON wire format vs proto format has been in place since v0.45:

❯ curl -s https://cosmos-api.polkachu.com/cosmos/mint/v1beta1/params | jq
{
  "params": {
    "mint_denom": "uatom",
    "inflation_rate_change": "1.000000000000000000",
    "inflation_max": "0.100000000000000000",
    "inflation_min": "0.070000000000000000",
    "goal_bonded": "0.670000000000000000",
    "blocks_per_year": "4360000"
  }
}

❯ grpcurl -plaintext cosmos-grpc.polkachu.com:14990 cosmos.mint.v1beta1.Query.Params
{
  "params": {
    "mintDenom": "uatom",
    "inflationRateChange": "1000000000000000000",
    "inflationMax": "100000000000000000",
    "inflationMin": "70000000000000000",
    "goalBonded": "670000000000000000",
    "blocksPerYear": "4360000"
  }
}

The proto format (on disk and on network) does not contain a radix point and is omitted, the LegacyDec number format conventionally assumes base=10 exponent=-18 for its placement. Canonically we should call the number format returned by JSON RPC Amino ProtoJSON (a different ProtoJSON spec) since some transformation is being done on top of the format of the data at rest. If you're curious, the gateway interface is setup here which ultimately calls math.LegacyDec.String(). Naturally grpcurl, indeed any reasonable GRPC client, doesn't know anything about this number format so bytes are rendered literally.

Clearly this is super confusing. The only reasonable fix I can imagine from the SDK side would be to offer a feature flag which (if set) Marshals LegacyDec to bytes containing a radix point, and Unmarshals both formats identically. Functionally it would be lazy migration of decimals in state, and is state breaking / requires a coordinated chain upgrade, so off by default. It's a lot of machinery for a decimal point.

On the other hand, Amino JSON rendering could also just be dropped from JSON RPC but this would be pretty breaking in its own right, so I'm inclined to leave it as is. We're close to coming to consensus on a wire format for GDA based decimal type which we'll recommend new proto specs use.