Open charles-cooper opened 2 years ago
Related to reducing calldata cost: https://github.com/ethereum-optimism/optimistic-specs/issues/10
This ought to be an EIP itself. So much would have to change to support this, and it seems very risky for Vyper to adopt this without larger community buy-in.
> This ought to be an EIP itself. So much would have to change to support this, and it seems very risky for Vyper to adopt this without larger community buy-in.
That's a great point. I mean this is definitely in the idea phase and we would not want to undertake this without larger community buy-in. Would you be interested in helping us to draft and shepherd through the EIP process?
> Would you be interested in helping us to draft and shepherd through the EIP process?
Sure, absolutely! Also happy to reach out to Solidity folks. Would be nice if we could coordinate so languages don't end up having competing interface standards.
> Would you be interested in helping us to draft and shepherd through the EIP process?
> Also happy to reach out to Solidity folks.
Did you happen to connect with them on this?
So what are your current plans around this (since we had at least a very brief discussion about this at devconnect)? In general, the ABI should really be specified as a proper cross-language standard - the worst thing that could happen would be to fragment over this. So we should probably think about how best to organize this. In case you want to go ahead with this soon, we should probably try to schedule a call about it in the near future?
I'm also tagging some people from Fe: @g-r-a-n-t, @cburgdorf. But we should also probably think about whom else to reach out to.
I'm not in a particular rush! But @gnidan and I were talking about putting together an EIP for this.
Does this have an advantage over using RLP? ABI arguments and return values can be represented as RLP right now.
Tuples and Arrays are represented by RLPList and everything else as RLPString: https://github.com/esaulpaugh/headlong-cli#decode
For example:
java -jar headlong-cli-1.1-SNAPSHOT.jar -me "(function[2][][],bytes24,string[1][1],address[],uint72,(uint8),(int16)[2][][1],(int32)[],uint40,(int48)[],(uint),bool,string,bool[2],int24[],uint40[1])" "f4f3f298191c766e29a65787b7155dd05f41292438467db93420cade98191c766e29a65787b7155dd05f41292438467db93420cade98191c766e29a65787b7155dd05f41292438467db93420cadec2c17ad594ff00ee01dd02cc03cafebabe990688077708660989fdfffffffffffffe04c107c8c7c6c109c382fff5c8c111c584ffffffed85fca527923bcac17ec786ffffffffff82c10a01866661726f7574c20101c6031483fffffac584fffffffe"
Handling negative numbers (something traditional RLP can't address because it doesn't have a schema like ABI does) has proved tricky, but I believe I have it all working correctly and fairly well tested. I know RLP is no longer the most fashionable thing, but it achieves its original design goal of space efficiency respectably, and if it ain't broke...
In RLP, the original example (string, string) ("abcd", "efg") would be 0x846162636483656667, nine bytes.
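The nine-byte figure can be checked with a small sketch of RLP string encoding. The helper name `rlp_encode_bytes` is made up for illustration and is not taken from headlong. The figure corresponds to the two encoded items concatenated without an enclosing list prefix; wrapping them in an RLP list would add one more byte.

```python
def rlp_encode_bytes(data: bytes) -> bytes:
    """RLP-encode a single byte string (illustrative helper, strings only)."""
    if len(data) == 1 and data[0] < 0x80:
        return data  # a single byte below 0x80 is its own encoding
    if len(data) <= 55:
        return bytes([0x80 + len(data)]) + data  # short string: one prefix byte
    # Long string: prefix is 0xb7 + length-of-length, then the big-endian length.
    length = len(data).to_bytes((len(data).bit_length() + 7) // 8, "big")
    return bytes([0xB7 + len(length)]) + length + data


encoded = rlp_encode_bytes(b"abcd") + rlp_encode_bytes(b"efg")
print(encoded.hex())  # 846162636483656667
print(len(encoded))   # 9
```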
@charles-cooper I have a proof-of-concept https://github.com/esaulpaugh/abiv3
We should collaborate.
The admittedly limited feedback I've received from experienced Solidity contract developers indicates that gas usage is their first, last, and only concern with respect to calldata, and that manual bit level hacking will always be cheaper. There also appears to be a strong bias towards compatibility with existing contracts as opposed to compatibility among future contracts.
I have no idea what I'm doing python-wise but an attempt was made: https://github.com/esaulpaugh/abiv3/tree/master/python
Work in progress.
@gnidan @charles-cooper
I've got Java and Python prototypes set up to use an unsigned integer as the function selector instead of a hash, to save space. Integer arrays can also encode elements fixed-width or variable-width as desired, and both are equally valid.
Fixed-width enables constant-time random access to array elements, which is useful for very large arrays. It can also be more space-efficient in some cases, because values are padded only to the width of the widest element rather than to the width of the datatype, and elements don't require a per-element RLP prefix.
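As a rough illustration of the fixed-width idea, here is a sketch that pads every element to the width of the widest one and reads an element by computing its offset. The two-byte header (element width, then count) and the function names are assumptions made for this example. They are not the layout used in the abiv3 prototypes.

```python
def pack_fixed_width(values: list) -> bytes:
    """Pack unsigned ints, each padded to the byte width of the widest element."""
    width = max(((v.bit_length() + 7) // 8 for v in values), default=1)
    width = max(1, width)  # a value of 0 still occupies one byte
    body = b"".join(v.to_bytes(width, "big") for v in values)
    # Hypothetical header: one byte for the width, one for the count (max 255).
    return bytes([width, len(values)]) + body


def get_element(packed: bytes, index: int) -> int:
    """Constant-time random access: compute the offset instead of scanning."""
    width, count = packed[0], packed[1]
    if not 0 <= index < count:
        raise IndexError(index)
    start = 2 + index * width
    return int.from_bytes(packed[start:start + width], "big")


values = [1, 300, 70000]
packed = pack_fixed_width(values)
print(len(packed))             # 11: 2 header bytes + 3 elements * 3 bytes
print(get_element(packed, 2))  # 70000
```

With a 32-byte datatype such as uint256, the same three values would take 96 bytes if each were padded to the full datatype width.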
I'd be interested to know what y'all think and whether anyone wants to jump on a call about an EIP. I'm @esaulpaugh on telegram.
I plan on submitting an ABIv3 draft EIP soon, so if anyone wants to help author it, let me know. I'm also working on some reference code in Yul to demonstrate decoding in the EVM.
@gnidan @charles-cooper
one interesting thing i realized while chatting with @esaulpaugh is that "calldata" is comparatively cheaper for inter-contract calls. so it might be worth having a more compressed encoding for calldata from eoa-initiated txns and an unpacked encoding (which is easier to decode) for inter-contract calls.
> Does this have an advantage over using RLP? ABI arguments and return values can be represented as RLP right now.
Using RLP is pretty much a non-starter here as it is very inefficient from an encoding/decoding perspective. It is also not simple, which is important when we consider implementation correctness. For instance, to decode an RLP int, the pseudocode looks something like
int_byte = shr(248, calldataload(ptr))
int = int_byte
len = 1
if int_byte >= 0x80:
    n = sub(int_byte, 0x80)   // number of payload bytes
    len = add(len, n)
    int = shr(mul(8, sub(32, n)), calldataload(add(1, ptr)))   // keep the top n bytes
ptr = add(ptr, len)
this is already like a couple dozen instructions / 100 gas, as compared to the proposed alternatives of small/packed ints (shift and mask after calldataload) or varints with a length byte (two shifts and masks after calldataload). In other words, efficiency and simplicity of the encoder/decoder need to be considered as well.
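To make the comparison concrete, here is a rough Python model of the two decoders: the RLP integer decode from the pseudocode above and a varint with a length byte. It is only a sketch. The function names are made up, only short-form RLP strings are handled, and Python says nothing about actual EVM gas. It just shows that the RLP path needs a branch on the prefix byte while the length-prefixed path does not.

```python
def rlp_decode_uint(data: bytes, ptr: int):
    """Decode an RLP-encoded unsigned int; return (value, new_ptr)."""
    first = data[ptr]
    if first < 0x80:
        return first, ptr + 1  # a single byte below 0x80 is its own value
    if first > 0xB7:
        raise ValueError("long strings and lists are not handled in this sketch")
    n = first - 0x80  # number of payload bytes; n == 0 decodes to 0
    value = int.from_bytes(data[ptr + 1:ptr + 1 + n], "big")
    return value, ptr + 1 + n


def lenprefixed_encode_uint(value: int) -> bytes:
    """Encode as one length byte followed by the minimal big-endian bytes."""
    n = (value.bit_length() + 7) // 8
    return bytes([n]) + value.to_bytes(n, "big")


def lenprefixed_decode_uint(data: bytes, ptr: int):
    """Decode a length-byte varint; return (value, new_ptr). No branching."""
    n = data[ptr]
    value = int.from_bytes(data[ptr + 1:ptr + 1 + n], "big")
    return value, ptr + 1 + n


print(lenprefixed_encode_uint(0x0123).hex())                # 020123
print(lenprefixed_decode_uint(bytes.fromhex("020123"), 0))  # (291, 3)
print(rlp_decode_uint(bytes.fromhex("820123"), 0))          # (291, 3)
print(len(lenprefixed_encode_uint(2**256 - 1)))             # 33
```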
stub issue, some ideas for reducing calldata as EIP 4488 introduces a calldata limit. plus with rollups, transactions are getting more calldata heavy anyways.
[…] 0x00.
[…] (calldatasize - 4) % 32 == 0. (If calldata length is multiple of 32 + 4, add a trailing zero byte)
[…] () to represent this.
uint256 could be s32, and bytes could be d32.
0x0123 would be encoded as 0x02 + 0x0123. Worst case is 33 bytes to represent a uint256.
0x616263 instead of 0x6162630000000000000000000000000000000000000000000000000000000000
(string, string) ("abcd", "efg") is currently encoded as […]. Under this proposal, […].
This has some weird effects: for example, to encode efficiently you might need to evaluate bytestrings backwards. We could get most of the benefit just by allowing dynamic offsets and lengths of the current spec to be encoded using 16-bit integers and dropping the zero-padding requirement. This encoding uses 174 fewer bytes, which under current (London) gas costs is 716 gas(!). Under EIP 4488, the encoding would still use 522 less gas.
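For a sense of where the calldata cost goes, here is a sketch that builds the current ABI encoding of the `(string, string)` example (without a function selector) and prices it. It assumes London pricing of 4 gas per zero byte and 16 gas per non-zero byte, and the flat 3 gas per byte proposed in EIP 4488. The function names are made up for illustration, and the compact encoding from the proposal is not reproduced here.

```python
def abi_encode_two_strings(a: bytes, b: bytes) -> bytes:
    """Current ABI encoding of a (string, string) tuple, without a selector."""
    def word(n: int) -> bytes:
        return n.to_bytes(32, "big")

    def tail(s: bytes) -> bytes:
        padded_len = (len(s) + 31) // 32 * 32
        return word(len(s)) + s.ljust(padded_len, b"\x00")

    tail_a, tail_b = tail(a), tail(b)
    # Head: two offsets, each relative to the start of the tuple encoding.
    head = word(64) + word(64 + len(tail_a))
    return head + tail_a + tail_b


def calldata_gas_london(data: bytes) -> int:
    zeros = data.count(0)
    return 4 * zeros + 16 * (len(data) - zeros)


def calldata_gas_eip4488(data: bytes) -> int:
    return 3 * len(data)


encoded = abi_encode_two_strings(b"abcd", b"efg")
print(len(encoded))                   # 192
print(calldata_gas_london(encoded))   # 900
print(calldata_gas_eip4488(encoded))  # 576
```

Almost all of the 192 bytes are zero padding (181 zero bytes against 11 non-zero ones), which is why dropping the padding and shrinking offsets and lengths recovers most of the benefit. The 522 gas figure under EIP 4488 is 174 bytes times 3 gas per byte.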
References
EIP 4488
Copyright
Copyright and related rights waived via CC0