Consider changes to `bytes` types

mr-zwets commented 10 months ago

We should fix the length of bytes types when using split:

bytes2 firstTwoBytes = test.split(2)[0];

The above does not work, when it should always be possible to type the length of the first element of the tuple. It should only be possible to type the length of the second element when the length of the original bytes element is typed.

The typing when getting both elements of the tuple is also wrong:

contract Demo() {
    function example(bytes4 test) {
        bytes3 firstHalf , bytes8 secondHalf = test.split(2);
        require(firstHalf .length == 2);
        require(secondHalf.length == 2);
    }
}

firstHalf can only be bytes2 here, and secondHalf can also only be bytes2, because the original length of the bytes element is typed

rkalis commented 10 months ago

Mathieu and I were talking about this and we think it could make sense to disallow bytesX as function parameters, since these types are not runtime enforced. We could alternatively also add an option to enforce these types by injecting require(x.length === y) statements. This is something we should consider in more detail later.

rkalis commented 10 months ago

Good to note that there would be no way to runtime enforce constructor parameter types, because at "construction" time, no script gets executed because of the way P2SH works.

mr-zwets commented 3 weeks ago

I like the idea of disallowing bytesX as function parameters better than adding extra compiler checks without the user being informed. However other bytes types also have semantic meaning: pubkey sig and datasig but these don't make any exact claims about their size (I think).

The original issue was mainly focused how bytesX typing works incorrectly with .split()

mr-zwets commented 2 days ago

BytesX type casting

Disallowing bytesX types in function arguments would mean bytesX typecasting will be more common, so we'd need to improve our docs.

I had to reach out to find out how this works internally:

bytes to bytes casting is just semantic, not doing something like padding. That's intended, but might be good to document better.

the main places where specific byte lengths matter is for lockingBytecodes

// these two global function need bytesX inputs
new LockingBytecodeP2PKH(bytes20 pkh): bytes25
new LockingBytecodeP2SH32(bytes32 scriptHash): bytes35

mr-zwets commented 2 days ago

Rational for properly typed .split()

State kept in an nft commitment is often fixed length to be easy parsable (although the last item could be allowed to be variable length). On state transitions, this state is read, chopped up (split) and reconstructed.

Let's say my commitment is always 16 bytes consisting of two bytes8 items, only one of which was mutated. When getting the 1st item of .split() we should never need typcasting

bytes oldState = tx.inputs[0].nftCommitment;
// Typing the first item of .split()
bytes8 item1 = oldState.split(8)[0];
bytes8 newItem2 = bytes8(someIntegerValue);
bytes16 newCommitment = item1 + newItem2;

When getting the 2nditem of .split() we should don't need a typecast is the length of the original bytestring is known

bytes16 oldState = bytes16(tx.inputs[0].nftCommitment);
bytes8 newItem1 = bytes8(someIntegerValue);
// typing the second item of .split()
bytes8 item2 = oldState.split(8)[1];
bytes16 newCommitment = newItem1 + item2 ;

Currently we always need a bytesX typecast when using .split()

bytes oldState = tx.inputs[0].nftCommitment;
// Unnecessary bytesX typecast
bytes8 item1 = bytes8(oldState.split(8)[0]);
bytes8 newItem2 = bytes8(someIntegerValue);
bytes16 newCommitment = item1 + newItem2;

CashScript / cashscript

Consider changes to `bytes` types #178

BytesX type casting

Rational for properly typed .split()