Update Clarity types to support string-ascii and string-utf8

stacks-archive / stacks-transactions-js

The JavaScript library for generating Stacks 2.0 transactions

Update Clarity types to support string-ascii and string-utf8 #110

Closed. yknl closed this issue 4 years ago

yknl commented 4 years ago

Two additional Clarity value types need to be added to support the change on the blockchain. This is a breaking change to the existing wire format of transactions.

Related PR: https://github.com/blockstack/stacks-blockchain/pull/1779

njordhov commented 4 years ago

Since ASCII text is valid UTF-8, the two can be combined into a single value type.

lgalabru commented 4 years ago

Correct, but there is a 4x cost savings when using the ascii class.

njordhov commented 4 years ago

> there is a 4x cost savings when using the ascii class.

There are compression schemes like BOCU that can be used for efficient storage of Unicode code points.

When a character set has been declared, as proposed in https://github.com/clarity-lang/reference/issues/19, the encoding can be naively compressed so that its size is bounded by the bytes required to encode the code points in that set, which avoids pessimistically allocating up to 4 bytes per Unicode character.
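A minimal sketch of that naive bound, under one reading of the proposal: if the declared set has a known highest code point, the worst case per character is the UTF-8 width of that code point rather than 4 bytes. The function names `utf8Width` and `worstCaseBytes` are hypothetical and not part of any library or of the linked proposal.

```typescript
// UTF-8 byte width of a single code point (1 to 4 bytes).
function utf8Width(codePoint: number): number {
  if (codePoint <= 0x7f) return 1;   // ASCII range
  if (codePoint <= 0x7ff) return 2;  // e.g. Latin-1 Supplement, Greek, Cyrillic
  if (codePoint <= 0xffff) return 3; // rest of the Basic Multilingual Plane
  return 4;                          // supplementary planes
}

// Assumption: storage is reserved for the worst case, i.e. every character
// is as wide as the widest code point allowed by the declared set.
function worstCaseBytes(maxChars: number, highestCodePoint: number): number {
  return maxChars * utf8Width(highestCodePoint);
}

const maxChars = 40;
console.log(worstCaseBytes(maxChars, 0x7f));     // set limited to ASCII
console.log(worstCaseBytes(maxChars, 0xff));     // set limited to Latin-1
console.log(worstCaseBytes(maxChars, 0x10ffff)); // unrestricted Unicode
```

With an ASCII-only set the bound matches the ascii type, and it only reaches 4 bytes per character when the declared set includes supplementary-plane code points.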

More efficient compression schemes can also be applied. Perhaps there is an opening to devise a novel scheme tailored to the particular requirements of on-chain storage, including minimizing the pessimistic estimate.

njordhov commented 4 years ago

The wire format for Unicode text should declare the expected character set in addition to the encoding. See clarity-lang/reference#19 for motivation.

njordhov commented 4 years ago

> This is a breaking change to the existing wire format of transactions.

Why does this have to be a breaking change?

yknl commented 4 years ago

> > This is a breaking change to the existing wire format of transactions.
>
> Why does this have to be a breaking change?

It is a breaking change in the wire format. As in it will cause unexpected behaviour in apps that try to decode the transactions due to the addition of 2 new types.

njordhov commented 4 years ago

> As in it will cause unexpected behaviour in apps

@yknl Why does adding the two new types have to cause unexpected behavior in apps? Does it perhaps have to do with how the data is encoded in the wire format?

yknl commented 4 years ago

Clarity values are serialized with a 1-byte type ID prefix: https://github.com/blockstack/stacks-blockchain/blob/master/sip/sip-005-blocks-and-transactions.md#clarity-value-representation
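To illustrate why a 1-byte type ID prefix makes the addition breaking, here is a rough sketch of a decoder that dispatches on that first byte. It is not the library's implementation: the function `readTypeName` and the type name tables are hypothetical, and the ID assignments (0x00 through 0x0c for the existing types, 0x0d and 0x0e for the two string types) and the example payload layout are assumptions based on a reading of SIP-005 that should be checked against the spec.

```typescript
// Type names indexed by their 1-byte type ID (assumed ordering, see SIP-005).
const LEGACY_TYPES: string[] = [
  "int", "uint", "buffer", "bool-true", "bool-false",
  "standard-principal", "contract-principal",
  "response-ok", "response-err",
  "optional-none", "optional-some", "list", "tuple",
];

// A decoder updated for the change also knows IDs 0x0d and 0x0e.
const UPDATED_TYPES: string[] = LEGACY_TYPES.concat(["string-ascii", "string-utf8"]);

// Hypothetical helper: look at the type ID prefix and report the type name.
function readTypeName(bytes: Uint8Array, known: string[]): string {
  const typeId = bytes[0];
  if (typeId === undefined) {
    throw new Error("Cannot decode an empty byte sequence");
  }
  const name = known[typeId];
  if (name === undefined) {
    throw new Error("Unknown Clarity type ID: 0x" + typeId.toString(16));
  }
  return name;
}

// Assumed encoding of the string-ascii value "hi":
// type ID 0x0d, 4-byte big-endian length, then the ASCII bytes.
const wire = new Uint8Array([0x0d, 0x00, 0x00, 0x00, 0x02, 0x68, 0x69]);

console.log(readTypeName(wire, UPDATED_TYPES));

try {
  readTypeName(wire, LEGACY_TYPES);
} catch (err) {
  // A decoder built before the change has no entry for 0x0d and must reject the value.
  console.log((err as Error).message);
}
```

A decoder that predates the change cannot skip the unknown value either, since it has no way to know how many bytes follow the unrecognized type ID.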