anoma / juvix

A language for intent-centric and declarative decentralised applications
https://docs.juvix.org
GNU General Public License v3.0
442 stars 54 forks source link

Add Bytes builtin and Bytes literal #2865

Open paulcadman opened 3 days ago

paulcadman commented 3 days ago

This issue proposes the introduction of a builtin Bytes type for use by Anoma and the Core evaluator that represents an array of bytes.

This is desirable for the Anoma backend for two main reasons.

  1. We currently represent arrays of bytes as a Nat in the Anoma backend (an integer atom). This is ambiguous, for example the byte arrays { 0x10 }, { 0x10, 0x00 }, { 0x10, 0x00, 0x00 } (and any byte arrays that differ only by trailing zeroed bytes) are all encoded to the same integer (Anoma uses little endian encoding).

  2. We'd prefer to use a more specific type (than Nat) for values that we want to interpret as bytes (in a Juvix program or on the Anoma stdlib APIs). It doesn't make sense for a user to perform arithmetic on a cryptographic signature for example.

In Anoma we have the following builtin:

axiom anomaSign : {A : Type} -> A -> Nat -> Nat; 

This signs a message using a private key, it returns a signed message.

In Juvix programs the private key is provided as a Nat literal, e.g:

privKey : Nat := 0x3181f891cd2323ffe802dd8f36f7e77cd072e3bd8f49884e8b38a297646351e9015535fa1125ec092c85758756d51bf29eed86a118942135c1657bf4cb5c6fc9;

The reason Nat is used is because these data are represented as atoms in the nockma backend.

In Juvix programs we would like a more precise type to represent public, private keys and signatures. In Anoma Node these cryptographic functions actually operate on binary atoms.

Bytes builtin Proposal

The builtin bytes type:

builtin bytes
axiom Bytes : Type;

The type of anomaSign would be:

anomaSign : {A : Type} -> Bytes -> Bytes

We need support for byte literals, the following syntax is proposed:

privKey : Bytes := Bytes#[0x31; 0x81; 0xf8; 0x91; 0xcd; 0x23; 0x23; 0xff; 0xe8; 0x2; 0xdd; 0x8f; 0x36; 0xf7; 0xe7; 0x7c; 0xd0; 0x72; 0xe3; 0xbd; 0x8f; 0x49; 0x88; 0x4e; 0x8b; 0x38; 0xa2; 0x97; 0x64; 0x63; 0x51; 0xe9; 0x1; 0x55; 0x35; 0xfa; 0x11; 0x25; 0xec; 0x9; 0x2c; 0x85; 0x75; 0x87; 0x56; 0xd5; 0x1b; 0xf2; 0x9e; 0xed; 0x86; 0xa1; 0x18; 0x94; 0x21; 0x35; 0xc1; 0x65; 0x7b; 0xf4; 0xcb; 0x5c; 0x6f; 0xc9]

Each element of the Bytes#[] list must be an integer literal (decimal or hex).

It's a compile-time error to use integer literals that don't fit into a byte:

-- error: 300 does not fit into one byte.
err : Bytes := Bytes#[300];

Backend Representation

Core

A new ConstantValue will be added ConstBytes !ByteString.

Nockma

Constant bytes will be transformed to a Nock cell with the head containing the length of the bytes and the tail containing the bytes payload, I.e the integer representing the bytes with little-endian byte ordering.

Calling Anoma stdlib functions in the nockma backend.

Anoma stdlib functions like sign, verify must be passed the the integer representing the bytes payload only (i.e without the length) for public, private key and signature arguments (these have a fixed length). In order to make these calls compatible with the proposed Bytes Nockma representation the payload (i.e the tail of the bytes cell) will be projected before calling the stdlib function.

The sign function returns a bytes payload representing a cryptographic signature. It is known to represent 64 bytes so the backend call will wrap the result in a cell with head value 64.

Native, Cairo

The design of the representation of Bytes is deferred.