thautwarm / DianaVM

Diana... 🥳🥳🥳Diana, suki🤤🤤🤤
BSD 3-Clause "New" or "Revised" License

Bytecode specification #2

Open thautwarm opened 3 years ago

thautwarm commented 3 years ago

Bytecode specification: https://github.com/thautwarm/DianaScript/blob/master/src/Parser.cs#L161

encoding of code objects:

    string filename
    string name
    bool varg
    int narg
    int nlocal
    int nfree
    string[] strings
    DObj[] consts
    (int, int)[] locs
    int[] bc

encoding of ints, floats: 4 bytes each. encoding of strings: a 4-byte head gives the length N, and the remaining N * 2 bytes hold the string contents (UTF-16 encoding).
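A minimal sketch of the int/float/string encoding described above, in Python. The byte order is not stated in this thread, so little-endian is an assumption here, as are the signedness of the int and all helper names (`encode_int`, `encode_float`, `encode_string`, `decode_string`). N is taken to be the number of UTF-16 code units, which follows from the payload being N * 2 bytes.

```python
import struct
from typing import Tuple


def encode_int(value: int) -> bytes:
    # 4-byte signed int; little-endian is an assumption, not from the spec.
    return struct.pack("<i", value)


def encode_float(value: float) -> bytes:
    # 4-byte (single precision) float, same byte-order assumption.
    return struct.pack("<f", value)


def encode_string(text: str) -> bytes:
    # UTF-16 payload without a BOM; byte order is again an assumption.
    payload = text.encode("utf-16-le")
    n_units = len(payload) // 2  # N = number of UTF-16 code units
    return encode_int(n_units) + payload


def decode_string(data: bytes, offset: int = 0) -> Tuple[str, int]:
    # Returns the decoded string and the offset just past it.
    (n_units,) = struct.unpack_from("<i", data, offset)
    start = offset + 4
    end = start + n_units * 2
    return data[start:end].decode("utf-16-le"), end
```

Returning the end offset from `decode_string` lets a reader walk a buffer field by field, which fits the sequential layout of the code object fields.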

thautwarm commented 3 years ago

encoding of dict:

| keyobj1 | valueobj1 | ... | keyobjn | valueobjn |

thautwarm commented 3 years ago

bool and None have special encodings:

bool: true = 0b10000011, false = 0b10000010. None: 0b10000000
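The three special values can be sketched as single tag bytes. Only the bit patterns come from this thread; the assumption that each value occupies exactly one byte, and the names `TAG_NONE`, `TAG_FALSE`, `TAG_TRUE`, `encode_special` and `decode_special`, are hypothetical.

```python
# Bit patterns as given in the comment above.
TAG_NONE = 0b10000000
TAG_FALSE = 0b10000010
TAG_TRUE = 0b10000011


def encode_special(value) -> bytes:
    # Assumes each special value is written as exactly one byte.
    if value is None:
        return bytes([TAG_NONE])
    if value is True:
        return bytes([TAG_TRUE])
    if value is False:
        return bytes([TAG_FALSE])
    raise TypeError("not a special value: %r" % (value,))


def decode_special(tag: int):
    # Inverse mapping; any other tag is rejected.
    table = {TAG_NONE: None, TAG_FALSE: False, TAG_TRUE: True}
    if tag not in table:
        raise ValueError("not a special tag: %#x" % tag)
    return table[tag]
```

Note that true and false differ only in the lowest bit, so under this reading `tag & 1` recovers the bool value once the tag is known to be a bool.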

thautwarm commented 3 years ago

Bytecode encoding:

instrs without an arg: 1 byte. instrs with an arg: 1 byte for the instr, 4 bytes for the operand (int).

see details at https://github.com/thautwarm/DianaScript/blob/012a7eba2b267f7b63405a6ebf96dfe6c4fe6b93/src/Parser.cs#L129-L136
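As a rough illustration of the two instruction shapes, here is a small Python encoder. The opcode numbers below are made up for the example and are not DianaVM's real opcodes; the little-endian operand and the name `encode_instr` are also assumptions.

```python
import struct
from typing import Optional


def encode_instr(opcode: int, operand: Optional[int] = None) -> bytes:
    # 1 byte for the instruction itself.
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in one byte")
    head = bytes([opcode])
    if operand is None:
        return head  # no-arg instruction: 1 byte total
    # 4-byte int operand; byte order is an assumption.
    return head + struct.pack("<i", operand)


# Hypothetical opcode numbers, for illustration only.
HYPOTHETICAL_POP = 0x01
HYPOTHETICAL_LOAD_CONST = 0x02

program = encode_instr(HYPOTHETICAL_LOAD_CONST, 7) + encode_instr(HYPOTHETICAL_POP)
```

Here `program` is 6 bytes long: 5 for the instruction with an operand and 1 for the instruction without one.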

thautwarm commented 3 years ago

The specification has changed a lot. The new representation is a hybrid of bytecode and an optimised, flattened AST.

thautwarm commented 3 years ago

An AST is stored as a Ptr: https://github.com/thautwarm/DianaScript/blob/69133de17684c2bc9293a851fa0381406a5c7581/dianascript/code_cons.py#L78

It's not a native pointer, though; it's an (int8, int56) pair. The first 8 bits are the AST tag, which selects the storage array holding AST data of that specific type, and the remaining 56 bits give the position of the data within that storage.
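One way to picture the (int8, int56) pair is as a single 64-bit word. The sketch below assumes the tag sits in the high 8 bits and that both parts are unsigned; neither detail is confirmed in this thread, and `pack_ptr`, `unpack_ptr`, `resolve` and the sample storages are hypothetical.

```python
TAG_BITS = 8
INDEX_BITS = 56
INDEX_MASK = (1 << INDEX_BITS) - 1


def pack_ptr(tag: int, index: int) -> int:
    # Assumption: tag in the high 8 bits, index in the low 56 bits.
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag must fit in 8 bits")
    if not 0 <= index <= INDEX_MASK:
        raise ValueError("index must fit in 56 bits")
    return (tag << INDEX_BITS) | index


def unpack_ptr(ptr: int):
    # Split the word back into (tag, index).
    return ptr >> INDEX_BITS, ptr & INDEX_MASK


def resolve(ptr: int, storages):
    # The tag picks one storage array; the index picks an entry in it.
    tag, index = unpack_ptr(ptr)
    return storages[tag][index]


# Hypothetical storages: one list per AST node type.
storages = [["node-a0", "node-a1"], ["node-b0"]]
ptr = pack_ptr(1, 0)
```

With these sample storages, `resolve(ptr, storages)` looks up entry 0 of storage 1. Keeping one array per node type is what makes the AST "flattened": nodes refer to each other by small integers instead of object references.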