Tact 2.0 RFC - GitHub issues

anton-trunov commented 5 months ago

This is a very incomplete draft proposal for the next major version of Tact. Comments are most welcome.

Grammar

ignore ... (or something else) -- useful for parsing incomplete examples in docs, where the triple dots mean "some code skipped";
Do not allow using struct, message, contract, init, bounced, external, primitive, map, receive, and get (or whatever we change those to) as identifiers; make them keywords instead, e.g. let struct: Int = 0 will become illegal;
Either make the if statement always have parentheses around its condition, or drop parens for while, repeat, do-until;
Make ; a statement separator, not a statement terminator, but allow ; for the last statement in a block;
Commas (,) should separate fields in struct/message declarations to make it consistent with struct/message definitions;
Compact Int type ascriptions for contract storage variables: e.g. x : int32 instead of x : Int as int32;
Getter definitions should make it clear that getters are not accessible on-chain (see https://github.com/tact-lang/tact/issues/249#issuecomment-2053648197);
Internal and external message receivers syntax should be more consistent (see https://github.com/tact-lang/tact/issues/9#issuecomment-1999887650);
Writing to storage variables should look different from analogous operations on temporary (stack) variables, e.g. storage_var <- expression;
Contract storage variables should be accessed using a keyword other than self, for example, storage.var (we might even want to hint at it syntactically with storage { var : Type ... } declaration inside a contract);
More concise syntax for map expressions and operations: map literals ({1: "foo", 2: "bar"}), map access (map[key]), map updates (map[key] = val and map[key] += inc, etc.);
Capitalize map type identifier: Map<K, V>;
Equality comparisons via hashing should be explicit. We compare cells and slices via hashing but use == / != which hides this fact. Switching to something like ==# / !=# makes it more obvious;
Syntactic support for sending messages, e.g. instead of send(SendParameters{value: amount, to: self.owner, mode: mode}) we can have something like send {value: amount, to: self.owner, mode: mode};
If a send can actually deploy a contract, it should be evident in the call, something like deploy {...} (e.g. send wouldn't be able to deploy, only deploy could);
sender() in contract initializers (init) should be called deployer();
Remove multi-line /**/ comments in favor of semantic single-line comments, see: https://github.com/tact-lang/tact/issues/249#issuecomment-2102765900
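For reference, here is a minimal sketch of how several of the constructs listed above are written in Tact v1 today. The contract, field, and function names are made up for illustration, and the proposed v2 forms appear only in comments, copied from the bullet points; none of them is final syntax.

```tact
contract Example {
    owner: Address;
    counter: Int as int32;    // proposed: counter : int32
    balances: map<Int, Int>;  // proposed type name: Map<Int, Int>

    init(owner: Address) {
        self.owner = owner;
        self.counter = 0;
        self.balances = emptyMap();
    }

    receive("ping") {
        // v1 map updates and lookups go through methods;
        // proposed: map[key] = val and map[key].
        self.balances.set(1, 42);
        let found: Int? = self.balances.get(1);
        if (found != null) {
            // v1 writes to storage look like any other assignment;
            // proposed: something like storage_var <- expression.
            self.counter = found!!;
        }

        // v1 wraps the parameters in a SendParameters struct;
        // proposed: send {value: ..., to: ..., mode: ...}.
        send(SendParameters{
            to: self.owner,
            value: 0,
            mode: SendRemainingValue
        });
    }
}

// v1 compares cells by hash, but == does not show it;
// proposed: an explicit operator such as ==#.
fun sameCell(a: Cell, b: Cell): Bool {
    return a == b;
}
```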

Semantical changes

Disallow using null to mean "empty map" (actually we might want to rethink our design of optionals and remove null completely);
Make == / != only accept expressions of the same type: no more implicit type conversions allowing comparing Int and Int?;
Disallow initializing maps with emptyMap by default, in other words m: map<Int, Int>; without the corresponding entry in init should produce a compilation error;
Disallow map comparisons with == and !=: it should be evident that these are not trivial operations in terms of gas consumption;
Send mode should get its own type, not Int;
Fix SendPayGasSeparately naming, see #149;
Introduce the concept of namespace;
Imports should support namespaces;
Struct and (not yet implemented) enums should create their own namespace;
send should support sending StateInit directly without destructuring it into the code and data parts (this can, for instance, simplify the forward method from the Base trait);
Introduce the Time type and possibly TimeInterval type too, plus some primitives for working with absolute time and time intervals;
The now() builtin should return values of the type Time;
Think of a consistent design for contract interfaces, for instance Tact v1.x allows messages, empty receivers and strings to contribute to a contract's interface, thus making it harder to specify the interface in a standalone file (as a set of messages it supports);
Disallow struct fields with default value: everything should be explicit and if there is need to initialize struct fields with the same value for many instances, then it might be a design issue;
Introduce the concept of a reference to pass compound objects to functions without copying;
Module system for better management of imports, which also would help with avoiding name clashes;
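To make a few of these points concrete, the sketch below shows Tact v1 code that relies on the behaviors the list proposes to disallow or change. The struct and contract names are hypothetical, and the comments only restate the bullet points above.

```tact
struct Config {
    retries: Int = 3;  // default field value; the proposal would disallow this
}

contract Registry {
    entries: map<Int, Int>;  // v1 implicitly starts this out as an empty map

    init() {}  // no explicit entry for `entries`; the proposal makes this an error

    receive("check") {
        let a: Int = 1;
        let b: Int? = null;

        // v1 accepts comparing Int with Int?; the proposal requires same types.
        let same: Bool = a == b;

        // v1 lets null stand for "empty map"; the proposal would disallow this.
        self.entries = null;

        // v1 now() returns a plain Int; the proposal introduces a Time type.
        let started: Int = now();
    }
}
```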

FunC, Asm, support and interop

asm blocks support (see here)

Standard library, base trait, built-in functions

BaseTrait's methods forward, notify and reply are not very descriptive and should be renamed;
BaseTrait should be renamed and explicitly used;
We need intuitive names for the most common messages, e.g. returning the excessive amount of gas to the sender (see #358, for instance);

Tooling

Tool to help migrate Tact v1 projects to Tact v2

novusnota commented 5 months ago

Couple of ideas:

https://github.com/tact-lang/tact/issues/9#issuecomment-1999887650
Promote get attribute to be its own function type to make it clear that they're special:

contract Example {
    field: Int;

    // Like this (removed `fun` keyword here):
    get field(): Int {
        return self.field;
    }

    // Or maybe even like this, to make it clear that they're off-chain
    // (and would therefore be called "offchain functions" and not "getter functions"):
    off getField(): Int {
        return self.field;
    }
    offchain getField(): Int {
        return self.field;
    }
}

novusnota commented 5 months ago

Commas (,) should separate fields in struct/message declarations to make it consistent with struct/message definitions;

Not sure here, as this will also influence fields (persistent state variables) in contracts and traits — they use semicolons in their declarations as of now

anton-trunov commented 5 months ago

Not sure here, as this will also influence fields (persistent state variables) in contracts and traits — they use semicolons in their declarations as of now

It's fine, though. Syntactically it's included between a pair of curly braces, so there won't be any grammar conflicts. And there is a nice principle saying that declarations and definitions should resemble each other. And it makes it resemble TS more, which is what we target in terms of syntax

anton-trunov commented 5 months ago

Added a couple more bullet points:

Writing to storage variables should look different from analogous operations on temporary (stack) variables, e.g. storage_var <- expression;
Contract storage variables should be accessed using a keyword other than self, for example, storage.var (we might even want to hint at it syntactically with storage { var : Type ... } declaration inside a contract);

anton-trunov commented 5 months ago

More concise syntax for map expressions and operations: map literals ({1: "foo", 2: "bar"}), map access (map[key]), map updates (map[key] = val and map[key] += inc, etc.)

anton-trunov commented 5 months ago

Syntactic support for sending messages, e.g. instead of send(SendParameters{value: amount, to: self.owner, mode: mode}) we can have something like send {value: amount, to: self.owner, mode: mode};

anton-trunov commented 5 months ago

Disallow initializing maps with emptyMap by default, in other words m: map<Int, Int>; without the corresponding entry in init should produce a compilation error;

novusnota commented 5 months ago

Tool to help migrate Tact v1 projects to Tact v2

We can use ast-grep, which uses Tree-sitter to perform AST-based search and replace. Additionally, it can lint code and produce error reports that look much like Rust's, provided that we specify linting rules. This can be a starting point for our linters.

Apart from CLI, it also has a Node.js binding @ast-grep/napi with jQuery-like utility methods to traverse and manipulate syntax tree nodes

anton-trunov commented 5 months ago

@novusnota I was thinking we can have a tool that combines Tact v1 parser and Tact v2 code formatter: this way we can migrate contracts if we only have local syntactic changes. Plus we can do some more massaging of the original contract, like explicit map initializations.

novusnota commented 5 months ago

@anton-trunov Oh, so you want the code formatter to not only format, but also perform lint-like fixes? I thought of the formatter purely in terms of whitespace fixing, leaving the actual code replacements to tact lint --fix or similar

anton-trunov commented 5 months ago

Do not allow using struct, message, contract, init, bounced, external, primitive, map, receive, and get (or whatever we change those to) as identifiers; make them keywords instead, e.g. let struct: Int = 0 will become illegal;

anton-trunov commented 5 months ago

Oh, so you want the code formatter to not only format, but also perform lint-like fixes?

@novusnota Not at all, I was talking about creating a tool like tact-migrate-v1-to-v2 which would combine the parser for Tact v1 and the future code formatter for Tact v2, plus some more automatic migrations that are possibly not purely syntactical

0kenx commented 5 months ago

Disallow initializing maps with emptyMap by default, in other words m: map<Int, Int>; without the corresponding entry in init should produce a compilation error;

Why? I have multiple use cases where I need to declare an empty map without adding any entries in init (e.g. RBAC).

anton-trunov commented 5 months ago

Why? I have multiple use cases where I need to declare an empty map without adding any entries in init (eg. RBAC).

Semantically nothing will be changed here, the user will have to just explicitly state the initial value for map. This part should be migrated automatically
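As a rough illustration in Tact v1 terms, stating the initial value explicitly could look like the sketch below. The contract and field names are hypothetical, and the exact v2 spelling of an empty map has not been decided in this thread.

```tact
contract Roles {
    admins: map<Address, Bool>;

    init() {
        // The map still starts out empty; the only difference is that
        // the initial value is written down instead of being implied.
        self.admins = emptyMap();
    }
}
```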

0kenx commented 5 months ago

Any plans to revive this? https://github.com/tact-lang/docs-obsolete/blob/main/tact-design.md#interfaces

I'd love to see impl MyTrait for MyStruct { }

anton-trunov commented 5 months ago

Yep, we definitely need ad hoc polymorphism in Tact

novusnota commented 5 months ago

Not sure here, as this will also influence fields (persistent state variables) in contracts and traits — they use semicolons in their declarations as of now

It's fine, though. Syntactically it's included between a pair of curly braces, so there won't be any grammar conflicts. And there is a nice principle saying that declarations and definitions should resemble each other. And it makes it resemble TS more, which is what we target in terms of syntax

Well, as struct and message are essentially declaring a new type and we're trying to target TS in terms of syntax, we should also consider that in TS type declarations use semicolons (like we do) and not commas, see:

novusnota commented 5 months ago

Albeit it's just a suggestion — do we need a bounced<> type wrapper going forward? Maybe, having the bounced() receiver is clear enough?

novusnota commented 5 months ago

... as struct and message are essentially declaring a new type ...

Regarding struct vs. message: users of Tact may confuse message as a type definition with messages as the means of contract interaction on the blockchain.

I'm not sure if that's a valid issue (need to gather community feedback here), but it may be beneficial to merge struct and message definitions under the same name, say, struct or even tuple, to later make it closer to an advanced successor of TVM's tuples.

That way we could clarify things and make it simpler to read and write Tact code. And that would also alleviate the need to capitalize Messages/Structs in the docs to refer to the types and not to the communication itself :)

0kenx commented 5 months ago

I think the current definition of struct and message is clear enough and they offer clear separation of usage scenarios.

If we were to merge the two then what would become message(0x1234) M {}? Does struct(0x1234) M {} even make sense?
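For context, this is how the two declarations differ in Tact v1. It is a minimal sketch with made-up type names: a message carries a 32-bit opcode header (given explicitly here, auto-generated otherwise), while a struct has no header.

```tact
// A plain data type with no opcode header.
struct Point {
    x: Int;
    y: Int;
}

// A message type; 0x1234 becomes its opcode header when serialized.
message(0x1234) Move {
    target: Point;
}
```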

andreypfau commented 5 months ago

Equality comparisons via hashing should be explicit. We compare cells and slices via hashing but use == / != which hides this fact. Switching to something like ==# / !=# makes it more obvious;

Maybe it would be better to do it like Kotlin/JS/TS/PHP and use the === and !== operators? They also have ligatures in most monospace fonts

novusnota commented 4 months ago

Suggestion:

Get rid of multi-line /**/ comments. Instead, use variations of single-line comments //:

//! as the top-level comments describing the current file (may be omitted, I guess)
and /// (or regular //, for super-simplification!) as documentation comments for the lines that follow, as it's done in Zig, Rust, Solidity, and, to some extent, Dart.

Motivation:

Current multi-line doc-comments are, in fact, written as if they were multiple single-line comments, because every line is prefixed by *. It's a very redundant feature of JSDoc and JavaDoc, which removes the whole point (IMO) of having them over just a bunch of consecutive single-line ones.
It's much easier for users to type and maintain single-line // comments and their series, especially without the advanced help from the editor. More comments, more documentation, more clarity!
It's much easier for us to NOT have multi-line comments, as they're the only token in Tact which can span multiple lines and completely change the list of tokens and parsed AST of the rest of the file as soon as the beginning /* appears. Not having any multi-line tokens would significantly ease the lexing phase from the perspective of incremental updates (we could lex and re-lex by lines, essentially making way for lexing in parallel, like Zig does), and it would also improve the performance of the lexing and parsing phases! Blazing-fast AST generation for compiler and for tooling.

anton-trunov commented 4 months ago

Great suggestion! Looks like it's about time to move this issue to its own repo and continue discussions in separate issues there. Wdyt?

novusnota commented 4 months ago

Well, yeah, this can be moved to tact-lang/roadmap or to some more RFC-specific repo and then referenced from this issue

novusnota commented 4 months ago

We may try to add direct TVM assembly function wrappers, similar to how it can be done in FunC: RETALT example. This will remove the need to write them in FunC, and then import and wrap in a native function in Tact — we would just write assembly ourselves.

Perhaps, it's best to add another attribute to native functions in Tact, so that we have @asm("...") in addition to the existing @name(func_name_here):

@asm("RETALT")
native returnAlt();

// or without a string literal, similar to @name(...)

@asm(RETALT)
native returnAlt();

// and, perhaps, also allow replacing parentheses () with braces {}
// to write multi-line assembly right between them

@asm{
    <{
    }>CONT // c
    0 SETNUMARGS // c'
    2 PUSHINT // c' 2
    SWAP // 2 c'
    1 -1 SETCONTARGS
} native stackOverflow();

Syntax aspects are up to discussion though, not 100% sure how this should be arranged :)
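For comparison, the current Tact v1 route that this suggestion would replace looks roughly like the sketch below. The file name native.fc and the FunC function name return_alt are hypothetical; the FunC side is shown only in a comment.

```tact
// Contents of the hypothetical ./native.fc (FunC):
//   () return_alt() impure asm "RETALT";

import "./native.fc";

// Bind the FunC function to a Tact name via the existing @name attribute.
@name(return_alt)
native returnAlt();
```

With an @asm attribute as proposed, the FunC file and the import would no longer be needed.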

tact-lang / tact

Tact 2.0 RFC #249
