ton-blockchain / ton

Main TON monorepo

TLB Improvements #736

Open tvorogme opened 1 year ago

tvorogme commented 1 year ago

Improvements that would help development:

  1. Allow int / uint constraints (a more complex task would be to use Int256 in codegen and allow such values as NAT)
    a$_ a:uint32 b:uint32 { a <= 10 } = A;
  2. Allow named refs to field lists. Currently you can write test# a:^uint32 = Test; and test# _:^[ a:uint32 b:uint64 ] = Test; but not test# a:^[ a:uint32 b:uint64 ] = Test;
  3. Implement the else branch in E ? E [ : E ]
  4. Allow dependency on other tlb files
  5. Allow bit selection operator in constraints
  6. Allow bit selection for any type
  7. Disallow "ambitious" (out-of-range) bit selection, such as selecting bit 1000 of a 2-bit field:
    _ a:(## 2) b:(a . 1000)?(## 32) = Example;
  8. Add FixedCellChain X as a built-in type (a chain of X cells concatenated into one bitstring) and CellChain X (a chain of up to X cells). Sometimes we want codegen to handle data that spans several cells (text, on-chain images, ...)
  9. Add stricter rules for parametrized variables:

    _ a:(## 8) { b:# } { ~b = a + 10 }
    bits:(## b) = B_Calc_Example;

    Will generate:

    cb.store_ulong_rchk_bool(a, 8)
      && (b = a + 10) >= 0
      && cb.store_ulong_rchk_bool(bits, b);

    It would be better to avoid a cell overflow, so a good result could be:

    cb.store_ulong_rchk_bool(a, 8)
      && (b = a + 10) >= 0 
      && (1015 - b) >= 0
      && cb.store_ulong_rchk_bool(bits, b);

    (where 1015 is the number of bits remaining in the cell)

  10. Add a built-in CellBits type, meaning the rest of the cell's bits (without refs)
  11. Allow type negation
  12. Allow type checks
  13. Allow type tag checks. Example for items 11-13:

    simple$01 tmp:# = NFT;
    dimple$10 tmp:# = NFT;
    
    wallet$01 {wallet:Type} wallet:NFT = NFTWallet ~wallet;
    
    simpledimple$_ {wallet:Type} ok:(NFTWallet ~wallet)
                is_nft:((wallet is NFT)?(## 1))
                is_simple:((wallet is NFT.simple)?(## 1))
  14. Allow complex constants:
    _ {a:#} b:# c:# { ~a = (c * b) } = NegateSimplePlus;
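
The extra check proposed in item 9 can be illustrated without the TON libraries. The following is a minimal C++ sketch, not tlbc output: BitBudget and pack_b_calc_example are hypothetical names, and the stand-in builder only counts bits (assuming 1023 data bits per cell) instead of storing them.

```cpp
#include <cassert>

// Hypothetical stand-in for a cell builder: it only tracks how many data bits are used.
struct BitBudget {
  static constexpr int kMaxCellBits = 1023;  // data-bit capacity of one cell
  int used = 0;

  int remaining() const { return kMaxCellBits - used; }

  // Reserve n bits; fails instead of overflowing the cell.
  bool store_bits(int n) {
    if (n < 0 || n > remaining()) return false;
    used += n;
    return true;
  }
};

// Mirrors the shape of B_Calc_Example: a:(## 8), b = a + 10, bits:(## b).
bool pack_b_calc_example(BitBudget& cb, unsigned a) {
  assert(a < 256);  // a is declared as ## 8
  int b = 0;
  return cb.store_bits(8)
      && (b = static_cast<int>(a) + 10) >= 0
      && (cb.remaining() - b) >= 0  // explicit "enough bits left" check
      && cb.store_bits(b);
}
```

In an empty cell the check always passes, because b is at most 265. It only rejects the value when earlier fields have already used most of the cell.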

Some fixes can be made:

  1. Using large integers (## for example) with more than 64 bits in the C++ codegen can crash the program. You can generate and compile the code, but calling the generated pack / unpack methods crashes the program
  2. Fix Any in the C++ codegen (the C++ is generated, but does not compile)

    test$1 = Unvalid 0;
    _ (Unvalid Any) = UnvalidAny;
  3. Fix check_constraint_incomplete for E ? E [ : E ] (the C++ is generated, but does not compile)
    _ a:(## 10) b:(## 10) { a ? (b = 100) } = Flg;
  4. "Ambitious" (out-of-range) bit selection crashes the program
  5. Fix negation (crashes):
    _ n:(## 4) = Define ~n;
    _ {n_from_define:#} defined_val:(Define ~n_from_define) real_value:(## n_from_define) = Example;

tvorogme commented 1 year ago

Allowing up to UInt256 as the nat type in TLB would let us write:

_ {b:(## 256)} a:(#< b) = B b;
_ a:(## 256) _:(B a) = A;

Currently, you can only write:

_ {b:#} a:(#< b) = B b;
_ a:(## 256) _:(B a) = A;

This limits b to a plain int32. The scheme compiles, but serialization fails because of the int32 limit.

tvorogme commented 1 year ago

Using hex constructor tags longer than 32 bits produces buggy C++ codegen.

Example:

bool_false#000000032 = MyTag;
bool_true#000000064 = MyTag;

This TLB generates:

int MyTag::get_tag(const vm::CellSlice& cs) const {
  switch (cs.bselect(6, 3)) {
  case 0:
    return cs.bit_at(29) ? bool_true : bool_false;
  default:
    return -1;
  }
}

Which is not correct.
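
The report does not say which part of the generated get_tag is wrong, so the sketch below makes no claim about the fix. It only shows one way a 36-bit tag could be matched in full by reading it into a 64-bit integer. The names prefetch_uint, to_bits and get_tag_full are hypothetical, and a std::vector<bool> stands in for vm::CellSlice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum MyTagKind { invalid_tag = -1, bool_false = 0, bool_true = 1 };

// Helper for building test input: the low n bits of value, most significant first.
std::vector<bool> to_bits(uint64_t value, int n) {
  std::vector<bool> bits;
  for (int i = n - 1; i >= 0; --i) bits.push_back((value >> i) & 1u);
  return bits;
}

// Reads the first n bits (n <= 64) as a big-endian unsigned integer.
uint64_t prefetch_uint(const std::vector<bool>& bits, int n) {
  assert(n >= 0 && n <= 64 && bits.size() >= static_cast<std::size_t>(n));
  uint64_t v = 0;
  for (int i = 0; i < n; ++i) v = (v << 1) | (bits[i] ? 1u : 0u);
  return v;
}

// Compares the whole 36-bit prefix against both constructor tags.
int get_tag_full(const std::vector<bool>& bits) {
  constexpr std::size_t kTagBits = 36;  // 9 hex digits
  if (bits.size() < kTagBits) return invalid_tag;
  switch (prefetch_uint(bits, 36)) {
    case 0x000000032ULL: return bool_false;
    case 0x000000064ULL: return bool_true;
    default: return invalid_tag;
  }
}
```

Comparing the full prefix costs more than testing a single distinguishing bit, but it also rejects inputs that match neither tag.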

tvorogme commented 1 year ago

A chain of conditional fields converts the field type to a subslice, which is not very useful:

_ a:(## 32) b:(## 8) c:(## 1) d:(## 1) e:((a . b)?(c?(d?(## 64)))) = A;
_ a:(## 32) b:(## 8) c:(## 1) d:(## 1) e:(d?(## 64)) = B;

In type B, the e field is an int:

struct B::Record {
  typedef B type_class;
  int a;    // a : ## 32
  int b;    // b : ## 8
  bool c;   // c : ## 1
  bool d;   // d : ## 1
  int e;    // e : d?(## 64)
  Record() = default;
  Record(int _a, int _b, bool _c, bool _d, int _e) : a(_a), b(_b), c(_c), d(_d), e(_e) {}
};

But in type A, the e field is a subslice:

struct A::Record {
  typedef A type_class;
  int a;    // a : ## 32
  int b;    // b : ## 8
  bool c;   // c : ## 1
  bool d;   // d : ## 1
  Ref<CellSlice> e;     // e : a.b?(c?(d?(## 64)))
  Record() = default;
  Record(int _a, int _b, bool _c, bool _d, Ref<CellSlice> _e) : a(_a), b(_b), c(_c), d(_d), e(std::move(_e)) {}
};
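
For comparison, here is a sketch of how the nested condition in A could collapse into a single typed optional field instead of a subslice. This is only an illustration of the idea, not what tlbc generates: ARecord, e_is_present and make_a are hypothetical names, bit 0 of a is assumed to be the least significant bit, and an out-of-range bit index is assumed to mean "field absent".

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical record for type A with a typed optional e.
struct ARecord {
  uint32_t a;                  // a : ## 32
  uint8_t b;                   // b : ## 8
  bool c;                      // c : ## 1
  bool d;                      // d : ## 1
  std::optional<uint64_t> e;   // e : a.b?(c?(d?(## 64)))
};

// True when every condition in (a . b)?(c?(d?...)) holds.
bool e_is_present(uint32_t a, unsigned b, bool c, bool d) {
  if (b >= 32) return false;  // assumption: out-of-range bit index means absent
  return ((a >> b) & 1u) && c && d;
}

// Builds a record, keeping the payload only when the chained condition holds.
ARecord make_a(uint32_t a, uint8_t b, bool c, bool d, uint64_t payload) {
  ARecord r{a, b, c, d, std::nullopt};
  if (e_is_present(a, b, c, d)) r.e = payload;
  return r;
}
```

For example, make_a(4, 2, true, true, 7) keeps e because bit 2 of a is set, while the same call with d set to false leaves e empty.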

tvorogme commented 12 months ago

Sometimes we want to optimize loading of TLB structures by adding optional refs or bits while keeping the same tag. As an example, consider the DNS change record:

change_dns_record#4eb1f0f9 query_id:uint64 key:uint256 value:^DNSRecord = InternalMsgBody;

The real implementation loads the value ref optionally, but the TLB scheme does not reflect this.

According to @mr-tron, this can be implemented with a ? separator.

change_dns_record#4eb1f0f9 query_id:uint64 key:uint256 value:^DNSRecord? = InternalMsgBody;

All fields to the right of a ? separator must also carry a ? separator and must be optional to load.

So to have several optional values, you might write a scheme like this:

_ a:uint32 b:uint64? c:^MyObj? = Example;
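
The exact loading rules for ? fields are not specified in this thread, so the following C++ sketch rests on stated assumptions: an optional field is loaded only if data for it remains, leftover bits that cannot hold a whole uint64 make the body malformed, and c is present whenever an unread ref remains. BitReader, append_uint and unpack_example are hypothetical names, not TON APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical minimal reader: data bits plus a count of unread refs.
struct BitReader {
  std::vector<bool> bits;
  std::size_t pos = 0;
  int refs_left = 0;

  std::size_t remaining() const { return bits.size() - pos; }

  // Reads n bits (n <= 64) as a big-endian unsigned integer.
  bool fetch_uint(int n, uint64_t& out) {
    if (n < 0 || n > 64 || remaining() < static_cast<std::size_t>(n)) return false;
    out = 0;
    for (int i = 0; i < n; ++i) out = (out << 1) | (bits[pos++] ? 1u : 0u);
    return true;
  }

  bool fetch_ref() {
    if (refs_left == 0) return false;
    --refs_left;
    return true;
  }
};

// Helper for building input: appends the low n bits of value, most significant first.
void append_uint(std::vector<bool>& bits, uint64_t value, int n) {
  for (int i = n - 1; i >= 0; --i) bits.push_back((value >> i) & 1u);
}

struct Example {
  uint32_t a = 0;
  std::optional<uint64_t> b;  // b:uint64?
  bool has_c = false;         // c:^MyObj? (the ref content is not modeled)
};

// Sketch of unpacking `_ a:uint32 b:uint64? c:^MyObj? = Example;`.
bool unpack_example(BitReader& cs, Example& out) {
  uint64_t v = 0;
  if (!cs.fetch_uint(32, v)) return false;  // a is mandatory
  out.a = static_cast<uint32_t>(v);
  if (cs.remaining() > 0) {                 // b is loaded only if bits remain
    if (!cs.fetch_uint(64, v)) return false;  // partial b: malformed
    out.b = v;
  }
  out.has_c = cs.fetch_ref();               // c is loaded only if a ref remains
  return true;
}
```

Under these assumptions a body with only a, with a and b, or with all three fields unpacks successfully under the same tag.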

mr-tron commented 12 months ago

My original proposal was to add a single ? to the declaration, like change_dns_record#4eb1f0f9 query_id:uint64 key:uint256 ? value:^DNSRecord = InternalMsgBody;, so that all fields to its right are optional. But this may not be good for the language. I am not a specialist in programming language theory.