crytic / tealer

Static Analyzer for Teal
GNU Affero General Public License v3.0

Initial design for support of group-transactions analysis #165

Open S3v3ru5 opened 1 year ago

S3v3ru5 commented 1 year ago

First task of issue #83

Data Model

class Instruction:
    prev: List[Instruction]
    next: List[Instruction]
    line_num: int
    source_code_line: str
    comment: str
    comments_before_ins: List[str]
    tealer_comments: List[str]
    bb: Optional["BasicBlock"]
    supported_version: int
    supported_mode: Mode  # Mode.Application or Mode.LogicSig

class BasicBlock:
    instructions: List[Instruction]
    prev: List[BasicBlock]
    next: List[BasicBlock]
    idx: int
    subroutine: Optional["Subroutine"]
    tealer_comments: List[str]

class Subroutine:
    subroutine_name: str
    entry: BasicBlock
    basicblocks: List[BasicBlock]
    exit_blocks: List[BasicBlock]
    contract: Optional["Teal"] = None

class Function:
    cfg: List[BasicBlock]
    function_name: str
    contract: Optional["Teal"]

class Teal:
    contract_name: str
    version: int
    execution_mode: ContractType
    instructions: List[Instruction]
    basicblocks: List[BasicBlock]
    main: Subroutine
    subroutines: Dict[str, Subroutine]
    functions: Dict[str, Function]

class ContractType(Enum):
    LogicSig = "LogicSig"
    ApprovalProgram = "ApprovalProgram"
    ClearStateProgram = "ClearStateProgram"

The CFG of the contract is divided into subroutines. Every subroutine is an independent CFG: the blocks of one subroutine are not connected to the blocks of any other subroutine. In the CFG, a callsub instruction is connected to the instruction immediately following it (the return point/address of the called subroutine).
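
The callsub handling described above can be sketched as follows. This is only an illustration under simplifying assumptions, not Tealer's implementation: the `Ins` class and the `link_instructions` helper are hypothetical names, labels are modelled as pseudo-instructions with `op == "label"`, and only a handful of control-flow opcodes are handled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Ins:
    """Hypothetical, simplified stand-in for the Instruction class."""

    op: str
    label: Optional[str] = None  # own name for a "label" line, else the branch/call target
    next: List[int] = field(default_factory=list)  # indices of successor instructions


def link_instructions(program: List[Ins]) -> None:
    """Fill in successor edges without ever linking a caller to its callee."""
    labels: Dict[str, int] = {
        ins.label: i for i, ins in enumerate(program) if ins.op == "label"
    }
    for i, ins in enumerate(program):
        has_next = i + 1 < len(program)
        if ins.op == "b":
            ins.next = [labels[ins.label]]
        elif ins.op in ("bz", "bnz"):
            ins.next = [labels[ins.label]] + ([i + 1] if has_next else [])
        elif ins.op in ("retsub", "return", "err"):
            ins.next = []
        elif has_next:
            # Covers callsub too: the edge goes to the return point, not the subroutine.
            ins.next = [i + 1]


program = [
    Ins("callsub", "sub"),  # 0
    Ins("int"),             # 1: return point of the call
    Ins("return"),          # 2
    Ins("label", "sub"),    # 3: entry of subroutine "sub"
    Ins("retsub"),          # 4
]
link_instructions(program)
```

With these edges, instructions 0 to 2 and instructions 3 to 4 form two disconnected graphs, which matches the idea that every subroutine is an independent CFG.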

The contract also needs to be divided into "Functions/Operations". Functions are equivalent to ARC-4 methods. Every function should have its own CFG for analysis.

The CFG containing the basic blocks that are not part of any subroutine is referred to as the "contract-entry" CFG. Execution always starts at the entry block of this CFG. If we treat every subroutine as an external contract/program, then the "contract-entry" CFG can be considered the CFG of the entire contract.

If the method/function dispatcher used by the contract does not touch any of the subroutines, each function can be represented by a CFG that consists of basicblocks that are related to that particular function.

In common use cases, the method dispatcher is not part of any subroutine. The method dispatcher generated by PyTeal's Router class also does not touch the subroutines. Under this assumption, the "contract-entry" CFG can be divided into individual function CFGs, and functions can be analyzed independently.

A block might be part of multiple functions. In that case, every function should have its own copy of the block.
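
A minimal sketch of how a function CFG could be carved out of the "contract-entry" CFG, with every function receiving its own copy of shared blocks. The `Block` class and the `function_cfg` helper are hypothetical, blocks are identified by integer index, and the sketch assumes the dispatcher is acyclic and is never re-entered from function code.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Block:
    """Hypothetical, simplified stand-in for BasicBlock."""

    idx: int
    next: List[int] = field(default_factory=list)  # successor block indices


def function_cfg(cfg: Dict[int, Block], execution_path: List[int]) -> Dict[int, Block]:
    """Return private copies of the blocks that make up one function.

    The function consists of the dispatcher blocks on `execution_path` plus
    every block reachable from the last block of the path (the entry block).
    """
    entry = execution_path[-1]
    reachable = set()
    stack = [entry]
    while stack:
        idx = stack.pop()
        if idx in reachable:
            continue
        reachable.add(idx)
        stack.extend(cfg[idx].next)

    keep = set(execution_path) | reachable
    blocks = {idx: copy.deepcopy(cfg[idx]) for idx in keep}
    # Drop edges that leave the function.
    for block in blocks.values():
        block.next = [n for n in block.next if n in keep]
    # Dispatcher blocks only keep the edge that follows the execution path.
    for here, there in zip(execution_path, execution_path[1:]):
        blocks[here].next = [there]
    return blocks


# Block 0 dispatches to block 1 or block 2; both continue to shared block 3.
cfg = {0: Block(0, [1, 2]), 1: Block(1, [3]), 2: Block(2, [3]), 3: Block(3)}
f1 = function_cfg(cfg, [0, 1])
f2 = function_cfg(cfg, [0, 2])
```

Here block 3 belongs to both functions, and `f1[3]` and `f2[3]` are distinct objects, so annotations made while analyzing one function do not leak into the other.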

class Transaction:
    type: TransactionType
    logic_sig: Optional[Function]
    application: Optional[Function]
    logic_sig_group_context: Optional[Dict[Instruction, Transaction]]
    application_group_context: Optional[Dict[Instruction, Transaction]]

class TransactionType(Enum):
    Payment = "pay"
    AssetConfig = "acfg"
    AssetTransfer = "axfer"
    AssetFreeze = "afrz"
    KeyReg = "keyreg"
    ApplicationCall = "appl"

A transaction may involve execution of both a LogicSig and an Application.

The group-context information contains a mapping from every group-related instruction of a function to the corresponding Transaction object.

Group-related instructions are those that refer to a transaction in the group, such as gtxn and gtxns.

For example, the group-context of "gtxn 1 RekeyTo" would point to the Transaction object representing the transaction at index 1.

The Transaction object for an instruction is specified for each execution of the function.

A group of transactions may involve executing the same function of a program in multiple transactions, or a block of code may be executed for two different functions that are both part of the group. So the group-context should be provided per (Transaction, Function) pair, because a group-related instruction can refer to a different Transaction object in each execution.

At the same time, it is not possible to provide a Transaction object for instructions that are executed multiple times in a given execution. If a gtxns f instruction is used in a loop with different transaction indices, the Transaction object will be different for each iteration. These kinds of instructions are ignored by the analysis.
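
As an illustration of the group-context idea, the sketch below maps each `gtxn <index> <field>` line of a function to the transaction configured at that absolute index. It is a simplification under stated assumptions: `Txn` and `build_group_context` are hypothetical names, instructions are plain source strings, the mapping is keyed by line index rather than by Instruction object, and `gtxns` (whose index comes from the stack) is skipped entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Txn:
    """Hypothetical, simplified stand-in for the Transaction class."""

    tx_id: str
    absolute_index: Optional[int] = None


def build_group_context(instructions: List[str], group: List[Txn]) -> Dict[int, Txn]:
    """Map the line index of each `gtxn <i> <field>` to the transaction at index i."""
    by_index = {t.absolute_index: t for t in group if t.absolute_index is not None}
    context: Dict[int, Txn] = {}
    for line_num, source in enumerate(instructions):
        parts = source.split()
        # Only constant-index accesses can be resolved statically.
        if len(parts) == 3 and parts[0] == "gtxn" and parts[1].isdigit():
            txn = by_index.get(int(parts[1]))
            if txn is not None:
                context[line_num] = txn
    return context


group = [Txn("T1", absolute_index=0), Txn("T2", absolute_index=1)]
code = ["gtxn 1 RekeyTo", "global ZeroAddress", "==", "gtxns Amount"]
context = build_group_context(code, group)
```

Because the context is built per call, the same function code can be given a different mapping for each transaction in which it executes.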

class GroupConfig:
    transactions: List[Transaction]
    tealer: Optional[Tealer] = None

class Tealer:
    contracts: List[Teal]
    group_configs: List[GroupConfig]

Detector API

class AbstractDetector:

    def __init__(self, tealer: Tealer):
        pass

    def detect(self):
        pass

The detectors can be classified into two classes:

Type 1 detectors analyze individual contracts: they use Tealer.contracts to access each contract and run the analysis on it. Type 2 detectors analyze groups of transactions: they use Tealer.group_configs to access the transaction configs and run the analysis on each group.
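
The two detector classes could look roughly like this. The sketch uses stripped-down versions of `Teal`, `GroupConfig` and `Tealer`; the detector names `ContractDetector` and `GroupDetector` are hypothetical, and since the output format is still TBD, `detect` simply returns a list of strings as a placeholder.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Teal:
    contract_name: str


@dataclass
class GroupConfig:
    transactions: List[str]  # placeholder: transaction ids only


@dataclass
class Tealer:
    contracts: List[Teal] = field(default_factory=list)
    group_configs: List[GroupConfig] = field(default_factory=list)


class AbstractDetector:
    def __init__(self, tealer: Tealer):
        self.tealer = tealer

    def detect(self) -> List[str]:
        raise NotImplementedError


class ContractDetector(AbstractDetector):
    """Type 1: runs on every contract independently."""

    def detect(self) -> List[str]:
        return [f"checked contract {c.contract_name}" for c in self.tealer.contracts]


class GroupDetector(AbstractDetector):
    """Type 2: runs on every configured group of transactions."""

    def detect(self) -> List[str]:
        return [
            f"checked group of {len(g.transactions)} transactions"
            for g in self.tealer.group_configs
        ]


tealer = Tealer(
    contracts=[Teal("contract1"), Teal("contract2")],
    group_configs=[GroupConfig(["T1", "T2"])],
)
contract_results = ContractDetector(tealer).detect()
group_results = GroupDetector(tealer).detect()
```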

Output format:

TBD

S3v3ru5 commented 1 year ago

User Configuration

Contract = {
    /* Name of the contract, e.g. "pool". Every contract should have a unique name. */
    "name": string;
    /* Filesystem path of the contract (relative path) */
    "path": string;
    /* Type of the contract: one of LogicSig, ApprovalProgram or ClearStateProgram */
    "type": string;
    /* Contract's teal version */
    "version": int;
    /* Names of subroutines present in the contract */
    "subroutines": string[];
    /* Functions/User operations */
    "functions": Function[];
}
Function = {
    /* Execution path to reach the function's entry block.
        The execution path is part of the method dispatcher CFG.
        The execution path is an array of strings, for example ["B0", "B1", "B3", "B4"].
        The basic blocks "B0", "B1", "B3", "B4" are part of the method dispatcher. The code in these blocks checks the function identifier and routes to the function accordingly.
        The block "B4" is the start of the function code.
    */
    "execution_path": string[];
    /* Name of the operation/function. Used as the identifier in "group_configurations". Should be unique within a contract. */
    "function_name": string;
}
Transaction = {
    /* A unique id for this transaction. The id is only used to refer to this transaction from other transactions of the group configuration. Example: "T1" */
    "tx_id": string;
    /* Type of the transaction: one of "pay", "keyreg", "acfg", "axfer", "afrz", "appl" or "txn". "txn" can be used to represent any type of transaction. */
    "txn_type": string?;
    /* If the transaction is to be signed with a LogicSig, specify the contract name and the function name */
    "logic_sig": {
        "contract": string;
        "function": string;
    }?;
    /* If the transaction is an application call, specify the contract and the function being called */
    "application": {
        "contract": string;
        "function": string;
    }?;
    /* Transaction's index in the group. If the transaction MUST be present at a predefined index in the group and contracts in the group use an absolute index to access fields of this transaction, then specify that index in this field. For example, if the transaction is always the first transaction in the group, then "absolute_index" should be `0`.
    */
    "absolute_index": int?;
    /* Relative indexes of other transactions in the group, measured from this transaction. The relative indexes specified must be predefined and static: they should not depend on any runtime information, for example on application arguments. Relative indexes that do depend on runtime information are not completely supported, in the sense that if the contracts in this transaction perform validations on that transaction, Tealer will not be able to consider these validations when analyzing that transaction.
    */
    "relative_indexes": [
        {
            /* The id specified in the other transaction's "tx_id" field. Need a better field name */
            "other_tx_id": string;
            /* Relative index of the "other_tx_id" transaction from this transaction.
            For example, if the "other_tx_id" transaction must immediately precede this transaction, then the relative index is `-1`.
            The contract executed in this transaction will access the "other_tx_id" transaction using "(Txn.GroupIndex) - 1". */
            "relative_index": int;
        }?;
    ]?;
}
GroupConfig = Transaction[]

UserConfig = {
    "contracts": Contract[],
    "group_configurations": GroupConfig[],
}
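
To show how "absolute_index" and "relative_indexes" relate, here is a minimal sketch that derives the group position of every transaction that can be pinned down from the configuration. The helper name `resolve_indices` is hypothetical, the group is represented as plain dictionaries following the schema above, and conflicting constraints are not detected.

```python
from typing import Dict, List


def resolve_indices(group: List[dict]) -> Dict[str, int]:
    """Return tx_id -> group index for every transaction whose index is determined."""
    known: Dict[str, int] = {
        txn["tx_id"]: txn["absolute_index"]
        for txn in group
        if txn.get("absolute_index") is not None
    }
    changed = True
    while changed:  # propagate until no new index can be derived
        changed = False
        for txn in group:
            for rel in txn.get("relative_indexes", []):
                this, other = txn["tx_id"], rel["other_tx_id"]
                offset = rel["relative_index"]
                if this in known and other not in known:
                    known[other] = known[this] + offset
                    changed = True
                elif other in known and this not in known:
                    known[this] = known[other] - offset
                    changed = True
    return known


group = [
    {"tx_id": "T1", "absolute_index": 0},
    {
        "tx_id": "T2",
        "relative_indexes": [{"other_tx_id": "T1", "relative_index": -1}],
    },
]
indices = resolve_indices(group)
```

In this example T1 is fixed at index 0 and T2 expects T1 at relative index -1, so T2 must be at index 1.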

montyly commented 1 year ago

S3v3ru5 commented 1 year ago

Example config:

name: protocol name
contracts:
  - name: contract1
    path: contracts/contract1.teal
    type: LogicSig
    version: 6
    subroutines:
      - sub1
      - sub2
    functions:
      - name: function1
        execution_path: [B0, B1, B2]
        entry: B2
        exit: [B10, B11]
      - name: function2
        execution_path: [B0, B1, B3]
        entry: B3
        exit: [B12, B13]
  - name: contract2
    path: contracts/contract2.teal
    type: ApprovalProgram
    version: 6
    subroutines:
      - opt
      - delete
    functions:
      - name: init
        execution_path: [B0, B1]
        entry: B1
        exit: [B13]
      - name: clear
        execution_path: [B0, B2]
        entry: B2
        exit: [B13]
groups:
  - - txn_id: T1
      txn_type: pay
      logic_sig:
        contract: contract1
        function: function1
      absolute_index: 0
    - txn_id: T2
      txn_type: appl
      application:
        contract: contract2
        function: init
      absolute_index: 1
  - - txn_id: T1
      txn_type: axfer
    - txn_id: T2
      txn_type: appl
      logic_sig:
        contract: contract1
        function: function2
      application:
        contract: contract2
        function: clear
      relative_indexes:
        - other_txn_id: T1
          relative_index: -1
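
A configuration like the one above could be sanity-checked before analysis. The sketch below assumes the YAML has already been loaded into plain Python dictionaries and lists (for instance with a YAML parser), and uses the field names of this example ("name", "groups", "txn_id"). The helper `check_references` is hypothetical; it only verifies that every "logic_sig" and "application" entry refers to a declared contract and function.

```python
from typing import Dict, List, Set


def check_references(config: dict) -> List[str]:
    """Return one message per transaction that refers to an unknown contract/function."""
    functions: Dict[str, Set[str]] = {
        contract["name"]: {f["name"] for f in contract.get("functions", [])}
        for contract in config["contracts"]
    }
    errors: List[str] = []
    for g_idx, group in enumerate(config.get("groups", [])):
        for txn in group:
            for role in ("logic_sig", "application"):
                ref = txn.get(role)
                if ref is None:
                    continue
                where = f"group {g_idx}, {txn['txn_id']}"
                if ref["contract"] not in functions:
                    errors.append(f"{where}: unknown contract {ref['contract']}")
                elif ref["function"] not in functions[ref["contract"]]:
                    errors.append(
                        f"{where}: {ref['contract']} has no function {ref['function']}"
                    )
    return errors


config = {
    "contracts": [
        {"name": "contract1", "functions": [{"name": "function1"}]},
        {"name": "contract2", "functions": [{"name": "init"}]},
    ],
    "groups": [
        [
            {"txn_id": "T1", "logic_sig": {"contract": "contract1", "function": "function1"}},
            {"txn_id": "T2", "application": {"contract": "contract2", "function": "clear"}},
        ]
    ],
}
errors = check_references(config)
```

In this reduced config "contract2" declares only "init", so the reference to "clear" is reported.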