Erikcruzk / TRT

The Transformative Repair Tool

Patch information retrieval and processing is missing (for the purpose of patch replacement and validation) #68

Closed mojtaba-eshghie closed 7 months ago

mojtaba-eshghie commented 7 months ago

Overview

We need to retrieve more specific information about the vulnerable chunk of code so that we can compare it with the patches returned by the LLM and replace it with them.

Detailed algorithm

  1. Identify Surrounding Units:

    • From the analysis results, determine the nearest enclosing units (contract, modifier, function) for each vulnerability.
    • Store details of these units, including their start and end line numbers in the source code.
  2. Apply Patches:

    • For each patch, determine the type of unit it applies to (contract, modifier, function).
    • Match the patch to one of the previously identified units based on its type and location.
    • Replace the original unit in the source code with the patched unit.
    • The rest follows the previous architecture for patch validation.
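
The two steps above could be sketched roughly as follows. This is only an illustration, not the code in TransformativeRepair.py: the `Unit` record, the helper names, and the regex used to guess a patch's unit type are all hypothetical, and matching by type and name is an assumed stand-in for "type and location". It also assumes the analysis results already provide 1-indexed, inclusive line ranges for each unit.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Unit:
    """Hypothetical record for an enclosing unit found in the analysis results."""
    kind: str   # "contract", "modifier", or "function"
    name: str
    start: int  # first line, 1-indexed, inclusive
    end: int    # last line, 1-indexed, inclusive


def nearest_enclosing_unit(units: List[Unit], line: int) -> Optional[Unit]:
    """Step 1: return the smallest unit whose line range contains `line`."""
    enclosing = [u for u in units if u.start <= line <= u.end]
    return min(enclosing, key=lambda u: u.end - u.start, default=None)


def patch_kind_and_name(patch: str) -> Optional[Tuple[str, str]]:
    """Step 2a: guess the unit type of a patch from its first declaration.

    Simplification: only contract, modifier, and function are recognised.
    """
    m = re.search(r"^\s*(contract|modifier|function)\s+(\w+)", patch, re.MULTILINE)
    return (m.group(1), m.group(2)) if m else None


def match_patch(units: List[Unit], patch: str) -> Optional[Unit]:
    """Step 2b: match a patch to a stored unit (assumed: same kind and name)."""
    found = patch_kind_and_name(patch)
    if found is None:
        return None
    return next((u for u in units if (u.kind, u.name) == found), None)


def replace_unit(source: str, unit: Unit, patched: str) -> str:
    """Step 2c: replace the unit's line range in `source` with the patched unit."""
    lines = source.splitlines()
    lines[unit.start - 1 : unit.end] = patched.splitlines()
    return "\n".join(lines) + "\n"


# Toy example with made-up contract code and line ranges.
SOURCE = """\
contract Bank {
    mapping(address => uint) balances;
    function withdraw() public {
        msg.sender.transfer(balances[msg.sender]);
        balances[msg.sender] = 0;
    }
}
"""
UNITS = [Unit("contract", "Bank", 1, 7), Unit("function", "withdraw", 3, 6)]
PATCH = """\
    function withdraw() public {
        uint amount = balances[msg.sender];
        balances[msg.sender] = 0;
        msg.sender.transfer(amount);
    }
"""

unit = nearest_enclosing_unit(UNITS, 4)   # vulnerability reported on line 4
target = match_patch(UNITS, PATCH)
patched_source = replace_unit(SOURCE, target, PATCH)
```

Picking the smallest enclosing range means a vulnerability inside a function maps to that function rather than to the whole contract. The patched source can then go through the existing validation path.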

Relevant Code Parts

https://github.com/Erikcruzk/TRT/blob/29ba59573787afafca4906962f2ce2c90ff57910/TransformativeRepair.py#L555

https://github.com/Erikcruzk/TRT/blob/29ba59573787afafca4906962f2ce2c90ff57910/PromptEngine.py#L76

mojtaba-eshghie commented 7 months ago

SmartBugs (SB) seems to have a problem processing the folder names (perhaps a breaking change in a recent SB release?), so I have temporarily disabled SB's job in the pipeline:

https://github.com/Erikcruzk/TRT/blob/99cd765b93087844cb7815ba71db6555ad47c6dd/TransformativeRepair.py#L721

Patch generation following the instructions in this issue works fine; I tested the pipeline with the following configuration:

# Configuration file for automatic program repair experiments

# Folder name for results
experiment_settings:
  experiment_name: "1-project" # Folder name for results
  delete_old_experiment_name: false
  llm_model_name: "gpt-4-0125-preview"
  vulnerable_contracts_directory: "sc_datasets/DAppSCAN_processed" #"smartbugs_reentrancy_short_no_comments_test" # Folder name for buggy smart contracts
  target_vulnerabilities:
    [reentrancy-benign, SOLIDITY_ARRAY_LENGTH_MANIPULATION]
  n_smartbugs_threads: 60
  n_repair_threads: 10 # n_repair_threads: 10
  # smartbugs_tools: [oyente, slither, confuzzius, conkas, honeybadger, maian, mythril, osiris, securify, sFuzz, solhint]
  smartbugs_tools: [slither, semgrep, smartcheck]
  smartbugs_timeout: 3600
  smartbugs_processes: 11
  patch_examples_directory: "sc_repair_examples"
  prompt_style: "flattened-src---function"
  shave: [comments, NatSpec, file_directives] # shaving configuration
  threshold: 200 # number of tokens that trigger the shaving
  preanalized: True # whether the smart contracts have already been analyzed by SmartBugs
  analysis_results_directory: "sc_datasets/1-project" # directory with the analysis results if preanalized is True

# LLM model and settings
llm_settings:
  gpt-4-0125-preview:
    model_name: "gpt-4-0125-preview"
    secret_api_key: "..."
    temperature: 0.9
    top_p: 0.3
    num_candidate_patches: 10 # num_candidate_patches: 10
    max_time: 3600
    stop: ["///"]

Subset of vulnerabilities used:

Coinfabrik-Polymath Core Audit.zip