TRATTO

TRATTO stands for "TRAnsformer-based Token-by-Token Oracle generation"; "tratto" is also an Italian word meaning "line" or "way". TRATTO generates oracles one token at a time. At each step the grammar determines which tokens may legally come next, and a neural module ranks those candidates to guide the search toward the best oracle.
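The generation loop described above can be illustrated with a minimal sketch. It is not TRATTO's actual implementation: the toy grammar, the token names, and the `generate_oracle` and `toy_score` functions are all hypothetical, and a hand-written scoring function stands in for the neural module. The sketch also picks the single best candidate at each step (greedy search), which is an assumption; the real search strategy may differ.

```python
from typing import Callable, Dict, List

# Hypothetical toy grammar: maps the last emitted token to the tokens that
# may legally follow it. TRATTO's real grammar lives in oracle-grammar/.
TOY_GRAMMAR: Dict[str, List[str]] = {
    "<start>": ["methodResultID", "arg0"],
    "methodResultID": ["==", "!="],
    "arg0": ["==", "!="],
    "==": ["null", "0"],
    "!=": ["null", "0"],
    "null": [";"],
    "0": [";"],
}

# A scorer takes the tokens generated so far and one candidate token,
# and returns a number (higher means more likely to be correct).
Scorer = Callable[[List[str], str], float]


def generate_oracle(score: Scorer, max_tokens: int = 10) -> List[str]:
    """Build an oracle token by token, greedily following the scorer."""
    tokens: List[str] = []
    last = "<start>"
    for _ in range(max_tokens):
        # Grammar-directed step: only legal next tokens are considered.
        candidates = TOY_GRAMMAR.get(last, [])
        if not candidates:
            break
        # Model-guided step: keep the highest-scoring candidate.
        best = max(candidates, key=lambda tok: score(tokens, tok))
        tokens.append(best)
        if best == ";":  # ";" ends the oracle in this toy grammar
            break
        last = best
    return tokens


def toy_score(prefix: List[str], candidate: str) -> float:
    """Stand-in for the neural module: prefers a fixed set of tokens."""
    preferred = {"methodResultID", "!=", "null", ";"}
    return 1.0 if candidate in preferred else 0.0


print(" ".join(generate_oracle(toy_score)))  # methodResultID != null ;
```

Because the grammar filters the candidates before the scorer sees them, every generated sequence is syntactically valid by construction; the scorer only decides which valid continuation to prefer.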

Directory structure

  1. oracle-grammar/ is the folder containing the Xtext project for the grammar of the oracles generated by Jdoctor.
  2. ml-model/ contains all code related to the ML part of this project. This may include, for example, a program for pretraining a transformer on English text and source code, a program for fine-tuning the model on the tokens dataset, and a program for predicting an output label for a given input.
  3. tratto/ contains the main implementation of the project, including: (a) the program that creates the oracles dataset; (b) the program that derives the tokens dataset from the oracles dataset; (c) the programs that augment the oracles dataset; and (d) the end-to-end (E2E) program that generates oracles, generally referred to as TRATTO.
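To make the relationship between the oracles dataset and the tokens dataset more concrete, here is a rough sketch of one way such a transformation could work. The row layout (`prefix`, `next_token`), the whitespace tokenization, and the `oracle_to_token_rows` function are assumptions made for illustration only; the actual schema and tokenizer used in tratto/ may be quite different.

```python
from typing import Dict, List


def oracle_to_token_rows(oracle: str) -> List[Dict[str, str]]:
    """Expand one oracle into one row per token position.

    Each row pairs the tokens seen so far (the prefix) with the token
    that comes next, which is the kind of example a token-by-token
    model could be trained on. Tokens are split on whitespace here,
    which is a simplification.
    """
    tokens = oracle.split()
    rows = []
    for i, token in enumerate(tokens):
        rows.append({"prefix": " ".join(tokens[:i]), "next_token": token})
    return rows


# Hypothetical oracle, written with space-separated tokens.
rows = oracle_to_token_rows("methodResultID != null ;")
for row in rows:
    print(row)
```

Under these assumptions a single oracle with four tokens yields four rows, so the tokens dataset is larger than the oracles dataset it is derived from.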

Conventions