Open faysalhossain2007 opened 2 years ago
@faysalhossain2007 were you able to run the model for training from scratch?
@nashid I used their trained model to test on their dataset.
@faysalhossain2007 please check data/data/prepare_testing_data.py, which is the script to prepare test input for new data.
If your ground truth inserts a new statement, you could use the line after the insertion as the original buggy line, and expect the model to output patches that consist of the new statement followed by the buggy line. But CURE's performance on such insertion bugs is not as good, since such cases are very rare in our training data. We are building other tools to address this limitation.
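A minimal sketch of the framing described above, in case it helps. The helper name `frame_insertion_fix` and the choice to join the two statements with a single space are assumptions made for illustration; they are not part of CURE, and the exact format should be checked against what prepare_testing_data.py expects.

```python
def frame_insertion_fix(fixed_lines, inserted_index):
    """Turn an insertion-only fix into a (buggy line, expected patch) pair.

    fixed_lines    -- lines of the fixed code (hypothetical input format)
    inserted_index -- index in fixed_lines of the newly inserted statement
    """
    if inserted_index + 1 >= len(fixed_lines):
        # No following line exists to act as the "buggy" line.
        raise ValueError("inserted statement has no following line")

    inserted = fixed_lines[inserted_index].strip()
    following = fixed_lines[inserted_index + 1].strip()

    # The line after the insertion plays the role of the buggy line.
    buggy_line = following
    # Expected patch: the new statement followed by the unchanged buggy line.
    # Joining with one space is an assumption, not CURE's documented format.
    expected_patch = inserted + " " + following
    return buggy_line, expected_patch


fixed = ["int x = read();", "if (x < 0) return;", "use(x);"]
buggy, patch = frame_insertion_fix(fixed, 1)
print(buggy)
print(patch)
```

Here the inserted `if` statement has no counterpart in the buggy code, so `use(x);` is treated as the buggy line and the patch is expected to reproduce it after the new statement.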
I am trying to test your model on our dataset. I am able to generate the training_bpe.txt and training_tokenize_sard.txt files, but I am facing some issues while generating the identifier.tokens and identifier.txt files. Can you please share the script for generating those files?
Also, my data contains some patches that are a new if-block with an else-block (multiple statements). In that case, how should I add the patch to the input ground-truth data?
Thanks for your help!