Open Andyfever123 opened 4 days ago
Hello @Andyfever123, if you are using only the example data and set the --threshold argument to 10 (this sets the minimum number of times a token must be repeated to be put into the tokenizer), this is expected: the example data is very limited, so its tokens are not repeated more than 10 times. The example data is there to show the required dataset structure, not to train the model.
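To illustrate the effect, here is a minimal sketch of a frequency-thresholded vocabulary. It is not the repository's tokenizer: `build_vocab`, `encode`, the `<unk>` token, the sample reports, and the "at least `threshold` occurrences" comparison are all assumptions made for the example.

```python
from collections import Counter

UNK = "<unk>"  # hypothetical placeholder for out-of-vocabulary tokens


def build_vocab(reports, threshold):
    """Keep only tokens seen at least `threshold` times (assumed rule)."""
    counts = Counter(tok for report in reports for tok in report.lower().split())
    kept = sorted(tok for tok, n in counts.items() if n >= threshold)
    # Index 0 is reserved for the unknown token.
    return {tok: idx for idx, tok in enumerate([UNK] + kept)}


def encode(report, vocab):
    """Map each token to its id, falling back to the unknown-token id."""
    return [vocab.get(tok, vocab[UNK]) for tok in report.lower().split()]


# Made-up sample reports standing in for a tiny example dataset.
reports = [
    "the lungs are clear",
    "the heart size is normal",
    "no acute disease",
]

strict = build_vocab(reports, threshold=10)  # nothing occurs 10 times
loose = build_vocab(reports, threshold=1)    # every token is kept

print(len(strict), encode("the lungs are clear", strict))
print(len(loose), encode("the lungs are clear", loose))
```

With the high threshold the vocabulary contains only the unknown token, so every report encodes to the same id and the outputs carry no real words. Lowering the threshold, or using a larger dataset, lets actual tokens into the vocabulary.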
When I was reproducing this, why did everything in the output res.csv and gts.csv files look just like this: