BakingBrains closed this issue 2 years ago
Can you please suggest how I should prepare the dataset for a code generation task? Or is the data prepared the same way as for a text generation task?
It is prepared the same way as for the text generation task. You may get better performance with a customized tokenizer, though, since tokenizers built for natural-language text do not handle code syntax particularly well.
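To make that concrete, here is a minimal sketch of what "the same as text generation" could look like for code: tokenize each sample, join samples with an end-of-text marker, and cut the stream into fixed-length blocks. Everything named here is an assumption for illustration only: the `EOS` string, the regex-based `tokenize_code`, and the `pack_examples` helper are hypothetical and not taken from any particular library. The toy tokenizer keeps newlines and indentation as tokens, which is the kind of code-specific detail a customized tokenizer might preserve.

```python
import re

# Hypothetical end-of-text marker; real tokenizers define their own special tokens.
EOS = "<|endoftext|>"

# Toy code-aware pre-tokenizer (an illustration, not a production tokenizer):
# newlines and runs of spaces/tabs are kept as tokens because indentation
# carries meaning in languages such as Python.
TOKEN_RE = re.compile(r"\n|[ \t]+|[A-Za-z_]\w*|\d+|[^\sA-Za-z_\d]")

def tokenize_code(source):
    """Split source code into identifiers, numbers, symbols, and whitespace runs."""
    return TOKEN_RE.findall(source)

def pack_examples(sources, block_size):
    """Concatenate tokenized samples separated by EOS, then cut fixed-size blocks."""
    stream = []
    for src in sources:
        stream.extend(tokenize_code(src))
        stream.append(EOS)
    # Drop the trailing remainder that does not fill a whole block.
    usable = (len(stream) // block_size) * block_size
    return [stream[i:i + block_size] for i in range(0, usable, block_size)]

samples = [
    "def add(a, b):\n    return a + b\n",
    "x = add(1, 2)\n",
]
blocks = pack_examples(samples, block_size=8)

# A plain whitespace split, by contrast, discards indentation and line breaks.
naive_tokens = samples[0].split()
```

With a real setup you would swap the toy tokenizer for a trained subword tokenizer and map tokens to integer ids, but the packing step stays the same as for ordinary text.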