Closed: qylen closed this issue 5 months ago
Also, how should I handle an input dataset file like openwebtext2? You say you provide data processing recipes, but I can't find them.
Dear qyleni,
Thanks for your attention to our work.
Regarding your first question: since no further details about the errors from running Yulan-GARDEN were provided, I suppose the mini openwebtext2 you downloaded is not in JSONL format. When you specify {{input_ext}} in the config file, the input must have one data point per line, and each line must be a JSON dict containing the {{input_text_key}} field.
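To check whether a downloaded file meets the one-JSON-dict-per-line requirement described above, a small standalone script like the following may help. It is only a sketch and is not part of Yulan-GARDEN: the function name `check_jsonl` is hypothetical, and the default key `"text"` is an assumption that should be replaced with whatever {{input_text_key}} is set to in your config.

```python
import json


def check_jsonl(path, text_key="text"):
    """Scan a file and report lines that are not usable JSONL records.

    text_key is an assumed default; pass the value of input_text_key
    from your own config. Returns (ok_count, problems), where problems
    is a list of (line_number, reason) tuples.
    """
    ok_count = 0
    problems = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # blank lines are skipped, not counted
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                problems.append((line_no, "not valid JSON"))
                continue
            if not isinstance(record, dict):
                problems.append((line_no, "not a JSON object"))
            elif text_key not in record:
                problems.append((line_no, f"missing key {text_key!r}"))
            else:
                ok_count += 1
    return ok_count, problems
```

If `ok_count` is 0, or `problems` lists most lines, the file is probably not in the expected format, which would be consistent with an empty output file.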
As for your second question, our data processing recipe for openwebtext2, which reproduces the experimental results reported in our paper, can be found here.
I hope this response helps you figure out your issues.
Emanual20, Yulan-GARDEN Team
As there have been no further comments, this issue will be closed. Please feel free to reopen it if you have any other questions. We look forward to your feedback.
Hello, your work is so good! I'm a newcomer in this field. I followed your guide to run the code like this, and I downloaded the mini openwebtext2 as input. When I run the code, the output file is empty. I don't know how to solve it.