huseyinatahaninan / Differentially-Private-Fine-tuning-of-Language-Models


Custom dataset preprocessing using fairseq #5

Open MarkDeng1 opened 2 months ago

MarkDeng1 commented 2 months ago

I want to use my own 'xxx.jsonl' file as the training data.

Could you please explain how you preprocessed your data to produce SST-2-bin?

Thanks, Mark

dayu11 commented 1 month ago

Hi Mark,

Thank you for your question! We use this script from the Fairseq repo for preprocessing the data.

Best, Da
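
For a custom 'xxx.jsonl' file, the step before running that preprocessing script is usually to split the records into line-aligned plain-text files. Below is a minimal sketch of that step only, not the authors' exact pipeline. It assumes each JSON line has a `sentence` field and a `label` field, and that the script expects one sentence per line and one label per line; the function name and the output names `train.raw.input0` and `train.label` are illustrative assumptions. BPE encoding and binarization are still left to the Fairseq script.

```python
import json
from pathlib import Path


def jsonl_to_raw_files(jsonl_path, out_dir, split="train",
                       text_key="sentence", label_key="label"):
    """Write line-aligned text and label files from a JSONL file.

    The key names and output file names are assumptions for illustration;
    adjust them to match your data and the preprocessing script you use.
    Returns the number of examples written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    text_path = out_dir / f"{split}.raw.input0"
    label_path = out_dir / f"{split}.label"

    count = 0
    with open(jsonl_path, encoding="utf-8") as src, \
            open(text_path, "w", encoding="utf-8") as text_out, \
            open(label_path, "w", encoding="utf-8") as label_out:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            # Collapse internal whitespace so each example stays on one line.
            text = " ".join(str(record[text_key]).split())
            text_out.write(text + "\n")
            label_out.write(str(record[label_key]) + "\n")
            count += 1
    return count
```

Calling `jsonl_to_raw_files("xxx.jsonl", "my-data")` would write `my-data/train.raw.input0` and `my-data/train.label`; repeat with `split="dev"` for a validation file.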