nlpxucan / WizardLM

LLMs built upon Evol Instruct: WizardLM, WizardCoder, WizardMath
9.19k stars, 713 forks

Why does 20k+20k equal 38k? Doubts about the trajectory of the dataset! #84

Open ahong007007 opened 1 year ago

ahong007007 commented 1 year ago


According to the paper, 20k Alpaca samples were used as the starting point and then evolved iteratively, round by round.

[image]

However, where the test is described, 38k samples appear rather than the 40k that 20k+20k would give. What data was deleted during this process?