THUDM / MathGLM

Official PyTorch Implementation for MathGLM

Data Generation Code #3

Open muelletm opened 1 year ago

muelletm commented 1 year ago

Hi!

Thanks for releasing this!

From what I saw, you did not release the code that was used to synthesize the pre-training data. Is that correct? Are you planning to release it?

Thanks a lot!

shuxiaobo commented 12 months ago

Same question about data generation. The pretraining code splits each data line by the char 's', but there is no 's' in the released data.
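One way to check this kind of mismatch is to count how many lines actually contain the delimiter. The following is only a minimal sketch, assuming the data is plain text with one sample per line; the helper name `count_delimited_lines` and the sample lines are hypothetical and do not come from the MathGLM repository.

```python
from typing import Iterable, Tuple


def count_delimited_lines(lines: Iterable[str], delimiter: str = "s") -> Tuple[int, int]:
    """Return (number of non-empty lines containing the delimiter, number of non-empty lines)."""
    with_delim = 0
    total = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            # Skip blank lines so they do not distort the ratio.
            continue
        total += 1
        if delimiter in line:
            with_delim += 1
    return with_delim, total


if __name__ == "__main__":
    # Made-up sample lines for illustration only, not taken from the released data.
    sample = ["1+2=3", "4*5=20", ""]
    hits, total = count_delimited_lines(sample)
    print(f"{hits} of {total} non-empty lines contain the delimiter")
```

To run it against a real file, pass an open file handle instead of the list, for example `count_delimited_lines(open(path, encoding="utf-8"))`, where `path` is wherever the released data was saved. If no lines contain the delimiter, splitting on it leaves every line as a single unsplit piece.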