Open punisher1k opened 2 months ago
Hi, were you able to run this code as provided, or did you have to make a few changes/do some debugging? Also, did you run it on a single GPU or multiple GPUs? Thanks!
BTW, I think your plots show the log perplexities rather than the actual perplexities, which would explain why your values are smaller.
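To illustrate the distinction: perplexity is the exponential of the mean per-token cross-entropy loss (in nats), so plotting the raw loss yields much smaller values than the true perplexity. A minimal sketch (the function name `perplexity` is my own, not from the repo):

```python
import math

def perplexity(mean_nll: float) -> float:
    """Convert a mean negative log-likelihood (nats per token) to perplexity."""
    return math.exp(mean_nll)

# A mean loss ("log perplexity") around 3.7 already corresponds to a
# perplexity of roughly 40, matching the reported C4/CC numbers.
print(perplexity(3.7))
```

So a curve that stays below ~7 is consistent with the loss being plotted directly instead of its exponential.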
Yes, I reproduced the experiment using the exact config, but my obtained GitHub domain weight is often large (> 0.5). Did you observe a similar pattern?
I had to do some debugging on the data processing but was finally able to run the code. I am just using the provided domain weights at the moment, so I haven't gotten to make that observation yet. Were you able to run it on multiple GPUs? Thanks!
Yes, just run the code as-is and Accelerate will take care of that.
Hi @Olivia-fsm, thank you for sharing your great work.
I am running your provided code on the 6B version of the SlimPajama dataset and obtained perplexities <= 7 across all domains. I am wondering whether this is unexpected behavior (your reported perplexities for C4 and CC are > 40).
Also, could you please share your *.bin files for the 627B SlimPajama? The processing time for the dataset is too long on my machine, so it would be great if you could share them with me.
Thank you in advance.