Open xiaoouwang opened 3 years ago
Hi Xiaoou WANG,
I am sorry for the inconvenience. We lost the data a while ago but managed to restore part of it (the stories corpus + train/dev/test) here:
https://drive.google.com/drive/u/1/folders/1yZzwaV8LO1hK8ChIm0sxazXF8BSIZ683
On Wed, Feb 10, 2021 at 1:22 PM Xiaoou WANG notifications@github.com wrote:
Hello Trieu!
Sorry to disturb you here, I was desperately trying to find a way to reach you :D
I'm a student in NLP currently working on Winograd, and I'm trying to reproduce the results in your paper. I found the code at https://github.com/tensorflow/models/tree/archive/research/lm_commonsense but all the files on Google Cloud are no longer available. I'd like to know if, by some miracle, you have a backup...
Sincerely, Xiaoou
Tks! It's really a pity...
Hi Trieu,
I'm currently doing research on a large language model.
I found that Megatron-LM used the CC-stories dataset to pretrain their model,
and I'm trying to reproduce their pretraining results.
However, I can no longer find the CC-stories dataset in the drive you used in the past.
Could I get the CC-stories dataset?
I found the GitHub README,
but its link does not contain the data anymore.
It would be really helpful for my research if I could get the CC-stories dataset.
Thank you.
Best, Jaewon.