Closed by jpcorb20 5 years ago
Since ALBERT has already achieved state-of-the-art performance on the main English benchmarks, I think the authors of the paper will release an English version in the near future.
Yes, that's true. I haven't found any info on a release yet. Maybe I should train one myself. Thank you very much.
Thank you. Where can I find the English corpus that was used in the papers?
If I am not mistaken, it is the same as BERT's: BookCorpus and English Wikipedia (the pre-processing is here: https://github.com/attardi/wikiextractor).
Awesome work reproducing ALBERT, by the way.
Great, thanks.
Hi, I am also interested in this idea. By the way, do you have the BookCorpus data?
Were you able to find BookCorpus using jpcorb20's link above?
No, but I'm curious about it, because I have been trying to find it. I couldn't get a working link, though.
You are right, sorry, the link inside is down. At the moment, the only thing I found is this library to crawl the original website: https://github.com/soskek/bookcorpus ...
OK, thanks. I just want that dataset XD
Hello,
Are you planning on releasing English pre-trained versions of ALBERT in the future?
Thank you,