Victorwz / LongMem

Official implementation of our NeurIPS 2023 paper "Augmenting Language Models with Long-Term Memory".
https://arxiv.org/abs/2306.07174
Apache License 2.0
757 stars, 68 forks

Pile dataset is no longer available #16

Closed by Yuzz1020 11 months ago

Yuzz1020 commented 1 year ago

Hi, thank you for the awesome work!

However, when I tried to run some experiments with your code, I noticed that the Pile dataset is no longer available on the EleutherAI website. Do you know of any alternative way to get this dataset?

Thank you for your help!

anthonyprinaldi commented 11 months ago

Alternatively, is there a link you can share with us to download the Pile dataset from somewhere other than the original source?

Victorwz commented 11 months ago

Hi all. I also noticed that EleutherAI has taken the Pile dataset offline. However, a copy is still available on Hugging Face Datasets. Please check: https://huggingface.co/datasets/EleutherAI/raw_deduplicated_pile/tree/main
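
For anyone who would rather script the download than click through the web UI, here is a minimal sketch. It assumes the repository at the link above is still online and that the `huggingface_hub` package is installed. The helper names `parse_hf_dataset_url` and `download_pile_mirror` are made up for this example and are not part of LongMem. The sketch does not assume anything about the files inside the repository, so check the layout on the page before running LongMem's preprocessing on it.

```python
from urllib.parse import urlparse

# The mirror linked in this thread.
PILE_URL = "https://huggingface.co/datasets/EleutherAI/raw_deduplicated_pile/tree/main"


def parse_hf_dataset_url(url: str) -> tuple[str, str]:
    """Split a Hugging Face dataset URL into (repo_id, revision).

    Hypothetical helper for this example only.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected shape: datasets/<org>/<name>[/tree/<revision>]
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hugging Face dataset URL: {url}")
    repo_id = f"{parts[1]}/{parts[2]}"
    # Fall back to the default branch when no /tree/<revision> suffix is given.
    revision = parts[4] if len(parts) >= 5 and parts[3] == "tree" else "main"
    return repo_id, revision


def download_pile_mirror(url: str, local_dir: str) -> str:
    """Download the whole dataset repository and return the local path.

    Hypothetical helper; needs `pip install huggingface_hub` and network access.
    """
    # Imported lazily so the URL parsing above works without the package.
    from huggingface_hub import snapshot_download

    repo_id, revision = parse_hf_dataset_url(url)
    return snapshot_download(
        repo_id=repo_id,
        repo_type="dataset",  # required: this is a dataset repo, not a model
        revision=revision,
        local_dir=local_dir,
    )


if __name__ == "__main__":
    print(parse_hf_dataset_url(PILE_URL))
    # The download is very large, so it is left commented out on purpose:
    # download_pile_mirror(PILE_URL, "./pile")
```

Running the script as-is only prints the parsed repository id and revision. Uncomment the last line once you have enough disk space for the full dataset.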