togethercomputer/RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Apache License 2.0
Single machine download script and downloaded files check
#60
Open
MIracleyin opened 1 year ago
MIracleyin commented 1 year ago
A script that can download the files on a single machine, without Slurm.
A check of the already-downloaded files when a download fails, so that only missing or incomplete files are fetched again, saving AWS bandwidth cost.
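To make the request concrete, here is a minimal sketch of what such a script could look like. It is not code from the repository. It assumes the files are listed as plain HTTP(S) URLs, that the server reports a `Content-Length` on a HEAD request, and that a size match is an acceptable completeness check (a checksum would be stronger if one were published). The names `local_path_for`, `remote_size`, `is_complete`, `fetch` and `download_all` are hypothetical.

```python
import os
import urllib.request
from urllib.parse import urlparse


def local_path_for(url, dest_dir):
    """Map a URL to a local path that mirrors the URL's path component."""
    rel = urlparse(url).path.lstrip("/")
    return os.path.join(dest_dir, rel)


def remote_size(url):
    """Return the Content-Length from a HEAD request, or None if absent.

    Assumption: the server answers HEAD requests with a usable Content-Length.
    """
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req) as resp:
        length = resp.headers.get("Content-Length")
    return int(length) if length is not None else None


def is_complete(path, expected_size):
    """A file counts as complete if it exists and matches the expected size."""
    if not os.path.isfile(path):
        return False
    if expected_size is None:
        return False  # size unknown, so the file cannot be verified
    return os.path.getsize(path) == expected_size


def fetch(url, path):
    """Download to a temporary name, then rename into place.

    An interrupted download leaves only a '.part' file, never a file
    that looks finished.
    """
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    tmp = path + ".part"
    urllib.request.urlretrieve(url, tmp)
    os.replace(tmp, path)


def download_all(urls, dest_dir, size_of=remote_size, fetch_fn=fetch):
    """Fetch every URL whose local copy is missing or incomplete.

    Runs sequentially on one machine, with no job scheduler involved.
    Returns the list of URLs that were actually downloaded.
    """
    fetched = []
    for url in urls:
        path = local_path_for(url, dest_dir)
        if is_complete(path, size_of(url)):
            continue  # already on disk with the expected size
        fetch_fn(url, path)
        fetched.append(url)
    return fetched
```

Running `download_all` again after a failure skips every file that is already on disk with the expected size, so only the missing or truncated files cost bandwidth. The `size_of` and `fetch_fn` parameters exist so that the size lookup and the transfer can be swapped out, for example for a checksum-based check or a different download tool.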