bigscience-workshop / data_tooling

Tools for managing datasets for governance and training.
Apache License 2.0
74 stars, 48 forks

[WIP] add slurm and python files to extend the pseudo crawl dataset with the seeds of batch 2 #385

Closed (SaulLu closed this 2 years ago)

SaulLu commented 2 years ago

Closing in favor of #386.