DRAGNLabs / 301r_retnet

Multi-core dataset processing #35

Closed JacksonSearle closed 4 months ago

JacksonSearle commented 4 months ago

Adds a "num_proc" argument to our YAML file. It is used in download_data.py and tokenize_data.py to set how many processes are used for dataset processing.
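For anyone reviewing, here is a rough sketch of how a setting like "num_proc" could flow from a parsed config into a processing step. It is not the code in this PR: `resolve_num_proc`, `tokenize_text`, and `tokenize_all` are hypothetical names, the whitespace split stands in for the real tokenizer, and clamping the value to the machine's CPU count is an assumption of this sketch.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def resolve_num_proc(config):
    """Read "num_proc" from a parsed config dict (hypothetical helper).

    Falls back to 1 when the key is missing and clamps the value to
    the range [1, number of CPUs] (an assumption of this sketch).
    """
    requested = int(config.get("num_proc", 1))
    available = os.cpu_count() or 1
    return max(1, min(requested, available))


def tokenize_text(text):
    """Stand-in tokenizer: plain whitespace split (not the project's tokenizer)."""
    return text.split()


def tokenize_all(texts, num_proc):
    """Tokenize texts serially when num_proc is 1, otherwise in num_proc processes."""
    if num_proc == 1:
        return [tokenize_text(t) for t in texts]
    with ProcessPoolExecutor(max_workers=num_proc) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(tokenize_text, texts))
```

With a config dict such as `{"num_proc": 4}` (as if loaded from the YAML file), the call would be `tokenize_all(texts, resolve_num_proc(config))`. Keeping a serial path for a value of 1 avoids process start-up cost on small inputs and makes debugging easier.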

Hopefully this won't take too much work to review :)

RLin8103 commented 4 months ago

This looks good to me.