salesforce / jaxformer

Minimal library to train LLMs on TPU in JAX with pjit().
BSD 3-Clause "New" or "Revised" License
270 stars 35 forks

ISSUE: thread source out #5

Closed PoodleWang closed 1 year ago

PoodleWang commented 1 year ago

https://github.com/salesforce/jaxformer/blob/b8b389493b7c641a1d38bbec1b590d9eed3ff3be/preprocess/5_shuffe_concat_eos.py#L67

Hi, in this code, using len(files) as the thread count can exhaust all the threads on the machine. Normally a machine has 80,000 threads, but the number of files in one bucket might be really huge. You can change this to args.works_num.
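To illustrate the suggestion, here is a minimal sketch, assuming the linked line sizes a thread pool with len(files). The names `process_file`, `process_all`, and `num_workers` are hypothetical stand-ins and are not taken from the repository; `num_workers` plays the role of the worker-count argument mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor


def process_file(path):
    # Hypothetical stand-in for the real per-file work.
    return len(path)


def process_all(files, num_workers=8):
    # Cap the pool at a fixed worker count instead of len(files),
    # so a bucket with a huge number of files cannot exhaust threads.
    workers = max(1, min(num_workers, len(files)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order and queues the remaining files
        # until a worker becomes free.
        return list(pool.map(process_file, files))
```

With a bounded pool, the extra files simply wait in the queue, so the thread count stays constant regardless of bucket size.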

enijkamp commented 1 year ago

Yes, changed. Thanks!