[ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
[ ] My own task or dataset (give details below)
Reproduction
When I run run_clm.py from the language-modeling examples with a Llama 3.1 model, it fails with an error saying num_samples=0:
[rank6]: File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/sampler.py", line 143, in __init__
[rank6]: raise ValueError(f"num_samples should be a positive integer value, but got num_samples={self.num_samples}")
[rank6]: ValueError: num_samples should be a positive integer value, but got num_samples=0
When I run the same script with gpt2-xl, it works.
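One possible cause, stated here as an assumption rather than a confirmed diagnosis: the sampler reports num_samples=0 when the training dataset ends up empty, and that can happen if the texts are chunked with a block size larger than the number of tokens available in a batch. The sketch below is a simplified stand-in for that kind of chunking step, not the script's actual code; the token ids and block sizes are made up for illustration.

```python
from itertools import chain


def group_texts(examples, block_size):
    """Concatenate tokenized texts and cut them into block_size chunks.

    Any remainder shorter than block_size is dropped, so a batch with
    fewer than block_size tokens in total yields no chunks at all.
    """
    concatenated = {k: list(chain(*examples[k])) for k in examples}
    total_length = len(concatenated[next(iter(examples))])
    # Round down to a multiple of block_size (this is where data can vanish).
    total_length = (total_length // block_size) * block_size
    return {
        k: [tokens[i:i + block_size] for i in range(0, total_length, block_size)]
        for k, tokens in concatenated.items()
    }


# Hypothetical batch: three tokenized texts, 10 tokens in total.
batch = {"input_ids": [[1, 2, 3], [4, 5, 6, 7], [8, 9, 10]]}

small = group_texts(batch, block_size=4)   # block fits: some chunks survive
large = group_texts(batch, block_size=16)  # block exceeds total tokens

print(len(small["input_ids"]))  # 2
print(len(large["input_ids"]))  # 0
```

If this is what is happening, checking the length of the processed training dataset before training starts, or passing a smaller --block_size, may be worth trying. That would also be consistent with gpt2-xl working, since its context length is much shorter, but this is untested here.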
Expected behavior
Pretraining of the Llama model should run without the num_samples error.