Closed jsg921019 closed 6 days ago
We have an internally released version that supports webdataset-format data. What do you mean by "how to handle a 10M-scale dataset"?
I meant: how do you train with a dataset as large as 10M? So I assume you have trained with webdataset using the internally released code? I have another question. Is there any specific reason why grad clip is set so low (0.01)?
So I assume you have trained with webdataset using the internally released code?
We trained on both the specific released dataset and the webdataset version, and got similar results.
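For readers with the same question: the internal webdataset loader is not shown in this thread, so the following is only a rough sketch of the general idea, using nothing but the Python standard library. It packs JSON records into tar shards and then streams them back one record at a time, so a large dataset never has to be loaded into memory at once. The function names `write_shard` and `iter_shards`, the `.json` member naming, and the record layout are all assumptions made for illustration, not the project's actual code.

```python
import io
import json
import tarfile


def write_shard(path, samples):
    """Write (key, record) pairs into one tar shard as <key>.json members."""
    with tarfile.open(path, "w") as tar:
        for key, record in samples:
            payload = json.dumps(record).encode("utf-8")
            info = tarfile.TarInfo(name=f"{key}.json")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))


def iter_shards(paths):
    """Yield (key, record) pairs shard by shard, reading each tar sequentially."""
    for path in paths:
        with tarfile.open(path, "r") as tar:
            for member in tar:
                # Skip directories and any member that is not a JSON record.
                if member.isfile() and member.name.endswith(".json"):
                    key = member.name[: -len(".json")]
                    yield key, json.load(tar.extractfile(member))
```

Splitting the data into many shards of moderate size is what lets several workers read in parallel; a dedicated package such as `webdataset` would normally take care of shuffling and decoding on top of this.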
I have another question. Is there any specific reason why grad clip is set very low? (0.01)
It is there for people who got unstable results (NaN) during training. If you don't have that issue, feel free to increase it to 0.1 or 1. Both are OK.
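To make the effect of the threshold concrete, here is a small framework-free sketch of clipping by global gradient norm. It assumes the grad clip setting is a norm-based limit (the thread does not say whether it clips by norm or by value), and the helper name `clip_by_global_norm` is hypothetical. In PyTorch the usual call for this kind of clipping is `torch.nn.utils.clip_grad_norm_`.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Never scale up: gradients already below the limit pass through unchanged.
    scale = min(1.0, max_norm / (total_norm + eps))
    return [g * scale for g in grads], total_norm


grads = [3.0, 4.0]  # toy gradient with L2 norm 5.0
for limit in (0.01, 0.1, 1.0):
    clipped, before = clip_by_global_norm(grads, limit)
    after = math.sqrt(sum(g * g for g in clipped))
    print(f"max_norm={limit}: norm {before:.2f} -> {after:.4f}")
```

A limit of 0.01 caps every update at a very small norm, which guards against the NaN blow-ups mentioned above; raising it to 0.1 or 1 simply allows larger steps when training is already stable.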
Thanks for the great work. I have a question about the dataset format. I noticed that in the training script, the data is loaded from a JSON file. Have you tried loading from the webdataset format (tar files)? How did you handle a ~10M-scale dataset?