Closed lourencobt closed 2 years ago
This is probably caused by some datasets having only one sample in a batch. Can you check the number of samples in each batch? Alternatively, you can edit `./meta-dataset/data/reader.py` in the meta-dataset repository to change `dataset = dataset.batch(batch_size, drop_remainder=False)` to `dataset = dataset.batch(batch_size, drop_remainder=True)`. (In our work, we drop the remainder so that we do not use very small batches for some domains.)
Hope this can help! Wei-Hong
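To see why the suggestion above helps, here is a minimal sketch of how `tf.data`'s `drop_remainder` flag affects the sizes of the batches produced. The `batch_sizes` helper is hypothetical (not part of the repository); it just mirrors the batching arithmetic to show how a leftover batch of a single sample can appear and then trip up layers such as batch norm:

```python
def batch_sizes(num_samples, batch_size, drop_remainder=False):
    """Mirror the batch sizes produced by
    dataset.batch(batch_size, drop_remainder=...) in tf.data."""
    full, rem = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    # Without drop_remainder, any leftover samples form one final,
    # smaller batch -- possibly of size 1.
    if rem and not drop_remainder:
        sizes.append(rem)
    return sizes

# 65 samples with batch size 16: the last batch holds a single sample.
print(batch_sizes(65, 16))                       # [16, 16, 16, 16, 1]
# Dropping the remainder discards that tiny final batch.
print(batch_sizes(65, 16, drop_remainder=True))  # [16, 16, 16, 16]
```

Dropping the remainder trades a handful of samples per epoch for the guarantee that every batch has the full size.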
Well, I did as you said and dropping the remainder solved the problem. However, shouldn't the code run correctly regardless of the drop_remainder flag?
However, training URL is still very slow. Is that normal? Can you give an estimate of how long it took you?
Hi, I've updated the code and the problem should now be fixed when you don't drop the remainder. As I mentioned in the README, we drop the remainder in our work, and I recommend using the same setup for reproducing the URL results.
In our experiments, learning the URL model took around 48 hours. The time cost depends on the hardware you use, and you should be able to see the estimated time in the progress bar. You can also download our pre-trained model.
Thanks a lot! I will experiment later, but I will reproduce the URL results using the same setup.
Regarding the time cost, I just needed an estimate for reference. Thank you.
If I may make a suggestion, you could add the time cost of each training stage and the hardware used to the README for reference.
Many thanks for the suggestions!
BTW, the training time details can be found in the supplementary material of our paper (in the arXiv version, they are on page 17).
Hello,
I've trained the SDL networks from scratch and was then trying to train the URL model from scratch. The program starts fine, but it suddenly breaks with this error:
I wasn't able to find the cause of the problem. Can you help me?
Also, the estimated execution time is between 150 and 200 hours. Is that normal, or is something wrong? I am training on a single Tesla V100S-PCIE-32GB GPU.