Thanks a lot for releasing the code publicly; it is very helpful and clean.
I have a small question about the implementation of the data loader, specifically this portion of code:
It seems to me that, because the batches for labeled and unlabeled images are filled separately while both datasets are iterated simultaneously using `zip`, sooner or later some images will be "skipped" from training.
There seem to be theoretical worst cases in which this issue would cause half of the images in both datasets to be skipped. In practice, when the numbers of images of each aspect ratio are almost equal (?), and because the data loaders loop through the images infinitely, this probably has a negligible impact on experiments (?).
I just wanted to bring this up; please let me know if I'm wrong.
One idea for implementing this properly is to use two separate iterators that lazily fetch images from each dataset individually.
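To make the suggestion concrete, here is a minimal sketch of what I have in mind. It is not the repository's code: the names `grouped_batches` and `paired_batches` are hypothetical, and I am assuming each sample is a dict with `"width"` and `"height"` keys and that both data streams are infinite. Each stream fills its own aspect-ratio buckets and only advances when a batch is requested from it, so `zip` pairs whole batches rather than individual images.

```python
from itertools import cycle, islice


def grouped_batches(samples, batch_size):
    """Yield batches whose samples all fall in the same aspect-ratio bucket.

    `samples` is any iterable of dicts with "width" and "height" keys
    (an assumed record format). A sample waits in its bucket until that
    bucket is full, so nothing is discarded.
    """
    buckets = ([], [])  # index 0: width > height, index 1: everything else
    for sample in samples:
        bucket = buckets[0 if sample["width"] > sample["height"] else 1]
        bucket.append(sample)
        if len(bucket) == batch_size:
            yield list(bucket)
            bucket.clear()


def paired_batches(labeled, unlabeled, labeled_bs, unlabeled_bs):
    """Pair one labeled batch with one unlabeled batch per step.

    Assumes both inputs are infinite streams; each generator below pulls
    only as many samples as it needs to complete its own next batch.
    """
    labeled_iter = grouped_batches(labeled, labeled_bs)
    unlabeled_iter = grouped_batches(unlabeled, unlabeled_bs)
    # zip operates on batches here, not on individual images.
    yield from zip(labeled_iter, unlabeled_iter)


# Usage with toy data (hypothetical records, cycled to mimic infinite loaders).
labeled = [{"id": i, "width": 4, "height": 3} for i in range(3)]
unlabeled = [{"id": i, "width": 3, "height": 4} for i in range(5)]
first_steps = list(islice(paired_batches(cycle(labeled), cycle(unlabeled), 2, 3), 4))
```

With this layout the two streams may consume different numbers of images per step, which is what lets every image eventually land in a batch.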