VITA-Group / FasterSeg

[ICLR 2020] "FasterSeg: Searching for Faster Real-time Semantic Segmentation" by Wuyang Chen, Xinyu Gong, Xianming Liu, Qian Zhang, Yuan Li, Zhangyang Wang
MIT License

BaseDataset.__getitem__ always returns random item if file_length is set #55

Closed · gfkri closed this issue 3 years ago

gfkri commented 3 years ago

Hi,

thank you for your great work.

If self._file_length is set in BaseDataset (as it is during training), every call to __getitem__ invokes self._construct_new_file_names, which builds a freshly shuffled list of file names that is then indexed with the index passed to __getitem__. Because the list is reshuffled on every call, the index is effectively meaningless and __getitem__ always returns a random element. Is this intended? If so, it is quite counter-intuitive for a __getitem__ to accept an index but effectively ignore it (and to rebuild the list on every call), right? Shouldn't _construct_new_file_names be called only once, during __init__? A simplified sketch of the pattern I mean is below.
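For context, this is roughly the pattern I am describing (a simplified sketch, not the actual BaseDataset code; names and details are paraphrased):

```python
import random

class BaseDatasetSketch:
    """Simplified sketch of the pattern described above (not the real BaseDataset)."""

    def __init__(self, file_names, file_length=None):
        self._file_names = file_names
        self._file_length = file_length

    def _construct_new_file_names(self, length):
        # Build a new, randomly ordered file list of the requested length.
        files = self._file_names * (length // len(self._file_names) + 1)
        random.shuffle(files)
        return files[:length]

    def __getitem__(self, index):
        if self._file_length is not None:
            # The list is rebuilt (and reshuffled) on *every* call,
            # so `index` effectively selects a random element.
            return self._construct_new_file_names(self._file_length)[index]
        return self._file_names[index]
```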

Thank you very much in advance. Best, Georg

chenwydj commented 3 years ago

Hi @gfkri !

Thank you for your interest in our work!

This part of the code is inherited from TorchSeg, and I admit it is counter-intuitive and may not be the best implementation.

I think the reason is that, when we want to control the number of iterations per epoch (see here), that number may not equal the size of the original training set. So we create an expanded (or shrunken) file list via _construct_new_file_names. However, if we used one fixed file list throughout training, it would bias training toward the files that happen to be over- or under-represented in that list. That is why we add randomness when accessing the expanded (or shrunken) file list; see the sketch below.
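One way to keep the index meaningful while still avoiding a fixed list would be to build the expanded list once and reshuffle it at the start of each epoch. A rough sketch (not what the current code does, and simplified to file names only):

```python
import random

class ExpandedDatasetSketch:
    """Sketch: build the expanded file list once, reshuffle it per epoch."""

    def __init__(self, file_names, iterations_per_epoch):
        self._file_names = file_names
        self._length = iterations_per_epoch
        self._epoch_files = self._build_epoch_files()

    def _build_epoch_files(self):
        # Repeat the original list enough times, shuffle, then truncate,
        # so every epoch sees a different random ordering/subset.
        repeats = self._length // len(self._file_names) + 1
        files = self._file_names * repeats
        random.shuffle(files)
        return files[:self._length]

    def shuffle_epoch(self):
        # Call once at the start of each epoch to avoid a fixed ordering.
        self._epoch_files = self._build_epoch_files()

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        # The index now deterministically selects within the current epoch.
        return self._epoch_files[index]
```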

Hope that helps!