Closed gfkri closed 3 years ago
Hi @gfkri !
Thank you for your interest in our work!
This part of the code is inherited from TorchSeg, and I admit it is counter-intuitive and may not be the best implementation.
I think the reason is that, when we want to control how many iterations run per epoch (see here), that number of iterations may not equal the size of the original training set. Thus we create an expanded (or shrunk) file list (via `_construct_new_file_names`). However, if we used a fixed file list throughout training, that would introduce bias. That is why we need randomness when we access the expanded (or shrunk) file list.
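To make this concrete, here is a minimal sketch of the expand-or-shrink idea (hypothetical names, not the repo's exact code): the file list is repeated as many whole times as fits, then topped up with a random subset, so that one "epoch" yields exactly the requested number of samples.

```python
import random

def construct_new_file_names(file_names, target_length):
    """Sketch: expand or shrink a file list so one epoch
    yields exactly `target_length` samples."""
    files_len = len(file_names)
    # Repeat the whole list as many times as it fully fits ...
    new_file_names = file_names * (target_length // files_len)
    # ... then top up with a random subset to reach the target length.
    new_file_names += random.sample(file_names, target_length % files_len)
    return new_file_names
```

The random top-up (and re-drawing it over the course of training) is where the randomness mentioned above comes in: with a fixed top-up, the same few files would be over-represented in every epoch.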
Hope that helps!
Hi,
thank you for your great work.
If `self._file_length` in `BaseDataset` is set (during training), the function `self._construct_new_file_names` is called. It creates a new random order of the file names, which is then accessed via the index passed to `__getitem__`. This renders the index useless and always returns a random element. Is this wanted? If so, it is rather counter-intuitive for a `__getitem__` to "use" the index but actually still "ignore" it (and to re-create the list on every call), right? Shouldn't `_construct_new_file_names` only be called once during `__init__`?

Thank you very much in advance,
Best, Georg
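The pattern being asked about can be sketched roughly like this (hypothetical class and attribute names, simplified from the actual `BaseDataset`):

```python
import random

class SketchDataset:
    """Sketch of the counter-intuitive pattern: __getitem__ takes an
    index, but the file list is reshuffled on every access when
    _file_length is set, so the returned element is effectively random."""

    def __init__(self, file_names, file_length=None):
        self._file_names = file_names
        self._file_length = file_length

    def __getitem__(self, index):
        names = self._file_names
        if self._file_length is not None:
            # Re-created (and reshuffled) on EVERY access,
            # not once in __init__ -- the point of the question.
            names = random.sample(self._file_names, len(self._file_names))
        return names[index]
```

With `_file_length` unset, indexing is deterministic; with it set, the same index can return a different file on each call.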