Open muradtuk opened 1 year ago
Hi @muradtuk! I might be misunderstanding your question, but setting the traversal_order property isn't recommended; the proper way to take a subset of the dataset is to pass the indices= argument when creating a Loader. We haven't tested changing indices during training. If you don't need to do it too many times, I'd suggest just re-creating the loader object from scratch.
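For reference, a minimal sketch of the "re-create the loader" approach described above. The helper names (`sample_subset_indices`, `make_subset_loader`), the batch size, and the worker count are illustrative assumptions, not part of ffcv; the sketch also assumes the default pipelines of your .beton file are acceptable, so add your own `pipelines=` argument if they are not.

```python
import random


def sample_subset_indices(dataset_size, subset_size, seed=0):
    """Pick a reproducible random subset of sample indices (hypothetical helper)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(dataset_size), subset_size))


def make_subset_loader(beton_path, indices, batch_size=256, num_workers=8):
    """Build a fresh Loader restricted to `indices` (hypothetical helper).

    ffcv is imported lazily so the index helper above can be used
    without ffcv installed.
    """
    from ffcv.loader import Loader, OrderOption

    return Loader(
        beton_path,
        batch_size=batch_size,
        num_workers=num_workers,
        order=OrderOption.QUASI_RANDOM,
        indices=indices,  # restrict the loader to this subset
    )
```

Usage would look like `loader = make_subset_loader(path_to_beton, sample_subset_indices(n_samples, n_subset))`, called again whenever the subset changes. Re-creating the loader has a setup cost each time, so this fits best when the subset changes rarely (for example once per epoch or less).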
According to https://github.com/libffcv/ffcv/issues/152#issuecomment-1041024724, to change the indices during training it is enough to update both the indices attribute of the loader and the indices attribute of its traversal_order:
loader.indices = new_array_of_indices
loader.traversal_order.indices = new_array_of_indices
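The two assignments above can be wrapped in a small helper so they are never applied separately. This is only a sketch based on the linked comment: `set_loader_indices` is a hypothetical name, not an ffcv function, and, as noted in this thread, changing indices during training has not been tested by the maintainers.

```python
def set_loader_indices(loader, new_indices):
    """Update the subset on an existing loader in place (hypothetical helper).

    Assumes `loader` exposes `.indices` and `.traversal_order.indices`,
    as described in the linked comment.
    """
    loader.indices = new_indices
    loader.traversal_order.indices = new_indices
    return loader
```

Calling this between epochs, rather than in the middle of iterating over the loader, is the safer choice, since an iterator that is already running may have planned its batches from the old indices.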
Does this make the code slower as well?
Dear authors,
I have been using your framework to train on subsets of data, specifically on the ImageNet dataset using your code. Since training uses QUASI_RANDOM ordering, to take a subset of the data I had to define a NumPy array of indices and assign it to train_loader.indices. Going over some previously raised issues, I found that it is best to also replace the traversal_order object to obtain the intended subset, i.e.,
train_loader.traversal_order = QuasiRandom(train_loader)
This step makes the code much slower than it should be. How can I solve this problem? Please advise.