Closed meghbhalerao closed 1 year ago
Hello :)
Firstly, this is only really an issue when you're using a subset of a dataset that has predefined classes (like our imagenet subsets) such that the original labels [0..999] need to be re-mapped to reflect the subset [0..10].
If I remember correctly, this was because the training data have the correct labels assigned to them already.
The validation loader, on the other hand, was made directly from a filtered version of the original validation dataset, so its labels need to be remapped.
This was just a quick hack to make it work, and I never got around to cleaning it up. It could likely be done more cleanly when initially creating the validation loader.
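To illustrate the idea, here is a minimal, dependency-free sketch of the remapping itself: the subset keeps a handful of the original class ids (e.g. out of ImageNet's [0..999]), and those ids get compressed to contiguous labels [0..k-1]. The `subset_class_ids` values and names below are purely illustrative, not taken from the repo.

```python
# Hypothetical example: original class ids kept in the subset.
subset_class_ids = [7, 42, 123]

# Map each original id to a contiguous label 0..k-1, in sorted order.
class_map = {orig: new for new, orig in enumerate(sorted(subset_class_ids))}

def remap_label(orig_label):
    """Convert an original dataset label to its subset label."""
    return class_map[orig_label]
```

With this mapping, `remap_label(7)` gives `0`, `remap_label(42)` gives `1`, and so on; applying it once when building the loader (e.g. via a `target_transform`) would avoid remapping at evaluation time.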
Hope this helps!
Thank you, I think this makes sense! I was just a little confused initially because this part of the code - https://github.com/GeorgeCazenavette/mtt-distillation/blob/main/utils.py#L104 - creates the subsets of ImageNet using the Subset class, which seems to filter based on the original dataset. But nonetheless I understand your point.
You're right, but the training set gets re-mapped here (and likewise in buffer.py): https://github.com/GeorgeCazenavette/mtt-distillation/blob/c365f4257117ccc0abf163e09f692be94634ed18/distill.py#L88
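For readers following along, the alternative to remapping inside the training loop is to wrap the filtered dataset once so every label comes out already remapped. This is a hedged sketch of that pattern, not the repo's actual code; the class name and interface are illustrative (any indexable `(sample, label)` dataset would work, including a `torch.utils.data.Subset`).

```python
class RemappedDataset:
    """Wrap an indexable (sample, label) dataset, remapping labels on access."""

    def __init__(self, base, class_map):
        self.base = base          # underlying filtered dataset
        self.class_map = class_map  # original class id -> subset label

    def __len__(self):
        return len(self.base)

    def __getitem__(self, i):
        x, y = self.base[i]
        return x, self.class_map[y]
```

Wrapping both the train and validation datasets this way would make the `mode != 'train'` special case unnecessary.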
It's a bit of a mess 🥲
The whole codebase could definitely use a big refactor.
Ah, okay! That clarifies it! Thank you!
Hello and thank you for the paper and also thank you for open sourcing the code.
I have a question in this line - https://github.com/GeorgeCazenavette/mtt-distillation/blob/main/utils.py#L326
Why is the class index mapping done only when mode != 'train'? Should it not be done always, irrespective of whether the mode is train or test? Please do let me know if I am missing anything, and thank you for your time!
Megh