Repository for the course "Project Machine Learning" (WiSe 24/25, TU Berlin), containing a replication of the paper "Neural Discrete Representation Learning" (van den Oord et al., 2018).
I refactored the dataset classes. Manually parsing the image folders was tedious, and it yielded no information about the classes and their labels. I therefore fell back on the PyTorch dataset class, which can read the meta.bin. Otherwise, I would have just copied their implementation for that.
Further, I implemented a function to display some basic stats of the datasets (e.g. the number of samples per class). It is now also possible to create a subset of each dataset with n_samples per class, so the subsets are always uniformly distributed among the classes.
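
As a rough illustration of the idea, the following sketch computes the per-class distribution and draws a balanced subset from a plain list of labels. It is not the code in this repository: the function names `class_distribution` and `balanced_subset_indices`, the `seed` parameter, and the toy label list are assumptions made for the example.

```python
import random
from collections import Counter, defaultdict


def class_distribution(labels):
    """Return a mapping {class label: number of samples}."""
    return dict(Counter(labels))


def balanced_subset_indices(labels, n_samples, seed=0):
    """Pick n_samples indices per class, drawn at random without replacement.

    Hypothetical helper: raises ValueError if a class has too few samples.
    """
    # Group the sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        if len(indices) < n_samples:
            raise ValueError(
                f"class {label!r} has only {len(indices)} samples, "
                f"but {n_samples} were requested"
            )
        chosen.extend(rng.sample(indices, n_samples))
    return sorted(chosen)


# Toy labels standing in for a real dataset's targets (assumption).
labels = [0, 0, 0, 1, 1, 1, 1, 2, 2]
subset = balanced_subset_indices(labels, n_samples=2)

print(class_distribution(labels))                       # {0: 3, 1: 4, 2: 2}
print(class_distribution([labels[i] for i in subset]))  # {0: 2, 1: 2, 2: 2}
```

With a PyTorch dataset, the returned indices could be passed to `torch.utils.data.Subset(dataset, subset)` to obtain the reduced dataset, assuming the labels list is in the same order as the dataset's samples.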