rosinusserrano / pml_vqvae

Repository for the course "Project Machine Learning" during WiSe 24/25 at TU Berlin consisting of a replication of the paper "Neural Discrete Representation Learning" (van den Oord et al., 2018).
Q&A 12.11.2024 #1

Open rosinusserrano opened 1 week ago

rosinusserrano commented 1 week ago

Collection of questions regarding the first milestone.

timonpalm commented 1 week ago

Should we showcase our implementation on images, on audio, or on both? I assume we can choose. We would prefer images.

timonpalm commented 1 week ago

What do they mean by this question?

What kind of answers could be provided by machine learning algorithms about the data? What are typical queries a human would ask?

Are the dataset overview questions meant for the dataset in general, or are they specific to our paper?

rosinusserrano commented 1 day ago

Why is CIFAR 8x8x10 and ImageNet 32x32? Should it not be 8x8x1, where each entry can take 10 values?
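
To make the two readings concrete, here is a rough sketch that counts the bits each one would need to store. The helper `latent_bits` is hypothetical, and the codebook size `K = 512` is an assumption on my side, not something stated in this thread.

```python
import math

def latent_bits(height, width, depth, num_codes):
    """Bits needed to store a grid of discrete codes, each taking one of num_codes values."""
    return height * width * depth * math.log2(num_codes)

# Reading A: 8x8 grid, one code per position, each code takes one of 10 values.
bits_a = latent_bits(8, 8, 1, 10)

# Reading B: 8x8 grid with 10 codes per position (K = 512 is an assumed codebook size).
bits_b = latent_bits(8, 8, 10, 512)

# 32x32 grid with one code per position, again assuming K = 512.
bits_imagenet = latent_bits(32, 32, 1, 512)

print(f"reading A: {bits_a:.1f} bits")
print(f"reading B: {bits_b:.1f} bits")
print(f"32x32:     {bits_imagenet:.1f} bits")
```

The two readings differ a lot in capacity, so it matters which one is meant.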

rosinusserrano commented 1 day ago

Once we are finished with ImageNet, should we extend our project to audio or other modalities, or should we rather go deep and look at what has been developed further in this area?

rosinusserrano commented 1 day ago

Does crop+resize count as feature extraction and/or normalization? Is there any feature extraction in our case, or do we basically just throw everything in?
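
For reference, this is roughly what I mean by the two steps, as a minimal NumPy sketch. The function names, the nearest-neighbour resize, and the target size of 128 are assumptions for illustration, not our actual pipeline. Crop+resize only changes the geometry of the image, while the last step rescales the pixel values.

```python
import numpy as np

def center_crop(img, size):
    """Cut a size x size square out of the middle of an H x W x C image."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

def resize_nearest(img, size):
    """Resize to size x size by nearest-neighbour sampling (no interpolation)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows[:, None], cols]

def scale_to_unit(img):
    """Map uint8 pixel values from [0, 255] to floats in [0, 1]."""
    return img.astype(np.float32) / 255.0

# Random stand-in for a real image (hypothetical 160 x 200 RGB input).
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(160, 200, 3), dtype=np.uint8)

cropped = center_crop(img, 160)         # geometry only: 160 x 160 x 3
resized = resize_nearest(cropped, 128)  # geometry only: 128 x 128 x 3
x = scale_to_unit(resized)              # value range only: float32 in [0, 1]
print(x.shape, x.dtype)
```

None of these steps computes hand-crafted features; the raw pixels still go into the model.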

KonstantinAusborn commented 1 day ago

Should we

KonstantinAusborn commented 1 day ago

We argue that a basic autoencoder is a good baseline for efficient encoding and image generation, since it is simple enough for us to implement but fulfills the same task. What is our coordinator's opinion on that? Do we also need a non-NN baseline or multiple baselines, especially for efficient encoding?

rosinusserrano commented 1 day ago

Paper-specific questions (THINK ABOUT THESE BEFOREHAND!):

rosinusserrano commented 1 day ago

Should we code PixelCNN, VIMCO, VAE, etc. ourselves, or is it okay to use other Git repos for that?

rosinusserrano commented 20 hours ago

What is meant by

Are there significant differences between the categories w.r.t. how well they can be characterized?

Should we check the paper and look for image classes where the results look better than for others, or should this information be read off from the paper?

and

Can something non-technical be said about what the good features are?

Are "features" here the learnt representations / code embeddings? Or is "feature" used here in the sense of attribute/characteristic? These are fundamentally different things, but both kind of make sense to me.