Open romain-rsr opened 1 year ago

Hi,

We were able to run your GAN on the CelebA example, but we are struggling to apply it to raw images (without any mask or keypoint supervision).

After a careful reading of your work, we understand that the main interest of your model is to generate keypoints, then masks, then segmented images, by training on a dataset of raw images only, without any masks or keypoints in the training set. Can you please confirm this, and provide us with a version of your code that can be applied to a folder containing only raw 256x256 images?

Many thanks again for your work and reactivity, Romain
Hi Romain,
Yes, our main interest is obtaining the masks and keypoints by training only on raw images.
I am not sure I understand your request. Our model can be applied directly if you simply resize the 256x256 images to 128x128. If you want to re-train on 256x256 images, you can 1) delete the hard-coded architecture in lines 149-225 of generator.py to allow it to consume 256x256 images, and 2) copy line 41 to line 42 in discriminator.py to allow it to consume 256x256 images.
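For the first option, the resizing could look like this (a minimal sketch; the folder names and the png extension are placeholders for your own data):

```python
from pathlib import Path
from PIL import Image

# Downsample 256x256 images to the 128x128 resolution the released model consumes.
src, dst = Path("images_256"), Path("images_128")
dst.mkdir(exist_ok=True)
for p in sorted(src.glob("*.png")):
    img = Image.open(p).convert("RGB")
    img.resize((128, 128), Image.LANCZOS).save(dst / p.name)
```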
If you have further questions, don't hesitate to reach out to me.
Bests, Xingzhe
Hi,
I'm sorry I didn't go straight to the point: we ran your model successfully on CelebA, but we failed to apply it to this toy example, where the raw images we train the model on are plain blue rectangles of various sizes and positions on a plain grey background.
The model succeeds in generating such shapes but fails to segment the blue rectangle from the background.
(the segmentation mask is a tiny point in this one)
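For reference, the toy samples can be reproduced with a script along these lines (the exact sizes, colours and counts here are illustrative, not our precise values):

```python
import random
from pathlib import Path
from PIL import Image, ImageDraw

out = Path("toy")
out.mkdir(exist_ok=True)
for i in range(1000):
    img = Image.new("RGB", (128, 128), (128, 128, 128))  # plain grey background
    draw = ImageDraw.Draw(img)
    w, h = random.randint(20, 60), random.randint(20, 60)
    x, y = random.randint(0, 128 - w), random.randint(0, 128 - h)
    draw.rectangle([x, y, x + w, y + h], fill=(0, 0, 255))  # plain blue rectangle
    img.save(out / f"{i:04d}.png")
```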
---------------------- more info
---------------------- why we asked about the preprocessing first
Since we produced the model's input h5 file by applying the CelebA preprocessing file to our toy samples, our previous question aimed to verify that the preprocessing step was not at fault. Meanwhile, we have actually written a generic preprocessing file of our own, which only requires a folder of raw images (without any segmentation information). It's available here if needed: https://github.com/romain-rsr/colab/blob/main/uprocess.py
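In essence it does something along these lines (a simplified sketch, not the linked file itself; the "images" dataset key and the target resolution are assumptions):

```python
import h5py
import numpy as np
from pathlib import Path
from PIL import Image

# Pack a folder of raw images into a single h5 file, with no mask/keypoint info.
paths = sorted(Path("raw_images").glob("*.png"))
images = np.stack([np.asarray(Image.open(p).convert("RGB").resize((128, 128)))
                   for p in paths])  # (N, 128, 128, 3) uint8
with h5py.File("images.h5", "w") as f:
    f.create_dataset("images", data=images, compression="gzip")
```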
Bests, Romain
Hi Romain,
Thanks for this experiment! It is indeed very interesting! It illustrates some insights I had never thought about.
I also tested GANSeg on it myself, and it didn't work. I think this is because the positional encoding, part embedding, or background embedding overfits this overly simple dataset.
In parallel, I also tested two different unsupervised keypoint detection methods that use keypoint embeddings. They also fail, although they do slightly better than GANSeg, probably because they don't have background embeddings or positional encoding. I suspect that any method using embeddings without additional care could run into this problem to some extent.
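To make the suspicion concrete, these are the kinds of ingredients I mean (a schematic sketch; the dimensions and the encoding formula are illustrative, not GANSeg's actual code):

```python
import torch
import torch.nn as nn

n_parts, dim, size = 8, 64, 16
part_emb = nn.Embedding(n_parts, dim)        # one learned vector per part
bg_emb = nn.Parameter(torch.zeros(1, dim))   # learned background vector

# A fixed sinusoidal positional encoding over the feature grid.
xs = torch.arange(size).float() / size
pos = torch.stack(torch.meshgrid(xs, xs, indexing="ij"), dim=-1)   # (H, W, 2)
pos_enc = torch.cat([torch.sin(2 * torch.pi * pos),
                     torch.cos(2 * torch.pi * pos)], dim=-1)       # (H, W, 4)
# On a dataset this simple, such position-dependent signals can be memorized
# instead of forcing the model to key on appearance, collapsing the masks.
```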
Finally, I tested a method that does not use embeddings, https://xingzhehe.github.io/autolink/, and it finally works:
Although it only detects keypoints and their linkages, masks can easily be extracted, for example as sketched below.
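One simple post-processing (an illustration, not AutoLink's own code; the function name and its parameters are made up) rasterizes each detected linkage as a thick segment and keeps the covered pixels:

```python
import numpy as np
from PIL import Image, ImageDraw

def mask_from_keypoints(keypoints, edges, size=128, thickness=9):
    """keypoints: list of (x, y) pixel coordinates; edges: list of (i, j) index pairs."""
    canvas = Image.new("L", (size, size), 0)
    draw = ImageDraw.Draw(canvas)
    r = thickness // 2
    for i, j in edges:
        draw.line([keypoints[i], keypoints[j]], fill=255, width=thickness)
        for x, y in (keypoints[i], keypoints[j]):
            draw.ellipse([x - r, y - r, x + r, y + r], fill=255)  # round the joints
    return np.asarray(canvas) > 0  # boolean foreground mask
```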
Bests, Xingzhe
Hi,
Thanks a lot for these complementary experiments on our examples. I am focusing my experiments on getting the segmentation encoding for a private dataset whose characteristics are halfway between those of the toy dataset and those of highly structured datasets (CelebA, Flowers, etc.). Since we can't communicate publicly about it, I'll try to find a dataset that shares the same characteristics (high detail and diversity with very little overall structure).
Bests, Romain
What if the number of keypoints is not known in advance? Does it still work?