wl-zhao / VPD

[ICCV 2023] VPD is a framework that leverages the high-level and low-level knowledge of a pre-trained text-to-image diffusion model for downstream visual perception tasks.
https://vpd.ivg-research.xyz
MIT License

mIoU 50%, mAcc 50% #62

Open Saillxl opened 3 months ago

Saillxl commented 3 months ago

Excellent work! I used the segmentation code to train on my binary (two-class) dataset, but the loss dropped to 0 after 25,000 iterations while the average metrics stayed at 50%. Why would that happen? Has anyone encountered this situation before?
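
For reference, with two classes an mAcc of exactly 50% often indicates that the model has collapsed to predicting a single class everywhere (per-class accuracy is then 100% for one class and 0% for the other). The toy check below is plain NumPy, not VPD code, and uses made-up data just to show how the metrics behave in that degenerate case.

```python
import numpy as np

# Toy ground-truth mask (binary: 0 = background, 1 = foreground)
gt = np.zeros((4, 4), dtype=np.int64)
gt[:2, :] = 1                      # half foreground, half background
pred = np.zeros_like(gt)           # degenerate model: predicts class 0 everywhere

accs, ious = [], []
for cls in (0, 1):
    tp = np.sum((pred == cls) & (gt == cls))
    fp = np.sum((pred == cls) & (gt != cls))
    fn = np.sum((pred != cls) & (gt == cls))
    accs.append(tp / max(tp + fn, 1))       # per-class accuracy (recall)
    ious.append(tp / max(tp + fp + fn, 1))  # per-class IoU

print("mAcc:", np.mean(accs))  # 0.5 -> reported as 50%
print("mIoU:", np.mean(ious))  # 0.25 here; closer to 50% when the collapsed class dominates
```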

wl-zhao commented 3 months ago

If your dataset is too small, training the whole diffusion U-Net may suffer from overfitting.
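
One generic way to shrink the number of trainable parameters in that situation is to freeze the diffusion U-Net and train only the task-specific head. The sketch below is a plain PyTorch idiom under assumed module names, not VPD's official training recipe; `decode_head` is a placeholder prefix.

```python
import torch


def freeze_backbone(model: torch.nn.Module, head_prefix: str = "decode_head"):
    """Freeze all parameters except those whose name starts with `head_prefix`.

    Generic PyTorch idiom; the prefix is a placeholder, not a VPD attribute name.
    Returns the list of parameters that remain trainable (for the optimizer).
    """
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(head_prefix)
        if param.requires_grad:
            trainable.append(param)
    return trainable
```

The returned list can then be passed to the optimizer so that only the head is updated.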

Saillxl commented 3 months ago

Thank you for your prompt response. I have a training set of 120,000 images, and I have also converted the masks to binary format (0, 1).
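
A quick sanity check that may be worth running here: confirm that the converted masks really contain only the labels 0 and 1, and see how imbalanced the two classes are (a heavily skewed dataset can also drive the model toward a single-class prediction). The script below is a hypothetical sketch; the annotation directory and file extension are placeholders, not paths from this repository.

```python
from collections import Counter
from pathlib import Path

import numpy as np
from PIL import Image

# Placeholder annotation directory; adjust to your own dataset layout.
ann_dir = Path("data/ann_dir/train")

counts = Counter()
for path in ann_dir.glob("*.png"):
    mask = np.array(Image.open(path))
    values, freqs = np.unique(mask, return_counts=True)
    # Fail fast if any mask contains labels other than {0, 1}.
    assert set(values.tolist()) <= {0, 1}, f"unexpected labels {values} in {path}"
    counts.update(dict(zip(values.tolist(), freqs.tolist())))

total = sum(counts.values())
for cls, n in sorted(counts.items()):
    print(f"class {cls}: {n / total:.2%} of pixels")
```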