cure-lab / DiffGuard

[ICCV 2023] The official implementation of paper "DiffGuard: Semantic Mismatch-Guided Out-of-Distribution Detection using Pre-trained Diffusion Models"

Issues on the LDM experiments #1

Closed fanghenshaometeor closed 9 months ago

fanghenshaometeor commented 9 months ago

The experiments cover Latent Diffusion Models (LDMs). However, I would like to raise a question for discussion.

The training of an LDM consists of two steps: Step 1, training a VQ-VAE; Step 2, training a diffusion model in the latent space produced by the VQ-VAE encoder. Regarding the checkpoints released with the LDM paper, the VQ-VAE was trained on the OpenImages dataset, and the diffusion models were then trained on the specific target dataset, e.g., ImageNet. Therefore, the ImageNet LDMs actually carry an implicit prior over more knowledge than the ImageNet dataset, since they are built on an OpenImages-trained latent space.
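For concreteness, the two-stage pipeline looks roughly like the sketch below (a minimal PyTorch-style illustration with hypothetical names, not the actual CompVis/latent-diffusion code). The point is that the stage-1 encoder fixes the latent space that stage 2 inherits:

```python
import torch
import torch.nn.functional as F

# Hypothetical two-stage LDM training sketch; all names here are
# illustrative placeholders, not the real LDM code base.

def train_stage1(vqvae, loader, opt):
    """Stage 1: fit the VQ-VAE autoencoder. For the released
    checkpoints, this stage used OpenImages, not ImageNet."""
    for x, _ in loader:
        recon, codebook_loss = vqvae(x)          # reconstruction + codebook terms
        loss = F.mse_loss(recon, x) + codebook_loss
        opt.zero_grad()
        loss.backward()
        opt.step()

def train_stage2(vqvae, unet, scheduler, loader, opt):
    """Stage 2: freeze the stage-1 VQ-VAE and train the diffusion
    U-Net on latents of the target dataset (e.g., ImageNet)."""
    vqvae.eval()
    for x, y in loader:
        with torch.no_grad():
            z = vqvae.encode(x)                  # latent space fixed by stage 1
        t = torch.randint(0, scheduler.num_steps, (z.size(0),), device=z.device)
        noise = torch.randn_like(z)
        z_t = scheduler.add_noise(z, noise, t)   # forward diffusion in latent space
        pred = unet(z_t, t, class_labels=y)      # class-conditional noise prediction
        loss = F.mse_loss(pred, noise)
        opt.zero_grad()
        loss.backward()
        opt.step()
```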

Accordingly, it might not be appropriate to use such LDMs in an OoD detection task, since the in-distribution data used to train them actually covers more than the ImageNet distribution.

flymin commented 9 months ago

Hi,

When we conducted our experiments, we assumed the ImageNet LDM used the ImageNet dataset only and overlooked this detail of the VQ-VAE training. I believe Table 8 in their paper states that OpenImages was used to train the VQ-VAE. It is indeed an issue to use such an LDM in OoD benchmarks.

However, I think this issue may not alter our conclusions with the LDM, for the following reasons:

  1. The VQ-VAE is only used for compression, while the diffusion process handles generation. After training on ImageNet, the diffusion model is supposed to fit the ImageNet distribution only.
  2. We use the LDM only for conditional generation. The conditioning should further weaken any "implicit prior" from the partially pre-trained components (a sketch follows this list).
  3. The performance on OpenImage-O is not particularly outstanding compared with the other OoD datasets.
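Regarding point 2, here is a minimal sketch of what conditional generation looks like (hypothetical names again, not our actual code). Because the class label enters every denoising step, the samples are driven toward the requested ImageNet class regardless of which data shaped the stage-1 latent space:

```python
import torch

@torch.no_grad()
def conditional_sample(unet, scheduler, vqvae, y, shape):
    """Illustrative class-conditional LDM sampling; all names are
    placeholders. The label y steers every denoising step."""
    z = torch.randn(shape)                        # start from latent noise
    for t in reversed(range(scheduler.num_steps)):
        t_batch = torch.full((shape[0],), t, dtype=torch.long)
        eps = unet(z, t_batch, class_labels=y)    # condition injected at each step
        z = scheduler.step(eps, t, z)             # one reverse-diffusion update
    return vqvae.decode(z)                        # map latents back to pixels
```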

I think one way to resolve this would be to train another LDM with the ImageNet dataset only. Sadly, we do not have the time or resources to do so.

Thank you for raising this issue. I will update the README later to highlight it so that other readers are aware of it.

Best, Ruiyuan

fanghenshaometeor commented 9 months ago

Thanks for your kind reply. It would be much better to add the essential explanation to the README to remind other researchers of this issue. Rigorously, both the VQ-VAE and the latent diffusion model should be trained on the in-distribution data.

flymin commented 9 months ago

I agree. I have updated the README so that others are aware of the issue with the pre-trained weights for the LDM.