Syliz517 / CLIP-ReID

Official implementation for "CLIP-ReID: Exploiting Vision-Language Model for Image Re-identification without Concrete Text Labels" (AAAI 2023)
MIT License

text encoder is not fixed in first stage training #11

Closed mensaochun closed 10 months ago

mensaochun commented 1 year ago

As the paper describes, in the first stage the text and image encoders are fixed and only the text tokens are optimized. However, in the code it seems the text encoder is optimized during training. Could I ask if I misunderstood?

Syliz517 commented 1 year ago

Have a look at solver/make_optimizer_prompt.py
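To illustrate the point of that reply: in the first stage the encoders stay fixed simply because their parameters are never handed to the optimizer. A minimal sketch of that idea (the toy module below is hypothetical and only mirrors the naming; the real logic lives in `solver/make_optimizer_prompt.py`):

```python
import torch
import torch.nn as nn

# Toy stand-in for the model; only prompt_learner should train in stage 1.
class ToyCLIPReID(nn.Module):
    def __init__(self):
        super().__init__()
        self.image_encoder = nn.Linear(8, 8)      # fixed in stage 1
        self.text_encoder = nn.Linear(8, 8)       # fixed in stage 1
        self.prompt_learner = nn.Embedding(4, 8)  # learnable text tokens

model = ToyCLIPReID()

# Stage-1 optimizer: collect only the prompt_learner parameters,
# analogous to what make_optimizer_prompt.py does by filtering names.
stage1_params = [p for n, p in model.named_parameters() if "prompt_learner" in n]
optimizer = torch.optim.Adam(stage1_params, lr=3.5e-4)

# The encoders are "fixed" because their weights never reach the optimizer.
trained = sorted(n for n, p in model.named_parameters() if "prompt_learner" in n)
print(trained)  # ['prompt_learner.weight']
```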

Zzhc3321 commented 1 year ago

> Have a look at solver/make_optimizer_prompt.py

Encountered an issue while training stage 1: 'AssertionError: No inf checks were recorded for this optimizer.' The printed loss values look fine; the problem may be that the optimizer does not cover all parameters involved in backpropagation. May I ask if the author has encountered this situation before?

Syliz517 commented 1 year ago

> Encountered an issue while training stage 1: 'AssertionError: No inf checks were recorded for this optimizer.' The printed loss values look fine; the problem may be that the optimizer does not cover all parameters involved in backpropagation. May I ask if the author has encountered this situation before?

Did you change the model? This probably happens because no parameters were fed into the optimizer; in the first stage, only the parameters in prompt_learner are fed into the optimizer.
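For context on that assertion: AMP's `GradScaler.step` records "inf checks" only for parameters that actually received gradients, so an optimizer whose parameters sit outside the backward graph trips exactly this `AssertionError`. A hypothetical minimal check of the underlying condition, without AMP (the module name is illustrative, not the repo's actual code):

```python
import torch
import torch.nn as nn

# If the loss depends on the optimizer's parameters, gradients flow and
# GradScaler.step(optimizer) has inf checks to record; if not, every
# p.grad is None and AMP raises the AssertionError from the issue.
prompt_learner = nn.Embedding(4, 8)
optimizer = torch.optim.Adam(prompt_learner.parameters(), lr=1e-3)

tokens = prompt_learner(torch.tensor([0, 1]))
loss = tokens.pow(2).mean()   # loss depends on prompt_learner
loss.backward()

has_grad = all(p.grad is not None
               for group in optimizer.param_groups
               for group_p in [group["params"]]
               for p in group_p)
print(has_grad)  # True: scaler.step(optimizer) would succeed here
```

If you modified the model so the stage-1 loss no longer touches `prompt_learner` (or fed a different parameter set into the optimizer), this check would come back `False`, matching the error above.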