Open · Stevetich opened 2 months ago
Thanks for pointing that out! I'm not entirely clear on the issue you're describing, though. Can you explain the problem a bit more? I'd really appreciate it if you could help me understand why this operation might be incorrect.
Hi. I noticed that you adopted the VCD code as your base. However, it appears to use the same model_kwargs and model_kwargs_cd to generate tokens. This confuses me because past_key_values is also stored in model_kwargs, which would mean the same past_key_values is used for both the original and the distorted image inputs. Is that correct?
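
To make the concern concrete, here is a minimal toy sketch in plain Python. It is not the actual VCD or Transformers code: `toy_forward` and `decode` are hypothetical stand-ins, and the "cache" is just a list recording which image each entry was computed from. It only illustrates what would happen if model_kwargs and model_kwargs_cd referred to the same dictionary, compared with giving the distorted branch its own copy.

```python
import copy


def toy_forward(image_tag, token, past_key_values):
    """Hypothetical stand-in for a forward pass.

    Returns a new cache that extends the old one with an entry
    tagged by the image it was computed from.
    """
    past = list(past_key_values) if past_key_values is not None else []
    return past + [(image_tag, token)]


def decode(steps, share_cache):
    """Run two decoding branches (original and distorted image)."""
    model_kwargs = {"past_key_values": None}
    # share_cache=True aliases one dict for both branches;
    # share_cache=False gives the distorted branch its own dict.
    model_kwargs_cd = model_kwargs if share_cache else copy.copy(model_kwargs)

    for token in range(steps):
        model_kwargs["past_key_values"] = toy_forward(
            "original", token, model_kwargs["past_key_values"]
        )
        model_kwargs_cd["past_key_values"] = toy_forward(
            "distorted", token, model_kwargs_cd["past_key_values"]
        )
    return model_kwargs["past_key_values"], model_kwargs_cd["past_key_values"]


if __name__ == "__main__":
    print("separate:", decode(2, share_cache=False))
    print("shared:  ", decode(2, share_cache=True))
```

In this toy setup, the shared case ends with a single cache containing entries from both images, while the separate case keeps each branch's cache tied to its own image. Whether the real code aliases or copies the dictionary is exactly what the question above asks.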