jd-aig / JAVE

Is Text/Image Encoder frozen? #6

Open icedpanda opened 2 years ago

icedpanda commented 2 years ago

Thanks for sharing the source code.

I have a few questions regarding the input embeddings.

From your code, all text and images appear to be pre-encoded with BERT and ResNet. I therefore assume that neither encoder is trainable and that the embeddings are used only as inputs to your cross-modality model. (Please correct me if this is wrong.)
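
To make the assumption concrete, here is a minimal sketch of the setup I have in mind. It is not taken from the repository: `frozen_encode`, `ENCODER_WEIGHTS`, `cached_embeddings`, and `head` are hypothetical stand-ins (a tiny fixed linear map in place of BERT/ResNet, and a linear head in place of the cross-modality model). The point is only that the encoder runs once offline and never receives an update.

```python
import random

random.seed(0)

# Hypothetical stand-in for a pretrained encoder (BERT / ResNet in the question).
# Its weights are fixed: nothing below ever updates them.
ENCODER_WEIGHTS = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]


def frozen_encode(raw):
    """Map a 3-dim raw input to a 4-dim embedding using the fixed weights."""
    return [sum(w * x for w, x in zip(row, raw)) for row in ENCODER_WEIGHTS]


# Step 1: pre-encode the whole dataset once, offline.
raw_inputs = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
targets = [x[0] for x in raw_inputs]  # toy regression target
cached_embeddings = [frozen_encode(x) for x in raw_inputs]

# Step 2: train only the downstream model on the cached embeddings.
head = [0.0] * 4  # the only trainable parameters in this sketch
encoder_before = [row[:] for row in ENCODER_WEIGHTS]


def train_head(epochs=50, lr=0.01):
    """Plain SGD on squared error; gradients are taken w.r.t. `head` only."""
    for _ in range(epochs):
        for emb, y in zip(cached_embeddings, targets):
            pred = sum(h * e for h, e in zip(head, emb))
            err = pred - y
            for i in range(len(head)):
                head[i] -= lr * err * emb[i]


train_head()

encoder_unchanged = encoder_before == ENCODER_WEIGHTS
head_changed = any(h != 0.0 for h in head)
print("encoder unchanged:", encoder_unchanged)
print("head updated:", head_changed)
```

If the encoders were instead fine-tuned end to end, the embeddings could not be cached like this, since they would change after every update step.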

Thanks in advance.

AHfair commented 2 years ago

Hello, have you solved the above problems?