OFA-Sys / Chinese-CLIP

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
MIT License
4.59k stars 474 forks

Manually changing ChineseCLIPVisionModel to ChineseCLIPVisionModelWithProjection fails. #324

Open ranck626 opened 5 months ago

ranck626 commented 5 months ago

import torch
import torch.nn as nn

visual_projection = nn.Linear(768, 512, bias=False)
embeds = visual_projection(pooled_output)

I manually added a projection layer, but the embeddings it produces differ from the ones computed by ChineseCLIPModel.

ranck626 commented 5 months ago

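It is probably a pretrained-weights issue: the layer I added is randomly initialized. But why is only the version without the projection provided?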
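The mismatch described above can be reproduced without any checkpoint. The following is a minimal sketch, not the library's implementation: `pretrained_projection` is a hypothetical stand-in for the trained projection inside the full model, and `pooled_output` is random data in place of a real vision-tower output. It only illustrates that a freshly built `nn.Linear` gives different embeddings until the trained weights are copied into it.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the trained projection stored in a full checkpoint.
pretrained_projection = nn.Linear(768, 512, bias=False)

# Random stand-in for the vision tower's pooled output (batch of 2, hidden size 768).
pooled_output = torch.randn(2, 768)

# A hand-added layer starts from its own random initialization.
visual_projection = nn.Linear(768, 512, bias=False)

with torch.no_grad():
    reference = pretrained_projection(pooled_output)
    before = visual_projection(pooled_output)

    # Copy the trained weights into the hand-added layer, then project again.
    visual_projection.load_state_dict(pretrained_projection.state_dict())
    after = visual_projection(pooled_output)

matches_before = torch.allclose(before, reference)
matches_after = torch.allclose(after, reference)
print(matches_before, matches_after)
```

With a real checkpoint, the weights to copy would presumably come from the projection layer of the loaded ChineseCLIPModel (an assumption here, worth checking against the model's attributes) rather than from a second random layer.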