How is multi-fusion implemented?

MendelXu / SAN

Open-vocabulary Semantic Segmentation

https://mendelxu.github.io/SAN/

MIT License


Closed: APeiZou closed this issue 9 months ago

APeiZou commented 10 months ago

@MendelXu Hello, how is the multi-fusion described in the paper implemented in the code?

APeiZou commented 10 months ago

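@stupidZZ Hello, in the paper the input to the CLIP visual encoder differs from what the pretrained CLIP model expects. How should the number of entries in positional_embedding be changed to match?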

MendelXu commented 10 months ago

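Please take a look at the code. The fusion simply projects the features to the same dimension and adds them, and the position embedding is resized by interpolation.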
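For readers who want a concrete picture of the two operations mentioned in the reply, here is a minimal NumPy sketch. It is not taken from the SAN repository: the function names `fuse_features` and `resize_pos_embed`, the projection matrix `proj_weight`, the corner-aligned bilinear scheme, and the assumption of a square source grid with one class token are all illustrative choices. In a PyTorch codebase the resizing would more likely go through `torch.nn.functional.interpolate`.

```python
import numpy as np


def fuse_features(side_feat, clip_feat, proj_weight):
    """Project CLIP tokens to the side-network width, then add.

    side_feat:   (N, C_side) tokens from the side network
    clip_feat:   (N, C_clip) tokens from a CLIP layer (same token count N)
    proj_weight: (C_clip, C_side) hypothetical linear projection
    """
    return side_feat + clip_feat @ proj_weight


def resize_pos_embed(pos_embed, new_hw, num_prefix_tokens=1):
    """Bilinearly interpolate ViT positional embeddings to a new patch grid.

    pos_embed: (num_prefix_tokens + H*W, C); the source grid is assumed square.
    new_hw:    (new_H, new_W) target grid size in patches.
    """
    # Class (prefix) tokens have no spatial position, so keep them unchanged.
    prefix = pos_embed[:num_prefix_tokens]
    grid = pos_embed[num_prefix_tokens:]
    old = int(round(np.sqrt(grid.shape[0])))
    assert old * old == grid.shape[0], "source grid is assumed square"
    grid = grid.reshape(old, old, -1)

    new_h, new_w = new_hw
    # Target sample positions expressed in source-grid coordinates
    # (corner-aligned: first and last samples hit the grid corners).
    ys = np.linspace(0, old - 1, new_h)
    xs = np.linspace(0, old - 1, new_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, old - 1)
    x1 = np.minimum(x0 + 1, old - 1)
    wy = (ys - y0)[:, None, None]
    wx = (xs - x0)[None, :, None]

    # Blend the four neighbouring embeddings for every target position.
    top = grid[y0][:, x0] * (1 - wx) + grid[y0][:, x1] * wx
    bottom = grid[y1][:, x0] * (1 - wx) + grid[y1][:, x1] * wx
    resized = top * (1 - wy) + bottom * wy
    return np.concatenate([prefix, resized.reshape(new_h * new_w, -1)], axis=0)
```

As a usage example, a positional embedding with one class token and a 14x14 grid (197 rows) resized with `resize_pos_embed(pos_embed, (20, 20))` yields 401 rows, so the embedding count follows the new number of patches while the pretrained weights are reused.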