TencentARC / ViT-Lens

[CVPR 2024] ViT-Lens: Towards Omni-modal Representations
https://ailab-cvc.github.io/seed/vitlens/

Point cloud and text output results are incorrect #6

Closed Royalvice closed 6 months ago

Royalvice commented 6 months ago

Running example.py gives the following result:

PointCould x Text:
tensor([[8.5200e-04, 9.5644e-02, 5.8601e-01, 1.9369e-02, 2.9812e-01],
        [8.8911e-04, 1.7004e-01, 3.2570e-01, 1.1302e-02, 4.9207e-01],
        [2.9327e-04, 6.9276e-02, 4.6433e-01, 1.2254e-02, 4.5384e-01],
        [1.9555e-03, 7.8262e-02, 3.8027e-01, 5.8164e-02, 4.8135e-01],
        [3.0467e-04, 1.0489e-01, 4.9719e-01, 2.1044e-02, 3.7657e-01]], device='cuda:0')

StanLei52 commented 6 months ago

Hi, thanks for pointing this out. We have updated the model config for the released vitlensL. Please pull the latest code and rerun example.py to get the expected result.
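For anyone comparing their own run against the output reported above, here is a minimal sketch of a sanity check. It assumes (this is not stated in the thread) that the i-th point cloud in example.py is paired with the i-th text label, so a correct result should peak on the diagonal. The helper names `row_argmax` and `is_diagonal_dominant` are hypothetical and not part of ViT-Lens; the scores are copied from the reported tensor.

```python
# Scores copied from the output reported in this issue:
# row i = point cloud i, column j = text label j.
scores = [
    [8.5200e-04, 9.5644e-02, 5.8601e-01, 1.9369e-02, 2.9812e-01],
    [8.8911e-04, 1.7004e-01, 3.2570e-01, 1.1302e-02, 4.9207e-01],
    [2.9327e-04, 6.9276e-02, 4.6433e-01, 1.2254e-02, 4.5384e-01],
    [1.9555e-03, 7.8262e-02, 3.8027e-01, 5.8164e-02, 4.8135e-01],
    [3.0467e-04, 1.0489e-01, 4.9719e-01, 2.1044e-02, 3.7657e-01],
]


def row_argmax(matrix):
    """Return, for each row, the column index of the largest score."""
    return [max(range(len(row)), key=row.__getitem__) for row in matrix]


def is_diagonal_dominant(matrix):
    """True if every row i peaks at column i.

    Assumption: input i is paired with label i (hypothetical pairing).
    """
    return all(best == i for i, best in enumerate(row_argmax(matrix)))


print("argmax per row:", row_argmax(scores))  # [2, 4, 2, 4, 2]
print("matches assumed pairing:", is_diagonal_dominant(scores))  # False
```

Under that assumption, the reported matrix never peaks on the diagonal, which is consistent with the misconfigured model described in this issue. To check a rerun, paste the new scores into `scores`.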