-
Hi JUNJIE. In "train.bash," I found that you locked the text tower and only trained the vision tower. The weights of the text tower (BGE) are already pre-trained (BAAI/bge-base-en-v1.5), so during the…
-
Can't pickle local object 'load_mosi.<locals>.MOSI'
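For context, Python raises this kind of error when `pickle` is asked to serialize an instance of a class that was defined inside a function; this can surface, for example, when objects are sent to worker processes. The sketch below reproduces the behaviour in isolation. The names `load_mosi_local`, `load_mosi_module_level`, and `try_pickle` are hypothetical and are not taken from the project's code; moving the class to module level is one common workaround, not necessarily the fix the maintainers would choose.

```python
import pickle


class MOSI:
    """Module-level stand-in for a dataset class (hypothetical, not the project's code)."""

    def __init__(self, samples):
        self.samples = samples


def load_mosi_local():
    # A class defined inside a function gets the qualified name
    # 'load_mosi_local.<locals>.MOSI', which pickle cannot resolve by import.
    class MOSI:
        def __init__(self, samples):
            self.samples = samples

    return MOSI([1, 2, 3])


def load_mosi_module_level():
    # Uses the module-level class, which pickle can locate by name.
    return MOSI([1, 2, 3])


def try_pickle(obj):
    """Return 'ok' if obj can be pickled, otherwise a short failure message."""
    try:
        pickle.dumps(obj)
        return "ok"
    except (pickle.PicklingError, AttributeError) as exc:
        return f"failed: {exc}"


if __name__ == "__main__":
    print(try_pickle(load_mosi_local()))
    print(try_pickle(load_mosi_module_level()))
```

If the dataset class really is defined inside `load_mosi`, defining it at the top level of the module (or avoiding worker processes that need to pickle it) would be the usual things to try.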
-
The results seem very different from those in the paper. How can I reproduce the AV-MNIST results?
Here is my input:
python main.py --data_root '/home/tankedge/newdisk/xjh/avmnist' --device cuda:0 --met…
-
Thank you very much for your colourful work. May I ask whether you could share the datasets used in your paper? I look forward to your reply.
-
2.5D Architecture --> patch extraction performed from three anatomical planes.
Currently, we are only using patches extracted from the axial plane, which is optimal for axial data (e.g. T2-ax, T2st…
-
Hi,
I noticed that you generate instance masks for KITTI using MaskFormer pretrained on coco-panoptic. I am wondering whether the domain gap between COCO and KITTI will lead to unsatisfactory in…
-
Knowledge-Guided Dynamic Modality Attention Fusion Framework for Multimodal Sentiment Analysis
Xinyu Feng, Yuming Lin, Lihua He, You Li, Liang Chang, Ya Zhou
-
Hi, thanks for this amazing project. I was trying to fine-tune the LoRA model for Llama3.2 Vision, which works fine and saved an adapter_0.pt. Then I wanted to use this adapter checkpoint for inference i…
-
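Does this project not provide the code for modality fusion (image and text), and only provide the fine-tuning code? I could not find any network that processes text, or any loading of text data, in the code.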
-
**Bug Report Checklist**
- [x] I have provided code that demonstrates a minimal reproducible example.
- [x] I have confirmed the bug exists on the latest mainline of AutoGluon via a source install…