jinhong-ni / DEQFusion

PyTorch Implementation of Deep Equilibrium Multimodal Fusion

Which parts of the code should be modified when the inputs are 3-dimensional feature tensors from two modalities? #2

Closed Jinzeyuu closed 5 months ago

Jinzeyuu commented 5 months ago

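Sorry to bother you again. Following your suggestion, I modified the feature_fusion module accordingly and adapted my own solver from the CMU-MOSI solver.py, but the results are unsatisfactory. Could you tell me in detail which parts need to be changed? My call is fused_features, jacobian_loss, trace = self.deq_fusion(x_visual, x_audio), where x_visual and x_audio are both feature tensors of shape (4, 128, 15).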

jinhong-ni commented 5 months ago

You shouldn't need to modify solver.py as this file mostly contains algorithms for finding fixed points. Please try to adjust the function featureFusion according to CMU-MOSI experiments for 3-dimensional input features.
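To illustrate why the solver should not care about the input rank, here is a minimal sketch of a fixed-point solver. It is plain forward iteration, not the Anderson or Broyden code in solver.py, and the names naive_fixed_point, threshold, and eps are assumptions made for this example. The point is only that a solver of this kind flattens its input to (batch, -1), so 2-D and 3-D features look the same to it.

```python
import torch


def naive_fixed_point(f, z0, threshold=50, eps=1e-5):
    """Iterate z <- f(z) until the relative residual drops below eps.

    Hypothetical stand-in for a real solver; the input is flattened to
    (batch, -1), so the original feature rank does not matter here.
    """
    bsz = z0.shape[0]
    z = z0.reshape(bsz, -1)
    n_steps = 0
    for _ in range(threshold):
        z_next = f(z)
        n_steps += 1
        residual = (z_next - z).norm() / (z_next.norm() + 1e-9)
        z = z_next
        if residual < eps:
            break
    return z, n_steps


torch.manual_seed(0)
x = torch.randn(4, 128, 15)        # same shape as the features in this issue
x_flat = x.reshape(4, -1)

# Toy contraction with a known fixed point: z* = 0.5 * z* + x  =>  z* = 2x
f = lambda z: 0.5 * z + x_flat

z_star, n_steps = naive_fixed_point(f, torch.zeros_like(x))
z_star = z_star.reshape_as(x)      # restore the 3-D shape after solving
```

Any shape handling therefore belongs in the function that builds f and reshapes the result, which is the role featureFusion plays in this thread.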

Jinzeyuu commented 5 months ago

In that case, should I use DEQ_fusion.py, solver.py, and jacobian.py from the main folder or the ones in the cmu-mosi folder? Some of their contents differ.

jinhong-ni commented 5 months ago

Modify DEQ_fusion.py according to https://github.com/jinhong-ni/DEQFusion/blob/main/experiments/CMU-MOSI/model.py

Jinzeyuu commented 5 months ago

Following your suggestion, I modified SmallBlock, DEQFusionBlock, and DEQEQFusionModule based on model.py, but I still get the following error:

  File "E:\emotion recognition\√multimodal-emotion-recognition-main\DEQ_fusion.py", line 196, in forward
    fused_feat, jacobian_loss, trace = self.featureFusion(x_visual, x_audio, fusion_feature)
  File "E:\emotion recognition\√multimodal-emotion-recognition-main\DEQ_fusion.py", line 167, in featureFusion
    result = self.f_solver(func, z1, threshold=self.f_thres, stop_mode=self.stop_mode)
  File "E:\emotion recognition\√multimodal-emotion-recognition-main\solver.py", line 259, in anderson
    X[:,0], F[:,0] = x0.reshape(bsz, -1), f(x0).reshape(bsz, -1)
  File "E:\emotion recognition\√multimodal-emotion-recognition-main\DEQfusion.py", line 149, in <lambda>
    func = lambda z: list2vec(self.func(vec2list(z, cutoffs), x_list))
  File "D:\Anaconda3\envs\emotion\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\emotion recognition\√multimodal-emotion-recognition-main\DEQ_fusion.py", line 110, in forward
    out = self.branches[i](x[i], inject_features[i])
  File "D:\Anaconda3\envs\emotion\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\emotion recognition\√multimodal-emotion-recognition-main\DEQ_fusion.py", line 52, in forward
    out = self.conv1(x) + injection_feature
RuntimeError: The size of tensor a (128) must match the size of tensor b (15) at non-singleton dimension 2

When I inspected the tensors, x.shape was (128, 128, 128) and injection_feature.shape was (4, 128, 15). I don't know where the problem is. If it is convenient for you, could you help me resolve this?

jinhong-ni commented 5 months ago

It is hard to tell based on the error log. Ideally, x should have the same shape as injection_feature.
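As a rough way to narrow this down, the sketch below reproduces the mismatch with a toy branch and adds a shape check before the addition. ToyBranch and its Conv1d settings are assumptions for illustration and are not the repository's block; only the shapes (4, 128, 15) and (128, 128, 128) come from the report above. It does not diagnose the cause, but it shows that initializing the state with the same shape as the injected feature avoids the error.

```python
import torch
import torch.nn as nn


class ToyBranch(nn.Module):
    """Hypothetical branch: conv over the state plus the injected feature."""

    def __init__(self, channels=128):
        super().__init__()
        # padding=1 with kernel_size=3 keeps the sequence length unchanged
        self.conv1 = nn.Conv1d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x, injection_feature):
        # Fail early with a readable message instead of a broadcasting error
        assert x.shape == injection_feature.shape, (
            f"state {tuple(x.shape)} != injection {tuple(injection_feature.shape)}"
        )
        return self.conv1(x) + injection_feature


torch.manual_seed(0)
branch = ToyBranch()
injection = torch.randn(4, 128, 15)    # shape reported for injection_feature

# State initialized to match the injection: the addition is well defined
z_ok = torch.zeros_like(injection)
out = branch(z_ok, injection)

# State with the reported shape (128, 128, 128): the raw addition fails
z_bad = torch.zeros(128, 128, 128)
try:
    branch.conv1(z_bad) + injection
    mismatch_error = None
except RuntimeError as err:
    mismatch_error = str(err)
```

Since the state passed to the branches is rebuilt by vec2list(z, cutoffs) in the traceback, it may be worth printing the shapes of z1 and the entries of cutoffs inside featureFusion to see where (128, 128, 128) first appears.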