How do I apply the DEQ fusion method to my own network architecture? #1

Jinzeyuu commented 5 months ago

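Hello, I had the pleasure of reading your paper. DEQ multimodal fusion is currently a fairly popular dynamic modeling approach. Your repository does not describe in detail how to apply the DEQ fusion method to one's own network architecture. Could you add this? I added the three .py files to my own network and fed two types of features into it, but I got a feature dimension mismatch error, and I don't know how to fix it. Is there anything else in my network structure that needs to be set up in advance to match? If you have time, could you walk through it? My contact: 1965782319@qq.com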

jinhong-ni commented 5 months ago

Hi there, thanks for your interest in our work.

Generally speaking, to deploy the DEQ fusion module in your framework, first copy the three files DEQ_fusion.py, solver.py, and jacobian.py into your repo. You can then instantiate DEQFusion as your fusion module and call it on the unimodal features once you have extracted them. If the feature dimensions differ across modalities, a simple workaround is to apply an MLP (or a single linear layer) so that they all have the same feature dimension.

More specifically, here is a short pseudocode example for illustration purposes:

import torch.nn as nn
from DEQ_fusion import DEQFusion

class Model(nn.Module):
  def __init__(self):
    super().__init__()
    self.modal1_encoder = Encoder1()  # placeholder encoder; assume it produces features of shape (B, 256)
    self.modal2_encoder = Encoder2()  # placeholder encoder; assume it produces features of shape (B, 512)
    self.feature_proj = nn.Linear(256, 512)  # projects modality 1 to the shared dimension
    self.fusion = DEQFusion(512, 2)  # 512 is the feature dimension and 2 is the number of modalities used
    ...  # define other modules

  def forward(self, x1, x2):
    f1, f2 = self.modal1_encoder(x1), self.modal2_encoder(x2)
    f1 = self.feature_proj(f1)  # (B, 256) -> (B, 512)
    fused = self.fusion([f1, f2])
    ...  # perform other operations
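
The dimension-alignment step can also be checked in isolation. Below is a minimal sketch that runs without the repository files. It assumes stand-in linear encoders and a placeholder `MeanFusion` module (both hypothetical, `MeanFusion` is not DEQFusion and does not reproduce its behavior). It only shows that projecting the 256-dimensional feature to 512 gives both modalities the same (B, 512) shape before fusion.

```python
import torch
import torch.nn as nn


class MeanFusion(nn.Module):
    """Hypothetical stand-in for a fusion module: averages same-shaped features."""

    def forward(self, features):
        # All features must share one shape, here (B, C).
        return torch.stack(features, dim=0).mean(dim=0)


class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Stand-in encoders; real encoders would be modality-specific networks.
        self.modal1_encoder = nn.Linear(10, 256)  # -> (B, 256)
        self.modal2_encoder = nn.Linear(20, 512)  # -> (B, 512)
        self.feature_proj = nn.Linear(256, 512)   # align modality 1 to 512
        self.fusion = MeanFusion()                # swap in the real fusion module here

    def forward(self, x1, x2):
        f1 = self.feature_proj(self.modal1_encoder(x1))  # (B, 512)
        f2 = self.modal2_encoder(x2)                     # (B, 512)
        return self.fusion([f1, f2])                     # (B, 512)


model = ToyModel()
fused = model(torch.randn(4, 10), torch.randn(4, 20))
print(tuple(fused.shape))
```

If the two features reached the fusion step with different last dimensions, `torch.stack` in this stand-in would raise an error, which is the same class of mismatch the projection layer is meant to prevent.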

Hope this simple code snippet helps! Apologies, I do not have a Chinese keyboard on my current device, so I am writing this response in English. Please let me know if you have further questions.

Jinzeyuu commented 5 months ago

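Thank you very much for your reply. The features of my two modalities have shapes (4, 128, 15) and (4, 128, 144). The first dimension is the batch size, the second is the feature dimension, and the third is the number of features. Following your instructions, I mapped the number of features from 15 to 144 so that the two features match, and set channel_dim to 128, but I get RuntimeError: mat1 and mat2 shapes cannot be multiplied (64x128 and 144x144). I don't quite understand what went wrong. Does the module only accept two-dimensional feature tensors as input?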

jinhong-ni commented 5 months ago

I see what's going on. The module specified in https://github.com/jinhong-ni/DEQFusion/blob/main/DEQ_fusion.py assumes you have features of shape (B, C). In your case, you do not need to align the features, as the channel dimension already matches between the two modalities. From what I recall, the modalities for CMU-MOSI have feature dimensions similar to what you have specified. Please compare https://github.com/jinhong-ni/DEQFusion/blob/main/experiments/CMU-MOSI/model.py#L724 with https://github.com/jinhong-ni/DEQFusion/blob/main/DEQ_fusion.py#L139. Modifying the function featureFusion accordingly should work out of the box!
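
As a rough illustration of why such shapes clash: `nn.Linear` always acts on the last dimension of its input, so a layer built for C = 128 channels fails on a tensor of shape (B, 128, 144), whose last dimension is 144. The sketch below is only an assumption about how three-dimensional features could be adapted, and it is not the code from the repository. It either averages over the last dimension to obtain (B, C) features, or moves the channel dimension last so a linear layer can act on it. The name `channel_layer` is hypothetical.

```python
import torch
import torch.nn as nn

B, C = 4, 128
f1 = torch.randn(B, C, 15)   # modality 1: (batch, channels, number of features)
f2 = torch.randn(B, C, 144)  # modality 2

channel_layer = nn.Linear(C, C)  # hypothetical layer expecting C as the last dim

# Applying it directly fails: the last dimension is 144, not 128.
try:
    channel_layer(f2)
    direct_call_failed = False
except RuntimeError:
    direct_call_failed = True

# Option 1: average over the last dimension to get (B, C) features.
p1, p2 = f1.mean(dim=-1), f2.mean(dim=-1)           # both (4, 128)
out_pooled = channel_layer(p1) + channel_layer(p2)  # (4, 128)

# Option 2: keep every position, but move channels last before the layer.
t2 = f2.transpose(1, 2)                      # (4, 144, 128)
out_seq = channel_layer(t2).transpose(1, 2)  # back to (4, 128, 144)

print(direct_call_failed, tuple(out_pooled.shape), tuple(out_seq.shape))
```

Option 1 discards the per-position structure, while Option 2 keeps it. Which one is appropriate depends on what the downstream layers expect.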

Jinzeyuu commented 5 months ago

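OK, thank you very much for your reply. I will look carefully into where the problem lies, and if I run into other questions later, I will come back and ask!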
