channel swapping and Cross Modal Mamba question

stanny880913 commented 4 months ago

Hello, where is the code for channel swapping located? I can't quite find it. Thanks.

Also, regarding Cross Modal Mamba, I see the following in mamba_module.py:

self.norm1 = LayerNorm(dim,'with_bias')
self.norm2 = LayerNorm(dim,'with_bias')

Do these refer to the features of the two modalities? Thank you.

alexhe101 commented 4 months ago

channel swapping mamba: see line 105 of the code.
Those are the LayerNorms applied to the features of the two modalities.

stanny880913 commented 4 months ago

OK, thank you.

Understood, but I see that your dim values are all the same. If my dims differ, can that still be handled? Also, if my features don't include an output like residual_pan_f, can I still use channel swapping, or can I simply ignore that part? Thanks.

alexhe101 commented 4 months ago

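If the dims differ, you can project them to the same dimension with an MLP or a 1x1 conv and then fuse them.
If you have no residual, you can ignore it; that is just a way of writing the block with the residual passed in up front.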

stanny880913 commented 4 months ago

Could you confirm that your dim is the number of feature channels? Thanks.

alexhe101 commented 4 months ago

Yes, dim is the number of feature channels.

stanny880913 commented 4 months ago

OK, thank you very much. One more question. My environment is: Python 3.7.16, torch 1.9, CUDA 11.1. Can mamba_ssm be installed with these versions? I haven't managed to install it. Thank you.

alexhe101 commented 4 months ago

I'd suggest upgrading your CUDA and PyTorch versions.

stanny880913 commented 4 months ago

OK, no problem, I'll try upgrading!

In your earlier reply, "project them to the same dimension with an MLP or a 1*1 conv and then fuse them", what does "mlp/1*1 conv" mean? Thanks.

alexhe101 commented 4 months ago

A fully connected layer or a 1x1 convolution, used to adjust the channel dimension.
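To make the "fully connected layer or 1x1 convolution" point concrete, here is a minimal sketch showing that the two compute the same per-pixel channel mixing, just on different tensor layouts. The sizes (256 input channels, 64 output channels, an 8x10 map) are made up for illustration and are not taken from the Pan-Mamba code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes: one modality has 256 channels, the target width is 64.
in_ch, out_ch = 256, 64
x = torch.randn(1, in_ch, 8, 10)                 # (b, c, h, w) feature map

conv = nn.Conv2d(in_ch, out_ch, kernel_size=1)   # 1x1 conv: mixes channels only
fc = nn.Linear(in_ch, out_ch)                    # fully connected layer on the channel axis

# Copy the conv weights into the linear layer so both apply the same map.
with torch.no_grad():
    fc.weight.copy_(conv.weight.view(out_ch, in_ch))
    fc.bias.copy_(conv.bias)

y_conv = conv(x)                                 # (1, 64, 8, 10)
tokens = x.flatten(2).transpose(1, 2)            # (1, 80, 256), i.e. (b, h*w, c)
y_fc = fc(tokens)                                # (1, 80, 64)

# Bring the conv output into token layout and compare the two results.
y_conv_tokens = y_conv.flatten(2).transpose(1, 2)
same = torch.allclose(y_conv_tokens, y_fc, atol=1e-4)
```

Which one is more convenient depends on the layout at hand: a 1x1 conv for (b, c, h, w) feature maps, a linear layer once the features are already tokens of shape (b, n, c).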

stanny880913 commented 4 months ago

    x_img_down_sample_resize = []
    desired_channels = [64, 128, 256, 512]
    for i, x_img_i in enumerate(x_img):
        # 1x1 conv mapping this level's channel count to the desired one
        conv = nn.Conv2d(x_img_i.shape[1], desired_channels[i], kernel_size=1).to(device)
        output_tensor = conv(x_img_i)
        x_img_down_sample_resize.append(output_tensor)

This is how I reduce the dimension. One of my modalities has 256 channels and the other has 64, so I plan to reduce the 256 down to 64. Is this approach reasonable? Thanks.

alexhe101 commented 4 months ago

That is reasonable.
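One thing to watch with the snippet above: if that loop runs inside forward, each call builds a new nn.Conv2d with fresh random weights that the optimizer never sees. A possible alternative, sketched here under the assumption that the input channel counts are known in advance, is to create the 1x1 convs once in __init__. The class name ChannelProjector and the input channel list are hypothetical.

```python
import torch
import torch.nn as nn


class ChannelProjector(nn.Module):
    """Hypothetical helper: one 1x1 conv per feature level, built once."""

    def __init__(self, in_channels, desired_channels):
        super().__init__()
        # ModuleList registers the convs, so .to(device) and the optimizer see them.
        self.projs = nn.ModuleList(
            [
                nn.Conv2d(c_in, c_out, kernel_size=1)
                for c_in, c_out in zip(in_channels, desired_channels)
            ]
        )

    def forward(self, feats):
        # Project each feature level to its desired channel count.
        return [proj(f) for proj, f in zip(self.projs, feats)]


# Made-up multi-scale features; only desired_channels comes from the thread.
in_channels = [256, 256, 512, 1024]
desired_channels = [64, 128, 256, 512]
feats = [
    torch.randn(1, c, 16 // (2 ** i), 16 // (2 ** i))
    for i, c in enumerate(in_channels)
]

projector = ChannelProjector(in_channels, desired_channels)
outs = projector(feats)
out_shapes = [tuple(o.shape) for o in outs]
```

With this layout, projector.to(device) moves all the convs at once and their weights are trained together with the rest of the model.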

stanny880913 commented 4 months ago

Hello, another question. In mamba_simple.py, in the forward method of class Mamba, I see the line batch, seqlen, dim = hidden_states.shape. What does seqlen represent here? My input is [1, 64, 232, 400], meaning [batchsize, channel, height, width]. Which of these does seqlen correspond to? Thanks.

alexhe101 commented 4 months ago

In my task, seqlen comes from reshaping (b, c, h, w) into (b, hw, c), so seqlen is h*w.
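As a minimal sketch of that layout change, using the [1, 64, 232, 400] input from the question: the calls below are plain PyTorch and are an assumption about how one might do it, not a copy of the repository code. Note that a bare .reshape does not reorder axes, so the sketch flattens the spatial dimensions and then transposes.

```python
import torch

b, c, h, w = 1, 64, 232, 400
x = torch.randn(b, c, h, w)

# (b, c, h, w) -> (b, h*w, c): flatten the spatial dims, then move channels last.
tokens = x.flatten(2).transpose(1, 2)
# tokens has shape (1, 92800, 64), so seqlen = h * w = 92800 and dim = c = 64.

# Inverse: (b, h*w, c) -> (b, c, h, w).
restored = tokens.transpose(1, 2).reshape(b, c, h, w)

# A bare reshape without the transpose keeps the element count but scrambles
# which values end up in which channel.
wrong = tokens.reshape(b, c, h, w)
```

Here restored matches x exactly, while wrong has the right shape but the wrong contents, which is easy to miss because no error is raised.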

stanny880913 commented 4 months ago

Understood. However, at ms_swap = self.msencoder(ms_swap) in TokenSwapMamba I get TypeError: 'NoneType' object is not callable, raised in this piece of code:

else:
                out = mamba_inner_fn(
                    xz,
                    self.conv1d.weight,
                    self.conv1d.bias,
                    self.x_proj.weight,
                    self.dt_proj.weight,
                    self.out_proj.weight,
                    self.out_proj.bias,
                    A,
                    None,  # input-dependent B
                    None,  # input-dependent C
                    self.D.float(),
                    delta_bias=self.dt_proj.bias.float(),
                    delta_softplus=True,
                )

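I end up in this else branch because of bimamba_type=None in self.msencoder = Mamba(dim,bimamba_type=None). How can I fix this?

I'd also like to ask whether your parameters are all on the CPU when you run TokenSwapMamba. When I run it, at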

 if extra_emb is None:
            # We do matmul and transpose BLH -> HBL at the same time
            xz = rearrange(
                self.in_proj.weight @ rearrange(hidden_states, "b l d -> d (b l)"),
                "d (b l) -> b d l",
                l=seqlen,
            )

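I get an error that the tensors are not on the same device: one is on cpu and the other on cuda:0. My workaround is to move hidden_states to the CPU as well. Will that work for the later steps? Thanks.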

alexhe101 commented 4 months ago

In self.msencoder = Mamba(dim,bimamba_type=None), change None to 'v1'.
The parameters should all be on the GPU.

stanny880913 commented 4 months ago

But the self.in_proj.weight parameter is on the CPU. How can I change that?

alexhe101 commented 4 months ago

self.msencoder.cuda()
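More generally, the fix is to move the module to the input's device rather than moving the input back to the CPU. The sketch below uses a hypothetical TinyEncoder as a stand-in for the real Mamba block, and falls back to the CPU when no GPU is present so that it runs anywhere. The param_devices helper is also made up for this example.

```python
import torch
import torch.nn as nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for a block that owns an in_proj layer."""

    def __init__(self, dim):
        super().__init__()
        self.in_proj = nn.Linear(dim, 2 * dim, bias=False)

    def forward(self, hidden_states):
        return self.in_proj(hidden_states)


def param_devices(module):
    """Return the set of device types the module's parameters live on."""
    return {p.device.type for p in module.parameters()}


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

encoder = TinyEncoder(dim=64)      # parameters are created on the CPU
encoder = encoder.to(device)       # moves every registered parameter
hidden_states = torch.randn(1, 100, 64, device=device)

devices = param_devices(encoder)   # a quick check before calling forward
out = encoder(hidden_states)
```

If param_devices reports more than one device type, some submodule was created after the parent was moved, or is not registered on the parent (for example, stored in a plain Python list instead of nn.ModuleList).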

stanny880913 commented 4 months ago

Thanks, I'll give it a try!

stanny880913 commented 4 months ago

Hello, my feature shape was originally [1, 64, 232, 400], and after channel swapping it becomes [1, 92800, 64]. Is this normal? Thanks.

stanny880913 commented 4 months ago

Hello, my feature shape was originally [1, 64, 232, 400], and after channel swapping it becomes [1, 92800, 64]. What I do is simply reshape it back to my original size, [1, 64, 232, 400]. Is that OK? After doing so, execution reaches pan = self.norm2(pan) in CrossMamba, which goes into class LayerNorm and then into class WithBias_LayerNorm at return (x - mu) / torch.sqrt(sigma+1e-5) * self.weight + self.bias, where I get RuntimeError: The size of tensor a (512) must match the size of tensor b (29) at non-singleton dimension 2. How can I solve this? Thank you.

alexhe101 commented 4 months ago

Hello, when going into CrossMamba, the input needs to be reshaped to (b, n, c).
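The mismatch above is what one would expect when a layer norm built for dim channels receives a tensor whose last axis is not the channel axis. A minimal sketch, using torch.nn.LayerNorm as a stand-in for the repository's WithBias_LayerNorm (an assumption, since both normalize over the last axis with a weight of size dim) and a small made-up 8x10 feature map:

```python
import torch
import torch.nn as nn

dim = 64
norm = nn.LayerNorm(dim)    # stand-in: weight and bias have size dim, applied on the last axis

feat = torch.randn(1, dim, 8, 10)           # (b, c, h, w): last axis is w = 10
tokens = feat.flatten(2).transpose(1, 2)    # (b, n, c): last axis is c = 64

ok = norm(tokens)           # works, the last axis matches dim

try:
    norm(feat)              # last axis is 10, not 64
    failed = False
except RuntimeError:
    failed = True
```

So the (b, n, c) tokens coming out of channel swapping can go straight into CrossMamba; reshaping them back to (b, c, h, w) first is what triggers the error.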

stanny880913 commented 4 months ago

Do (b, n, c) stand for batch, seqlen, and channel, respectively?

alexhe101 commented 4 months ago

Yes.

stanny880913 commented 4 months ago

Understood. So if I call it as mamba_block = CrossMamba(input_channels).to(device), should my input_channels correspond to the channel in (batch, seqlen, channel)?

alexhe101 commented 4 months ago

Yes.

alexhe101 / Pan-Mamba

channel swapping and Cross Modal Mamba question #8