HHHedo / IBMIL

CVPR 2023 Highlight

Error in NystromAttention #15

Closed bryanwong17 closed 5 months ago

bryanwong17 commented 6 months ago
import torch.nn as nn
from nystrom_attention import NystromAttention

class TransLayer(nn.Module):

    def __init__(self, norm_layer=nn.LayerNorm, dim=512):
        super().__init__()
        self.norm = norm_layer(dim)
        self.attn = NystromAttention(
            dim=dim,
            dim_head=dim // 8,
            heads=8,
            num_landmarks=dim // 2,  # number of landmarks
            pinv_iterations=6,       # Moore-Penrose iterations for approximating the pseudoinverse; 6 recommended by the paper
            residual=True,           # extra residual with the value; supposedly faster convergence if turned on
            dropout=0.1,
        )

    def forward(self, x):
        out = self.attn(self.norm(x))  # compute attention once instead of twice
        print(out.shape)               # debug: expected [1, N, 512]
        x = x + out
        return x
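For reference, a minimal call that triggers the error with my installed nystrom-attention version (a sketch; shapes follow this repo, [1, N, 512] bags of patch features):

import torch

layer = TransLayer(dim=512)
x = torch.randn(1, 1000, 512)   # [1, N, 512]: a bag of N patch features
out = layer(x)                  # raises during forward

which fails with: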

RuntimeError: Output 0 of PermuteBackward0 is a view and is being modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.

Hi, I was wondering if you have ever encountered this issue. If so, how did you solve it? Thank you!

HHHedo commented 6 months ago

Hi, I did not encounter this issue. I remember the version of NystromAttention is 0.0.9; maybe you can check whether it is caused by the version.

Best,
Tiancheng


bryanwong17 commented 6 months ago

I did install version 0.0.9. Although I made some changes to your preprocessing code to improve its speed (though I don't believe that is the cause of the issue), I just want to double-check: is the shape of x [1, N, 512]?

HHHedo commented 6 months ago

The shape of x is [1, N, 512] for ResNet18 and [1, N, 768] for CTransPath.


bryanwong17 commented 6 months ago

I managed to solve the problem by manually changing this line of code inside the package

q *= self.scale

to

q = q * self.scale
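For anyone hitting the same error: autograd forbids in-place modification of a view that derives from a multi-view op such as chunk, which is what q *= self.scale does when q comes from a chunk/rearrange chain. A standalone sketch (plain PyTorch, not from the package) reproducing the same class of error:

import torch

x = torch.randn(1, 8, 12, requires_grad=True)
q, k, v = x.chunk(3, dim=-1)    # chunk returns multiple views of x
q = q.permute(0, 2, 1)          # permute adds another view (PermuteBackward0)

# q *= 0.125                    # in-place: raises the RuntimeError above
q = q * 0.125                   # out-of-place: creates a new tensor, works fine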

bryanwong17 commented 6 months ago

Hi @HHHedo, what is the shape of x for ViT small with MoCoV3? When I printed it out, I got [1, N, 256]. Shouldn't it be 384 instead of 256?

HHHedo commented 6 months ago

> Hi @HHHedo, what is the shape of x for ViT small with MoCoV3? When I printed it out, I got [1, N, 256]. Shouldn't it be 384 instead of 256?

I have the same question; please see here.

bryanwong17 commented 6 months ago

Hi @HHHedo, could you explain more about "both the head and predictor should be nn.Identity()"? Which parts of the code did you change? When I added these lines of code

feature_extractor.head = nn.Identity()
feature_extractor.predictor = nn.Identity()

there was an error when loading the pretrained ViT small MoCoV3 weights.

HHHedo commented 5 months ago

Maybe you can try

feature_extractor.base_encoder.head = nn.Identity()
feature_extractor.predictor = nn.Identity()

Please print the 'state_dict' of the pretrained weights, which is a dictionary. I believe these problems can be easily solved by debugging.
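A sketch of that debugging step, assuming feature_extractor is the MoCo v3 wrapper model from this thread and using a hypothetical checkpoint path (the official MoCo v3 release stores weights under 'state_dict' with a DistributedDataParallel 'module.' prefix; verify against your own file):

import torch
import torch.nn as nn

ckpt = torch.load("vit-s-300ep.pth.tar", map_location="cpu")  # hypothetical path
state_dict = ckpt.get("state_dict", ckpt)
print(list(state_dict.keys())[:10])         # inspect the actual key names first

# Strip the DDP 'module.' prefix so the keys match a single-GPU model
state_dict = {k.replace("module.", "", 1): v for k, v in state_dict.items()}
missing, unexpected = feature_extractor.load_state_dict(state_dict, strict=False)
print("missing:", missing, "unexpected:", unexpected)

# Replace the contrastive heads so forward() returns raw 384-d ViT-Small features
feature_extractor.base_encoder.head = nn.Identity()
feature_extractor.predictor = nn.Identity()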

bryanwong17 commented 5 months ago

Thank you for your help!