RuntimeError: size mismatch, m1: [1 x 588], m2: [12 x 10] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:268

vainaixr commented 5 years ago

Hello, does FlashTorch work on an ensemble model? I passed the output of one neural network to another and got the error above.

I used g_ascent.visualize(model.encoder[0], title='conv');

MisaOgura commented 5 years ago

Hi @vainaijr,

I would need more information in order to diagnose this, e.g. the model architecture and how you are instantiating the GradientAscent object.

Please also provide any other information you think would be useful.

Many thanks, Misa

vainaixr commented 5 years ago

Hello, my model looks like this:

import torch
import torch.nn as nn

def conv_block(in_channels, out_channels, k):
    return nn.Sequential(
        nn.Conv2d(in_channels, in_channels, k, padding=0),
        nn.BatchNorm2d(in_channels),
        nn.ReLU(),
        nn.MaxPool2d(2)
    )

class Top(nn.Module):
  def __init__(self):
    super().__init__()
    self.encoder = conv_block(3, 3, 1)
    self.lin = nn.Linear(20, 10)
    self.childone = Second()
    self.childtwo = Second()
  def forward(self, x):
    # set_trace()
    a, b = self.childone(self.encoder(x)), self.childtwo(self.encoder(x))
    # print('top', a.shape, b.shape)
    out = torch.cat((a, b), dim=-1)
    return self.lin(out) 

class Second(nn.Module):
  def __init__(self):
    super().__init__()
    self.encoder = conv_block(3, 3, 1)
    self.lin = nn.Linear(20, 10)
    self.childone = Middle()
    self.childtwo = Middle()

  def forward(self, x):
    a, b = self.childone(self.encoder(x)), self.childtwo(self.encoder(x))
    # print('middle', a.shape, b.shape)
    out = torch.cat((a, b), dim=-1)
    return self.lin(out)

class Middle(nn.Module):
  def __init__(self):
    super().__init__()
    self.encoder = conv_block(3, 3, 1)
    self.lin = nn.Linear(20, 10)
    self.childone = Bottom()
    self.childtwo = Bottom()

  def forward(self, x):
    a, b = self.childone(self.encoder(x)), self.childtwo(self.encoder(x))
    # print('middle', a.shape, b.shape)
    out = torch.cat((a, b), dim=-1)
    return self.lin(out)

class Bottom(nn.Module):
  def __init__(self):
    super().__init__()
    self.encoder = conv_block(3, 3, 1)
    self.lin_one = nn.Linear(12, 10)
  def forward(self, x):
    # print('bottom', x.shape)
    out = self.encoder(x)
    return (self.lin_one(out.view(out.size(0), -1)))

model = Top()
model.to('cuda')
from flashtorch.activmax import GradientAscent
g_ascent = GradientAscent(model)
g_ascent.use_gpu = True

g_ascent.visualize(model.childtwo.childtwo.childtwo.encoder[0], title='conv');

I pass images from the top neural network to the second one, then to the middle, then to the bottom, and get 10 probabilities back, which I pass up to the top.

I have to use FlashTorch to visualize what each neural network learns. This is different from typical deep learning, where we have only one model: here I use multiple encoders and decoders, and pass output top to bottom or bottom to top.
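The 588 in the error message can be reproduced by hand from the architecture above. The following is only a sketch, assuming the input is a 3-channel square image of side 224 (the default img_size mentioned later in the thread) and that it passes through four conv_block stages (Top, Second, Middle, Bottom) before reaching Bottom.lin_one. The helper names are made up for illustration and are not part of FlashTorch or PyTorch.

```python
def conv_block_out_size(size, kernel=1, padding=0, pool=2):
    """Spatial size after a stride-1 Conv2d followed by MaxPool2d(pool)."""
    after_conv = size + 2 * padding - kernel + 1  # a 1x1 conv keeps the size
    return after_conv // pool  # MaxPool2d rounds down by default


def flattened_features(img_size, channels=3, num_blocks=4):
    """Number of features reaching Bottom.lin_one (hypothetical helper)."""
    size = img_size
    for _ in range(num_blocks):
        size = conv_block_out_size(size)
    return channels * size * size


print(flattened_features(224))  # 588
```

With a 224-pixel input the side shrinks 224, 112, 56, 28, 14, so the flattened tensor has 3 * 14 * 14 = 588 features, while nn.Linear(12, 10) expects 12. That matches m1: [1 x 588], m2: [12 x 10].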

MisaOgura commented 5 years ago

Thanks @vainaijr,

GradientAscent is fairly architecture-agnostic.

For certain types of architecture, especially if the linear layers are interwoven and hence can't be separated, you might have to set img_size to the input size the model expects. The default is img_size=224.

You can do so by passing it in when instantiating the object or by reassigning the attribute.

I.e. (where int stands for the integer input size your model expects)

g_ascent = GradientAscent(model, img_size=int)

Or

g_ascent.img_size = int
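To pick a concrete value for the model posted above, one can search for input sizes whose flattened output matches the 12 features that Bottom.lin_one expects. This is a rough sketch under the same assumptions as the architecture in this thread (3 channels, four stacked blocks of a 1x1 convolution plus MaxPool2d(2)); sizes_matching is a hypothetical helper, not a FlashTorch function.

```python
def sizes_matching(in_features, channels=3, num_blocks=4, max_size=256):
    """Return the square input sizes whose flattened output has in_features values."""
    matches = []
    for img_size in range(1, max_size + 1):
        side = img_size
        for _ in range(num_blocks):
            side //= 2  # 1x1 conv keeps the size; MaxPool2d(2) halves it, rounding down
        if channels * side * side == in_features:
            matches.append(img_size)
    return matches


candidates = sizes_matching(12)
print(candidates[0], candidates[-1])  # 32 47
```

Under these assumptions 32 is the smallest size that fits, so GradientAscent(model, img_size=32) would be a natural value to try. The thread does not state which value was actually used.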

Let me know how it goes.

Many thanks, Misa

vainaixr commented 5 years ago

It started working, thanks.

MisaOgura commented 5 years ago

That's great @vainaijr, looking forward to hearing what insights you gain with FlashTorch.
