About the model structure - Githubissues

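I reimplemented this network in PyTorch from model/ocr/chinese/ocr.cfg and fed it an image of shape [1, 1, 32, 256] as required. The network's output has shape [1, 11316, 3, 63]. What does this output mean? My understanding was that the output should be [1, 11316, 1, n], where 11316 is the number of Chinese character classes (one probability per character) and n is the length of the generated text sequence. I can't tell where things went wrong, so any pointers would be appreciated! (I haven't used darknet before, so I don't know how to inspect how the network structure is actually implemented.)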

```python
import torch.nn as nn


class CRNN(nn.Module):
    def __init__(self, imgC):
        super(CRNN, self).__init__()
        # Arguments are (in_ch, out_ch, kernel, stride, padding).
        self.conv1 = nn.Conv2d(imgC, 64, 3, 1, 1)
        self.relu1 = nn.ReLU()
        self.mpool1 = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(64, 128, 3, 1, 1)
        self.relu2 = nn.ReLU()
        self.mpool2 = nn.MaxPool2d(2, 2)
        self.conv3 = nn.Conv2d(128, 256, 3, 1, 1)
        self.relu3 = nn.ReLU()
        self.conv4 = nn.Conv2d(256, 256, 3, 1, 1)
        self.relu4 = nn.ReLU()
        # Pooling halves the height only; stride along the width is 1.
        self.mpool3 = nn.MaxPool2d(2, (2, 1), 0)
        self.conv5 = nn.Conv2d(256, 512, 3, 1, 1)
        self.relu5 = nn.ReLU()
        self.conv6 = nn.Conv2d(512, 512, 3, 1, 1)
        self.relu6 = nn.ReLU()
        self.mpool4 = nn.MaxPool2d(2, (2, 1), 0)
        self.conv7 = nn.Conv2d(512, 512, 2, 1, 0)
        self.relu7 = nn.ReLU()
        # 1x1 conv producing one channel per character class.
        self.conv8 = nn.Conv2d(512, 11316, 1, 1, 1)

    def forward(self, x):
        x = self.mpool1(self.relu1(self.conv1(x)))
        x = self.mpool2(self.relu2(self.conv2(x)))
        x = self.relu3(self.conv3(x))
        x = self.mpool3(self.relu4(self.conv4(x)))
        x = self.relu5(self.conv5(x))
        x = self.mpool4(self.relu6(self.conv6(x)))
        x = self.relu7(self.conv7(x))
        x = self.conv8(x)
        return x
```
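To see where the height of 3 comes from, the spatial size can be traced layer by layer without running the model. The sketch below is only an illustration: it uses the standard PyTorch output-size formula for `Conv2d`/`MaxPool2d` (dilation 1, floor mode), and `out_size`, `LAYERS`, and `trace` are hypothetical helper names, not part of darknet-ocr. The layer parameters are copied from the class above.

```python
def out_size(size, kernel, stride, pad):
    """Output length along one axis for Conv2d/MaxPool2d (dilation 1, floor mode)."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, kernel, (stride_h, stride_w), padding), copied from the CRNN class above.
LAYERS = [
    ("conv1", 3, (1, 1), 1), ("mpool1", 2, (2, 2), 0),
    ("conv2", 3, (1, 1), 1), ("mpool2", 2, (2, 2), 0),
    ("conv3", 3, (1, 1), 1), ("conv4", 3, (1, 1), 1),
    ("mpool3", 2, (2, 1), 0),
    ("conv5", 3, (1, 1), 1), ("conv6", 3, (1, 1), 1),
    ("mpool4", 2, (2, 1), 0),
    ("conv7", 2, (1, 1), 0),
    ("conv8", 1, (1, 1), 1),
]


def trace(h, w, layers=LAYERS):
    """Return [(layer_name, H, W), ...] after each layer for an H x W input."""
    shapes = []
    for name, k, (sh, sw), p in layers:
        h, w = out_size(h, k, sh, p), out_size(w, k, sw, p)
        shapes.append((name, h, w))
    return shapes


if __name__ == "__main__":
    for name, h, w in trace(32, 256):
        print(f"{name:7s} -> H={h:2d}, W={w:3d}")
```

Under these assumptions the feature map is 1 x 61 after `conv7`, and `conv8` turns it into 3 x 63 because `padding=1` on a 1x1 kernel adds one row and one column of padding on every side. One possible explanation, not verified against ocr.cfg here: stock Darknet reads `pad=1` as a flag meaning `padding = size/2`, which is 0 for a 1x1 kernel, so copying `pad=1` directly into `nn.Conv2d` would not match. With `padding=0` on `conv8` the traced output would be [1, 11316, 1, 61], which fits the expected [1, 11316, 1, n] layout.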

chineseocr / darknet-ocr

About the model structure #97