ypwhs / captcha_break

Captcha recognition
MIT License

Question about building the CNN model #36

Open MgArcher opened 5 years ago

MgArcher commented 5 years ago

When building the CNN model:

input_tensor = Input((height, width, 3))
x = input_tensor
for i, n_cnn in enumerate([2, 2, 2, 2, 2]):
    ############
    for j in range(n_cnn):
        # Why does this loop run twice here?
        x = Conv2D(32 * 2 ** min(i, 3), kernel_size=3, padding='same', kernel_initializer='he_uniform')(x)
        x = BatchNormalization()(x)
        x = Activation('relu')(x)
    x = MaxPooling2D(2)(x)

x = Flatten()(x)
x = [Dense(n_class, activation='softmax', name='c%d' % (i + 1))(x) for i in range(n_len)]
model = Model(inputs=input_tensor, outputs=x)
ypwhs commented 5 years ago

Because I wanted to build a [2, 2, 2, 2, 2] structure, and writing it this way is more convenient.

You could also write it out like this:

x = Conv2D(32, kernel_size=3, padding='same', kernel_initializer='he_uniform')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = Conv2D(32, kernel_size=3, padding='same', kernel_initializer='he_uniform')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = MaxPooling2D(2)(x)
x = Conv2D(64, kernel_size=3, padding='same', kernel_initializer='he_uniform')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = Conv2D(64, kernel_size=3, padding='same', kernel_initializer='he_uniform')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = MaxPooling2D(2)(x)
x = Conv2D(128, kernel_size=3, padding='same', kernel_initializer='he_uniform')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = Conv2D(128, kernel_size=3, padding='same', kernel_initializer='he_uniform')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = MaxPooling2D(2)(x)
x = Conv2D(256, kernel_size=3, padding='same', kernel_initializer='he_uniform')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = Conv2D(256, kernel_size=3, padding='same', kernel_initializer='he_uniform')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = MaxPooling2D(2)(x)
x = Conv2D(256, kernel_size=3, padding='same', kernel_initializer='he_uniform')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = Conv2D(256, kernel_size=3, padding='same', kernel_initializer='he_uniform')(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = MaxPooling2D(2)(x)

Written out like this it is clearer, but adjusting it to a different structure is more tedious.
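A minimal sketch of that trade-off, assuming the same keras.layers imports as the notebook: pulling the loop into a small helper (the name build_conv_stack is hypothetical) keeps the concise loop form while making the block structure a plain argument.

from keras.layers import Conv2D, BatchNormalization, Activation, MaxPooling2D

def build_conv_stack(x, blocks=(2, 2, 2, 2, 2)):
    # Stack len(blocks) conv blocks; block i contains blocks[i] conv layers,
    # each followed by BatchNormalization and ReLU, then one MaxPooling2D.
    for i, n_cnn in enumerate(blocks):
        for j in range(n_cnn):
            x = Conv2D(32 * 2 ** min(i, 3), kernel_size=3, padding='same',
                       kernel_initializer='he_uniform')(x)
            x = BatchNormalization()(x)
            x = Activation('relu')(x)
        x = MaxPooling2D(2)(x)
    return x

# e.g. a shallower hypothetical variant:
# x = build_conv_stack(input_tensor, blocks=(2, 2, 2))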

MgArcher commented 5 years ago

Thanks for your reply! May I ask how you decided on the model structure? For example, why convolve twice with the same kernel size? Does that speed up convergence?

ypwhs commented 5 years ago


Read papers and learn how others design their model structures. For this particular problem, see this paper:

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
https://arxiv.org/abs/1507.05717
Submitted on 21 Jul 2015
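On the specific question of stacking two 3×3 convolutions per block: the VGG paper listed below motivates this design, since two stacked 3×3 layers cover the same 5×5 receptive field as a single 5×5 convolution while using fewer parameters and adding an extra non-linearity. A minimal sketch of the parameter-count comparison (hypothetical 32×32×64 input, assuming plain keras):

from keras.layers import Input, Conv2D
from keras.models import Model

inp = Input((32, 32, 64))

# two stacked 3x3 convolutions (effective receptive field 5x5)
stacked = Conv2D(64, 3, padding='same')(inp)
stacked = Conv2D(64, 3, padding='same')(stacked)

# one 5x5 convolution (same receptive field)
single = Conv2D(64, 5, padding='same')(inp)

print(Model(inp, stacked).count_params())  # 2 * (3*3*64*64 + 64) = 73,856
print(Model(inp, single).count_params())   # 5*5*64*64 + 64 = 102,464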

If you want to design general-purpose model structures, see the references below.

Model papers

Reference link: https://paperswithcode.com/sota/image-classification-on-imagenet

VGG

Very Deep Convolutional Networks for Large-Scale Image Recognition
https://arxiv.org/abs/1409.1556
Submitted on 4 Sep 2014

ResNet

Deep Residual Learning for Image Recognition
https://arxiv.org/abs/1512.03385
Submitted on 10 Dec 2015

InceptionV3

Rethinking the Inception Architecture for Computer Vision
https://arxiv.org/abs/1512.00567
Submitted on 2 Dec 2015

InceptionResNetV2

Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
https://arxiv.org/abs/1602.07261
Submitted on 23 Feb 2016

DenseNet

Densely Connected Convolutional Networks
https://arxiv.org/abs/1608.06993
Submitted on 25 Aug 2016

Xception

Xception: Deep Learning with Depthwise Separable Convolutions
https://arxiv.org/abs/1610.02357
Submitted on 7 Oct 2016

NASNet

Neural Architecture Search with Reinforcement Learning
https://arxiv.org/abs/1611.01578
Submitted on 5 Nov 2016

AmoebaNet

Regularized Evolution for Image Classifier Architecture Search
https://arxiv.org/abs/1802.01548
Submitted on 5 Feb 2018

EfficientNet

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
https://arxiv.org/abs/1905.11946
Submitted on 28 May 2019

MgArcher commented 5 years ago

Thanks for your reply, it helped me a lot. Thank you!