MoyanZitto / keras-cn

Chinese keras documents with more examples, explanations and tips.
1.57k stars 280 forks source link

用keras做了一个最基础的CNN模型,然而飞了 #97

Open guotch opened 7 years ago

guotch commented 7 years ago

知乎看山杯 相同的输入输出用1层RNN大概可以到50%的MAX@1准确率,然而用了一个基本的CNN网络来预测,结果发现MAX@1准确率只有4% CNN网络模型是 kernel为3/4/5的三个卷积+最大池化 然后merge起来以后一个全连接层,最后接sigmoid的预测。 结果跑了两次,出现了非常神奇的情况。 第一次的结果是近乎于全员预测为0;TOP5的预测值如下;; 5.006163661391838e-08,4.9472454577426106e-08,4.7214594900424345e-08,4.583506907351875e-08,4.46133014975203e-08 第二次的结果是近乎于全员预测为0:TOP5预测值都是1 1.0,1.0,1.0,1.0,1.0,

感觉应该是参数直接飞了的原因。请教各位大佬如何做相应的参数初始化。 filter_sizes=[3,4,5]

embedding = Embedding (input_dim=line_num + 1, output_dim=EMBEDDING_DIM, input_length=chwo_seq_length,weights=[emb]) (inputs)
reshape = Reshape ((chwo_seq_length, EMBEDDING_DIM, 1)) (embedding)
conv_0 = Convolution2D (batch_size, filter_sizes[0], EMBEDDING_DIM, border_mode='valid', init='normal',
                        activation='relu', dim_ordering='tf') (reshape)
conv_1 = Convolution2D (batch_size, filter_sizes[1], EMBEDDING_DIM, border_mode='valid', init='normal',
                        activation='relu', dim_ordering='tf') (reshape)
conv_2 = Convolution2D (batch_size, filter_sizes[2], EMBEDDING_DIM, border_mode='valid', init='normal',
                        activation='relu', dim_ordering='tf') (reshape)

maxpool_0 = MaxPooling2D (pool_size=(chwo_seq_length - filter_sizes[0] + 1, 1), strides=(1, 1), border_mode='valid',
                          dim_ordering='tf') (conv_0)
maxpool_1 = MaxPooling2D (pool_size=(chwo_seq_length - filter_sizes[1] + 1, 1), strides=(1, 1), border_mode='valid',
                          dim_ordering='tf') (conv_1)
maxpool_2 = MaxPooling2D (pool_size=(chwo_seq_length - filter_sizes[2] + 1, 1), strides=(1, 1), border_mode='valid',
                          dim_ordering='tf') (conv_2)
merged_tensor = merge ([maxpool_0, maxpool_1, maxpool_2], mode='concat', concat_axis=1)
flatten = Flatten () (merged_tensor)
dropout = Dropout (drop) (flatten)
dense1 = Dense (LABEL_NUM)(dropout) # Dense full link layer
output = Activation ("sigmoid")(dense1)
model.compile (loss=“binary_crossentropy", optimizer=adam, metrics=['accuracy'])
model.fit (x_train, y_train, batch_size=batch_size, epochs=epoch,  validation_data=(x_val, y_val) )