yalecyu / crnn.caffe

The test results are badly wrong #18

Open plastic0313 opened 6 years ago

plastic0313 commented 6 years ago

I trained a model to 0.96 accuracy, but when I test with ./build/examples/cpp_recognition/recognition.bin data/testimage/1.png examples/crnn/deploy.prototxt examples/crnn/model/crnn_captcha_iter_20000.caffemodel, no matter which image I use, the output is always a number string like 74 74 74 1 74 4 74 74 1 74 74 4 74 74 1 74 74 74 74 74 74 74 74 74 74 74 74 74 74 74 74 74. What could be causing this?

ghost commented 6 years ago

@plastic0313 I have the same problem. Have you solved it?

yalecyu commented 6 years ago

@plastic0313 @greatgeekgrace That's because my own class count is 74. You need to use your own class count and change parameters such as num_output and alphabet_size in generate_dataset.py accordingly.
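To make that concrete: a minimal sketch (plain Python, not part of the repo) of how those parameters relate, assuming the blank label takes the last index as this project does (blank_label: 10 for digits 0-9):

```python
def ctc_sizes(alphabet):
    """Derive the prototxt parameters from the set of real characters."""
    num_classes = len(alphabet)            # real characters only
    return {
        "alphabet_size": num_classes + 1,  # plus one CTC blank
        "blank_label": num_classes,        # blank sits at the last index
        "num_output": num_classes + 1,     # fc layer emits one score per label
    }

print(ctc_sizes("0123456789"))
# {'alphabet_size': 11, 'blank_label': 10, 'num_output': 11}
```

With 74 real characters this gives alphabet_size = num_output = 75 and blank_label = 74, which is why 74 floods the output when blanks are not stripped.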

ghost commented 6 years ago

@yalecyu Got it, thanks a lot~~~ The current prediction for a captcha image (the image reads 0642) is: 74 74 0 74 6 74 4 74 2 74 74 74 74 74 74 74 74 74 74 74 - - - - The result seems to have 24 characters (I counted the output above one by one), but num_output is set to 75 in both crnn.prototxt and deploy.prototxt. How can that be? And why does a - symbol appear?
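For context: the raw output has one label per timestep, and index 74 is the CTC blank (num_output 75 = 74 real characters + 1 blank); the - symbols are presumably just how blanks are printed. A greedy best-path decode, sketched below in plain Python (not the repo's code), collapses repeats and drops blanks, which recovers 0642 from a sequence like the one quoted:

```python
def ctc_greedy_decode(labels, blank):
    """Standard CTC best-path collapse: merge repeated labels, then drop blanks."""
    decoded, prev = [], None
    for label in labels:
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded

# A per-timestep output like the one quoted above, with blank index 74:
raw = [74, 74, 0, 74, 6, 74, 4, 74, 2] + [74] * 11
print(ctc_greedy_decode(raw, blank=74))  # [0, 6, 4, 2] -> "0642"
```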

xijunjun commented 5 years ago

At first I made no changes and generated the data exactly as the steps describe, but training failed with a dimension-mismatch error. Following the answers in the related issues, I changed both the 24 in the reshape layer and the 24 in time_step to 32 (24 must have been the setting for a width of 96, while the current image width is 128). After that, training ran fine and reached 99% accuracy on both the training and validation sets. But the outputs when testing even training-set images were completely wrong. Two things need to be changed:

1. The default captcha data uses only the characters 0-9 per position, plus the blank label, for 11 labels in total, so set blank_label: 10, alphabet_size: 11, and num_output: 11 on the fc layer.
2. At training time the image data is normalized to 0-1, but in recognition.cpp sample_resized.convertTo(sample_float, CV_32FC3) does not normalize pixel values to 0-1, so change it to sample_resized.convertTo(sample_float, CV_32FC3, 1/255.0).
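As a rough sketch of the width/time_step relationship described above (an assumption inferred from the 96→24 and 128→32 pairs, i.e. the conv stack appears to shrink the input width by a factor of 4):

```python
def time_step_for_width(image_width, downsample=4):
    """Timesteps fed to the LSTM = feature-map width after the conv stack.
    Assumption: this network reduces the input width by a factor of 4."""
    if image_width % downsample != 0:
        raise ValueError("width must be a multiple of the downsample factor")
    return image_width // downsample

print(time_step_for_width(96))   # 24 (the default in the prototxt)
print(time_step_for_width(128))  # 32 (needed for 128-wide images)
```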

yalecyu commented 5 years ago

@xijunjun Right. The main thing to note is that, because of another OCR project, I changed the prototxt configuration and never verified that the dimensions match the generated dataset. The other thing to watch is the 0-1 versus 0-255 range, but I never verified that either and only gave a hint in the README. I'll fill in these gaps when I have time.

dingtao1 commented 5 years ago

@xijunjun How can you tell the data is normalized at training time? Isn't normalization exactly what convertTo(sample_float, CV_32FC3, 1/255.0) does?

xijunjun commented 5 years ago

@dingtao1 I looked at the data-generation code and the parameters of the data input layer.
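To make the convertTo distinction concrete, here is a minimal plain-Python sketch (not OpenCV itself) of what the optional scale argument changes per element:

```python
def convert_to(pixel, scale=1.0):
    """Mimic cv::Mat::convertTo per element: cast to float, then scale."""
    return float(pixel) * scale

# Without a scale factor, values stay in 0..255 -- not what a net trained
# on 0..1 inputs (as caffe.io.load_image produces) expects:
print(convert_to(128))             # 128.0
# With 1/255.0, values land in 0..1, matching the training-time range:
print(convert_to(128, 1 / 255.0))  # 0.5019...
```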

yjtan118 commented 5 years ago

Hi, sorry for posting on an old discussion, but I need some help or hints, as I can't seem to get consistent and correct results from my own trained CRNN model after following all the steps. I ported the Linux code and compiled it on Windows with the Visual Studio 2017 compiler. I managed to compile successfully after making some changes, but I assume this shouldn't affect the results.

1) First I generated the dataset using generate_captcha.py. The total image count is 50,000.

2) Then I executed generate_dataset.py with IMAGE_WIDTH, IMAGE_HEIGHT = 128, 32. Training size = 40,000 and test size = 10,000.

3) In my crnn.prototxt, I changed the batch size to 50 to cater for my GPU, which only has 2 GB of memory. I changed the following as well:

```
layer {
  name: "reshape"
  type: "Reshape"
  bottom: "conv6"
  top: "reshape"
  reshape_param {
    shape {
      # n, c, (w*h)
      dim: 50
      dim: 512
      dim: 32
    }
  }
}
layer {
  name: "indicator"
  type: "ContinuationIndicator"
  top: "indicator"
  continuation_indicator_param {
    time_step: 32
    batch_size: 50
  }
}
layer {
  name: "ctc_loss"
  type: "CtcLoss"
  bottom: "fc1"
  bottom: "label"
  top: "ctc_loss"
  loss_weight: 1.0
  ctc_loss_param {
    blank_label: 10
    alphabet_size: 11
    time_step: 32
  }
}
layer {
  name: "accuracy"
  type: "LabelsequenceAccuracy"
  bottom: "premuted_fc"
  bottom: "label"
  top: "accuracy"
  labelsequence_accuracy_param {
    blank_label: 10
  }
}
```

I managed to get over 0.95 accuracy for both test and train data. The loss seems to be on the low side as well (0.00x).

4) Next, I changed deploy.prototxt:

```
name: "crnn"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 32 dim: 128 } }
}
layer {
  name: "reshape"
  type: "Reshape"
  bottom: "conv6"
  top: "reshape"
  reshape_param {
    shape {
      # n, c, (w*h)
      dim: 1
      dim: 512
      dim: 32
    }
  }
}
layer {
  name: "indicator"
  type: "ContinuationIndicator"
  top: "indicator"
  continuation_indicator_param {
    time_step: 32
    batch_size: 1
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "lstm2"
  top: "fc1"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  inner_product_param {
    num_output: 11
    axis: 2
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0 }
  }
}
```

5) I also amended recognition.cpp to include the normalization:

```cpp
if (numchannels == 3)
  sample_resized.convertTo(sample_float, CV_32FC3, 1.f / 255);
else
  sample_resized.convertTo(sample_float, CV_32FC1, 1.f / 255);
```

6) For the output: when I run the recognition exe as below:

recognition D:\ImageProc\ImgDataset\Data\Captcha\49998-7959.png D:\Lib\caffecrnn\examples\crnn\deploy.prototxt D:\Lib\caffecrnn\examples\crnn\model\crnn_captcha_iter_3600.caffemodel

I can't get consistent results from the model each time, and I can't get accurate output either. The output I get from running it three times:

8 9 8 8 8 8 8 8 8 8 8 8 8 8 8 4 4 4 4 4 1 1 1 - 1 1 1 0 0 - 1 2

6 6 - - 9 9 - - - - - - - - - - - - 6 6 6 6 6 6 6 6 6 6 6 6 3 3

1 1 2 1 1 8 8 8 8 1 - 1 1 1 1 1 1 1 1 7 7 7 7 7 7 7 7 7 7 7 6 1

Does anyone have any hints, or has anyone spotted where my mistake is? Has anyone managed to get accurate output from the trained model?

Please help! Thank you.

BarryKCL commented 5 years ago

(digits + letters) Converting the test image from BGR to RGB fixes the problem where test accuracy during training is very high but the cpp_recognition output is wrong!!!

The reason is as follows:

When we generate the data we use img = caffe.io.load_image(os.path.join(img_path, image)), and inside caffe.io this is img = skimage.img_as_float(skimage.io.imread(filename, as_grey=not color)).astype(np.float32). The problem: cv2 stores images in BGR order, while skimage stores them in RGB (recognition.cpp reads the image with OpenCV, so add cv::cvtColor(resizeimg, resizeimg, cv::COLOR_BGR2RGB);).
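A tiny sketch (plain Python lists, no OpenCV/skimage needed) of the channel-order mismatch: the same physical pixel is stored differently by the two libraries, and reversing the channels maps one to the other, which is what cv::cvtColor with COLOR_BGR2RGB does for a 3-channel image:

```python
def bgr_to_rgb(pixel):
    """Reverse the channel order of a single [B, G, R] pixel."""
    b, g, r = pixel
    return [r, g, b]

# Pure blue: cv2 stores it as [255, 0, 0] (BGR), skimage as [0, 0, 255] (RGB).
print(bgr_to_rgb([255, 0, 0]))  # [0, 0, 255]
```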