PaddlePaddle / models

Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.
Apache License 2.0

OCR runtime error: how should I fix it? #2616

Closed githubusr1 closed 5 years ago

githubusr1 commented 5 years ago

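I ran the OCR model and got the error below. How should I fix it? For context: I renamed train.py to testtx3.py and put it in E:\paddletest, copied the other files there as well, and ran python testtx3.py. The only change I made to train.py was setting use_gpu to False; nothing else was modified. I am using paddlepaddle 1.4.1 and Python 3.7.1.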

----------- Configuration Arguments -----------
average_window: 0.15
batch_size: 32
eval_period: 15000
init_model: None
log_period: 1000
max_average_window: 12500
min_average_window: 10000
model: crnn_ctc
parallel: False
profile: False
save_model_dir: ./models
save_model_period: 15000
skip_batch_num: 0
skip_test: False
test_images: None
test_list: None
total_step: 720000
train_images: None
train_list: None
use_gpu: False

C:\ProgramData\Anaconda3\lib\site-packages\paddle\fluid\evaluator.py:71: Warning: The EditDistance is deprecated, because maintain a modified program inside evaluator cause bug easily, please use fluid.metrics.EditDistance instead.
  % (self.__class__.__name__, self.__class__.__name__), Warning)

Traceback (most recent call last):
  File "testtx3.py", line 217, in <module>
    main()
  File "testtx3.py", line 213, in main
    train(args)
  File "testtx3.py", line 147, in train
    results = train_one_batch(data)
  File "testtx3.py", line 108, in train_one_batch
    fetch_list=fetch_vars)
  File "C:\ProgramData\Anaconda3\lib\site-packages\paddle\fluid\executor.py", line 565, in run
    use_program_cache=use_program_cache)
  File "C:\ProgramData\Anaconda3\lib\site-packages\paddle\fluid\executor.py", line 642, in _run
    exe.run(program.desc, scope, 0, True, True, fetch_var_name)
paddle.fluid.core.EnforceNotMet: Invoke operator elementwise_sub error.
Python Callstacks:
  File "C:\ProgramData\Anaconda3\lib\site-packages\paddle\fluid\framework.py", line 1654, in append_op
    attrs=kwargs.get("attrs", None))
  File "C:\ProgramData\Anaconda3\lib\site-packages\paddle\fluid\layer_helper.py", line 43, in append_op
    return self.main_program.current_block().append_op(*args, **kwargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\paddle\fluid\layers\nn.py", line 9232, in _elementwise_op
    'use_mkldnn': use_mkldnn})
  File "C:\ProgramData\Anaconda3\lib\site-packages\paddle\fluid\layers\nn.py", line 9281, in elementwise_sub
    return _elementwise_op(LayerHelper('elementwise_sub', **locals()))
  File "C:\ProgramData\Anaconda3\lib\site-packages\paddle\fluid\evaluator.py", line 268, in __init__
    x=seq_num, y=seq_right_count)
  File "E:\paddletest\crnn_ctc_model.py", line 195, in ctc_train_net
    input=decoded_out, label=casted_label)
  File "testtx3.py", line 60, in train
    args, data_shape, num_classes)
  File "testtx3.py", line 213, in main
    train(args)
  File "testtx3.py", line 217, in <module>
    main()
C++ Callstacks:
Tensor holds the wrong type, it holds int at [D:\1.4.1\paddle\paddle/fluid/framework/tensor_impl.h:29]
PaddlePaddle Call Stacks:
Windows not support stack backtrace yet.

gavin1332 commented 5 years ago

Could you please post the path of the OCR model you are using?

githubusr1 commented 5 years ago

"Could you please post the path of the OCR model you are using?"

https://github.com/PaddlePaddle/models/tree/develop/PaddleCV/ocr_recognition

gavin1332 commented 5 years ago

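@wanghaoshuang A user is asking about the OCR model. The error appears to come from the Windows platform. Does the OCR model currently support Windows? If it does, could you please help answer this question?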

wanghaoshuang commented 5 years ago

@githubusr1 Thank you very much for the feedback.

Tensor holds the wrong type,

This log message means that the input of some op has an unexpected type: the op most likely expects int64 but was given int32.

However, no C++ call stack is printed on Windows, so it is hard to tell at a glance which op is at fault.
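Even without the C++ backtrace, the error text posted above still names the failing operator and lists the Python frames that created it. The following is a minimal sketch of pulling those out with regular expressions. The helper name summarize_paddle_error is hypothetical, and the patterns only assume the message layout seen in this issue; other Paddle versions may format the message differently.

```python
import re


def summarize_paddle_error(message):
    """Parse an EnforceNotMet-style message (layout assumed from this issue).

    Returns (operator_name, frames), where frames is a list of
    (path, line_number, function_name) tuples in the order they appear.
    """
    # The first line of the error names the operator that failed.
    op_match = re.search(r"Invoke operator (\w+) error", message)
    operator = op_match.group(1) if op_match else None

    # Each Python frame looks like: File "<path>", line <n>, in <function>
    frames = [
        (path, int(line), func)
        for path, line, func in re.findall(
            r'File "([^"]+)", line (\d+), in (\w+)', message)
    ]
    return operator, frames


# Shortened excerpt of the message from this issue.
sample = (
    'paddle.fluid.core.EnforceNotMet: Invoke operator elementwise_sub error.\n'
    'Python Callstacks:\n'
    '  File "C:\\ProgramData\\Anaconda3\\lib\\site-packages\\paddle\\fluid\\evaluator.py", line 268, in __init__\n'
    '    x=seq_num, y=seq_right_count)\n'
    '  File "E:\\paddletest\\crnn_ctc_model.py", line 195, in ctc_train_net\n'
    '    input=decoded_out, label=casted_label)\n'
)

operator, frames = summarize_paddle_error(sample)
print("failing operator:", operator)
for path, line, func in frames:
    print(func, "at", path, "line", line)
```

In this excerpt the operator is elementwise_sub and the frames listed come from evaluator.py and crnn_ctc_model.py, which is where the investigation further down the thread ends up.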

Did you modify any of the source code under models? Are you running on GPU or CPU?

githubusr1 commented 5 years ago

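I did not modify the source code, and I am running on CPU. Tell me what other information you need and I will provide it right away. I also tried changing the label data to int64 (it was int32), but I still get the same error.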

wanghaoshuang commented 5 years ago

@githubusr1 I found the cause:

https://github.com/PaddlePaddle/models/blob/develop/PaddleCV/ocr_recognition/crnn_ctc_model.py#L194

    error_evaluator = fluid.evaluator.EditDistance(
        input=decoded_out, label=casted_label)

EditDistance calls the elementwise_sub op, as follows:


        compare_result = layers.equal(distances, zero)
        compare_result_int = layers.cast(x=compare_result, dtype='int')
        seq_right_count = layers.reduce_sum(compare_result_int)
        instance_error_count = layers.elementwise_sub(
            x=seq_num, y=seq_right_count)

Here seq_num is int64 but seq_right_count is int32. The implementation of elementwise_sub or reduce_sum was probably updated recently, which triggered a problem that had not shown up before.
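To make the mismatch concrete, here is a toy model of it in plain Python. It does not use Paddle at all: ToyTensor, cast and strict_elementwise_sub are hypothetical stand-ins, and the values 32 and 30 are made up for illustration. It only shows why a subtraction that insists on matching dtypes rejects an int64/int32 pair and accepts the pair once the right-hand side is cast to int64.

```python
class ToyTensor:
    """Hypothetical stand-in for a framework tensor: a value plus a dtype tag."""

    def __init__(self, value, dtype):
        self.value = value
        self.dtype = dtype


def cast(x, dtype):
    """Return a copy of x tagged with a different dtype."""
    return ToyTensor(x.value, dtype)


def strict_elementwise_sub(x, y):
    """Subtract y from x, refusing operands whose dtypes differ."""
    if x.dtype != y.dtype:
        raise TypeError(
            "Tensor holds the wrong type: expected %s, got %s"
            % (x.dtype, y.dtype))
    return ToyTensor(x.value - y.value, x.dtype)


seq_num = ToyTensor(32, "int64")          # made-up value: sequences in a batch
seq_right_count = ToyTensor(30, "int32")  # made-up value: correctly decoded ones

try:
    strict_elementwise_sub(seq_num, seq_right_count)
    mismatch_rejected = False
except TypeError as err:
    mismatch_rejected = True
    print("rejected:", err)

# Casting the int32 operand to int64 makes the two dtypes agree.
instance_error_count = strict_elementwise_sub(
    seq_num, cast(seq_right_count, "int64"))
print(instance_error_count.value, instance_error_count.dtype)
```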

Temporary workaround

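In your Python lib directory, use the find command to locate evaluator.py, then change int to int64 (the dtype='int' argument of the layers.cast call quoted above) around this line: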

https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/fluid/evaluator.py#L268
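For anyone who prefers to script the edit, the sketch below shows one way to do it. The helper patch_int_cast is hypothetical, and it rewrites every dtype='int' in the file it is given, so check the result by hand if the file contains other such casts. It is demonstrated here on a temporary stand-in file rather than on a real Paddle install.

```python
import os
import re
import shutil
import tempfile


def patch_int_cast(path):
    """Rewrite dtype='int' to dtype='int64' in the file at path.

    Saves a .bak copy of the original before writing, and returns the
    number of replacements made (0 means the file was left untouched).
    """
    with open(path, encoding="utf-8") as f:
        text = f.read()

    # Match dtype='int' or dtype="int" exactly, so dtype='int64' is skipped.
    patched, count = re.subn(
        r"dtype=(['\"])int\1", r"dtype=\g<1>int64\g<1>", text)

    if count:
        shutil.copyfile(path, path + ".bak")
        with open(path, "w", encoding="utf-8") as f:
            f.write(patched)
    return count


# Demonstration on a temporary stand-in for evaluator.py.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "evaluator.py")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("compare_result_int = layers.cast(x=compare_result, dtype='int')\n")

replaced = patch_int_cast(demo_path)
with open(demo_path, encoding="utf-8") as f:
    patched_text = f.read()
second_run = patch_int_cast(demo_path)  # nothing left to replace

print(replaced, second_run)
print(patched_text)
```

On Windows, where the find command is not available, the installed file can usually be located by importing the module and printing its __file__ attribute, or directly from the path shown in the traceback (C:\ProgramData\Anaconda3\lib\site-packages\paddle\fluid\evaluator.py in this issue).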

wanghaoshuang commented 5 years ago

Once I have confirmed the problem further, I will follow up with a more proper fix and a plan. Thanks for your understanding.

githubusr1 commented 5 years ago

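After making that change it runs. Thank you very much. If a better fix or plan comes later and this thread has been closed so that messages can no longer be posted, please send it to my email: wzcj2016@126.com. Thanks.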

daniellibin commented 5 years ago

Running train.py on Windows 10 with the CPU fails with RuntimeError: E:/release_cuda87/build_python35_CPU_MKL/third_party/install/warpctc/lib/warpctc.dll not found. What could be the cause? @githubusr1 @gavin1332 @wanghaoshuang @adaxi123 @Superjomn Many thanks for any help.