PaddlePaddle / Paddle

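PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of 『飞桨』 (PaddlePaddle): high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)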
http://www.paddlepaddle.org/
Apache License 2.0
22.3k stars 5.62k forks

Running scene text recognition fails at the line image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) #8103

Closed yeyupiaoling closed 6 years ago

yeyupiaoling commented 6 years ago

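I wrote a project modeled on the scene text recognition example, without major changes; I only swapped the dataset for CAPTCHA images. The CAPTCHAs are 8-bit images, whereas the original scene text recognition images are 32-bit, and I suspect that is the cause. How should I modify the code so that it runs correctly? The log is as follows: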

OpenCV Error: Assertion failed (scn == 3 || scn == 4) in cvtColor, file /io/opencv/modules/imgproc/src/color.cpp, line 9748
Traceback (most recent call last):
  File "train.py", line 87, in <module>
    train(train_file_list_path, test_file_list_path, label_dict_path, model_save_dir)
  File "train.py", line 79, in train
    num_passes=1000)
  File "/usr/local/lib/python2.7/dist-packages/paddle/v2/trainer.py", line 146, in train
    for batch_id, data_batch in enumerate(reader()):
  File "/usr/local/lib/python2.7/dist-packages/paddle/v2/minibatch.py", line 33, in batch_reader
    for instance in r:
  File "/usr/local/lib/python2.7/dist-packages/paddle/v2/reader/decorator.py", line 67, in data_reader
    for e in reader():
  File "/home/work/TestDuanDaoDuan/reader.py", line 30, in reader
    yield self.load_image(image_path), label
  File "/home/work/TestDuanDaoDuan/reader.py", line 54, in load_image
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
cv2.error: /io/opencv/modules/imgproc/src/color.cpp:9748: error: (-215) scn == 3 || scn == 4 in function cvtColor

ranqiu92 commented 6 years ago

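https://github.com/PaddlePaddle/models/blob/develop/scene_text_recognition/reader.py#L48 This line is meant to convert the original image to grayscale and then flatten it into a one-dimensional vector. Process your images according to their own data format; all that matters is that the result is a one-dimensional vector.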

yeyupiaoling commented 6 years ago

@ranqiu92 I changed https://github.com/PaddlePaddle/models/blob/develop/scene_text_recognition/reader.py#L55 to the following, but it still does not work:

image = load_image(path,is_color=False)
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

ranqiu92 commented 6 years ago

Judging from the error message, the problem is related to cv2.cvtColor. I suggest reading up on how it is used.
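
The assertion in the log, scn == 3 || scn == 4, means cv2.cvtColor with cv2.COLOR_BGR2GRAY expects a source image with 3 or 4 channels, so an array that is already single-channel fails the check. A minimal sketch of a guard follows. It assumes the image is a NumPy array in BGR channel order and uses plain NumPy arithmetic with the usual luma weights in place of cv2.cvtColor so that it runs without OpenCV; to_gray is a hypothetical helper name, not part of the original reader.py.

```python
import numpy as np


def to_gray(image):
    """Return a 2-D grayscale array from a 2-D, BGR, or BGRA image array."""
    if image.ndim == 2:
        # Already single-channel: this is the case that trips the
        # `scn == 3 || scn == 4` assertion inside cv2.cvtColor.
        return image
    if image.ndim == 3 and image.shape[2] in (3, 4):
        # Weighted sum over B, G, R (an alpha channel, if present, is ignored).
        b, g, r = image[..., 0], image[..., 1], image[..., 2]
        gray = 0.114 * b + 0.587 * g + 0.299 * r
        return np.rint(gray).astype(image.dtype)
    raise ValueError("unsupported image shape: %r" % (image.shape,))
```

With a guard like this, a grayscale input passes through unchanged and only 3- or 4-channel input is converted.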

yeyupiaoling commented 6 years ago

@ranqiu92 I switched to the following approach. Do you think it will work?

    def load_images(file):
        # Convert the image to grayscale
        im = Image.open(file).convert('L')
        # Resize to the same size as the training data
        im = im.resize((28, 28), Image.ANTIALIAS)
        im = np.array(im).astype(np.float32).flatten()
        im = im / 255.0
        return im

This does not use the cv2 package, yet it raises the following error:

F0203 19:01:32.552273 22041 BlockExpandOp.cpp:73] Check failed: seqLength == outputHeight * outputWidth (2 vs. 18446744073709551606) 
*** Check failure stack trace: ***
    @     0x7f41cedc2bcd  google::LogMessage::Fail()
    @     0x7f41cedc667c  google::LogMessage::SendToLog()
    @     0x7f41cedc26f3  google::LogMessage::Flush()
    @     0x7f41cedc7b8e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f41ceb3b016  paddle::BlockExpandFunction::getColShape()
    @     0x7f41ceb3d118  paddle::BlockExpandForward<>::calc()
    @     0x7f41ceab96ef  paddle::BlockExpandLayer::forward()
    @     0x7f41ceaccf4d  paddle::NeuralNetwork::forward()
    @     0x7f41ceacdc63  paddle::GradientMachine::forwardBackward()
    @     0x7f41ced9ea04  GradientMachine::forwardBackward()
    @     0x7f41ce96a689  _wrap_GradientMachine_forwardBackward
    @           0x4cb755  PyEval_EvalFrameEx
    @           0x4c2705  PyEval_EvalCodeEx
    @           0x4ca7df  PyEval_EvalFrameEx
    @           0x4c2705  PyEval_EvalCodeEx
    @           0x4ca088  PyEval_EvalFrameEx
    @           0x4c2705  PyEval_EvalCodeEx
    @           0x4ca088  PyEval_EvalFrameEx
    @           0x4c2705  PyEval_EvalCodeEx
    @           0x4ca7df  PyEval_EvalFrameEx
    @           0x4c2705  PyEval_EvalCodeEx
    @           0x4c24a9  PyEval_EvalCode
    @           0x4f19ef  (unknown)
    @           0x4ec372  PyRun_FileExFlags
    @           0x4eaaf1  PyRun_SimpleFileExFlags
    @           0x49e208  Py_Main
    @     0x7f41f58a7830  __libc_start_main
    @           0x49da59  _start
    @              (nil)  (unknown)
Aborted (core dumped)
ranqiu92 commented 6 years ago

This error seems to be related to https://github.com/PaddlePaddle/models/blob/develop/scene_text_recognition/network_conf.py#L54 . What changes did you make to the code and to the configured parameters?
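
The right-hand value in the failed check, 18446744073709551606, equals 2^64 - 10, which is how -10 looks when stored in an unsigned 64-bit integer. Assuming the product is held in such a type, one plausible reading is that outputHeight * outputWidth came out negative, which would be consistent with the configured image size not matching the data being fed in. The sketch below only verifies the arithmetic and does not identify which dimension went wrong; as_signed64 is a hypothetical helper, not part of Paddle.

```python
def as_signed64(value):
    """Reinterpret an unsigned 64-bit integer as a two's-complement signed one."""
    if not 0 <= value < 2 ** 64:
        raise ValueError("value does not fit in 64 bits")
    return value - 2 ** 64 if value >= 2 ** 63 else value


logged = 18446744073709551606  # right-hand side of the failed check in the log
print(as_signed64(logged))  # prints -10
```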

yeyupiaoling commented 6 years ago

@ranqiu92 I changed https://github.com/PaddlePaddle/models/blob/c6733816ba861e70fc4af58ca4d84ac18adabf23/scene_text_recognition/reader.py#L48-L64 to:

def load_images(file):
    # Convert the image to grayscale
    im = Image.open(file).convert('L')
    # Resize to the same size as the training data
    im = im.resize((72, 27), Image.ANTIALIAS)
    im = np.array(im).astype(np.float32).flatten()
    im = im / 255.0
    return im

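and set

image_shape = (72, 27)

This is the size of my CAPTCHA images: 72 wide and 27 high.

If I revert all of the above to the original settings and put the original data back, training runs normally.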

yeyupiaoling commented 6 years ago

@ranqiu92 I have a question about https://github.com/PaddlePaddle/models/blob/develop/scene_text_recognition/network_conf.py#L33-L37

        self.image = layer.data(
            name='image',
            type=paddle.data_type.dense_vector(self.image_vector_size),
            height=self.shape[0],
            width=self.shape[1])

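Why is the argument to paddle.data_type.dense_vector not multiplied by 3 here? For a 3-channel color image in image classification it has to be multiplied by 3. Is that not needed here?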

yeyupiaoling commented 6 years ago

@ranqiu92 I understand the question above now.

image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

This already converts the image to grayscale, so the number of channels is 1.
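
The channel reasoning can be sketched as simple arithmetic: the flattened vector has one entry per pixel per channel, so a grayscale image needs no factor of 3. The helper name dense_vector_size is hypothetical, and the assumption that image_vector_size in network_conf.py is computed this way is not confirmed by the thread.

```python
def dense_vector_size(width, height, channels=1):
    """Length of the flattened image vector: one value per pixel per channel."""
    return width * height * channels


# 72x27 CAPTCHA images, as described earlier in the thread.
print(dense_vector_size(72, 27))              # grayscale: prints 1944
print(dense_vector_size(72, 27, channels=3))  # color: prints 5832
```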

One last question:

    def load_image(self, path):
        image = load_image(path)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

        # Resize all images to a fixed shape.
        if self.image_shape:
            image = cv2.resize(
                image, self.image_shape, interpolation=cv2.INTER_CUBIC)

        image = image.flatten() / 255.
        return image

def load_images(file):
    # Convert the image to grayscale
    im = Image.open(file).convert('L')
    # Resize to the same size as the training data
    im = im.resize((72, 27), Image.ANTIALIAS)
    im = np.array(im).astype(np.float32).flatten()
    im = im / 255.0
    return im

Do these two functions have the same effect?
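
The thread does not answer this, but some differences can be read off the code. Both cv2.resize and PIL's Image.resize take the target size as (width, height), so the shapes agree. The interpolation differs (cv2.INTER_CUBIC versus Image.ANTIALIAS), so resized pixel values will generally not be identical. The final scaling step also yields different dtypes. The sketch below illustrates only that last point, using a synthetic NumPy array as a stand-in for a decoded image; it does not call cv2 or PIL.

```python
import numpy as np

# Stand-in for a decoded grayscale image: height 27, width 72, dtype uint8.
pixels = (np.arange(27 * 72).reshape(27, 72) % 256).astype(np.uint8)

# Tail of the cv2-based load_image: uint8 array divided by a Python float.
a = pixels.flatten() / 255.

# Tail of the PIL-based load_images: cast to float32 first, then scale.
b = pixels.astype(np.float32).flatten() / 255.0

print(a.dtype, b.dtype)   # the two pipelines end in different float widths
print(a.shape, b.shape)   # both are flat vectors of 27 * 72 values
print(np.allclose(a, b))  # values agree up to float32 rounding
```

Here the first variant produces float64 and the second float32; whether that matters depends on what the downstream reader expects.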

yeyupiaoling commented 6 years ago

@ranqiu92 My images use indexed color. How should I modify this code?

image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

yeyupiaoling commented 6 years ago

For grayscale images, first remove this line:

image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

then set

image = load_image(path, is_color=False)

The error above was caused by the images themselves.
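
For the indexed-color images mentioned above, each pixel stores an index into a palette rather than a color, so a grayscale value has to be looked up through the palette. The sketch below illustrates that idea with NumPy only; the palette, the index array, and the helper name palette_to_gray are made up for illustration. In practice a decoder does this step, for example the Image.open(file).convert('L') call used earlier in the thread.

```python
import numpy as np


def palette_to_gray(indices, palette):
    """Expand palette indices to RGB, then reduce to 8-bit grayscale."""
    rgb = palette[indices].astype(np.float64)   # shape (H, W, 3)
    weights = np.array([0.299, 0.587, 0.114])   # R, G, B luma weights
    return np.rint(rgb.dot(weights)).astype(np.uint8)  # shape (H, W)


# Hypothetical 3-entry palette: black, white, pure red (RGB order).
palette = np.array([[0, 0, 0], [255, 255, 255], [255, 0, 0]], dtype=np.uint8)
# Hypothetical 2x2 indexed image.
indices = np.array([[0, 1], [2, 1]], dtype=np.uint8)

gray = palette_to_gray(indices, palette)
print(gray.shape, gray.dtype)
```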