open-mmlab / mmocr

OpenMMLab Text Detection, Recognition and Understanding Toolbox
https://mmocr.readthedocs.io/en/dev-1.x/
Apache License 2.0

Errors and warnings occurred when training totaltext with DB #759

Open ahsdx opened 2 years ago

ahsdx commented 2 years ago

Reproduction

  1. What command or script did you run?

     python tools/test.py configs/textdet/dbnet/dbnet_r50dcnv2_fpnc_1200e_totaltext.py experiments/dbnet_r50_totaltext/epoch_25.pth --eval hmean-iou

  2. Did you make any modifications on the code or config? Did you understand what you have modified?

     The main changes are as follows (configs/_base_/det_models/dbnet_r50dcnv2_fpnc.py):

         bbox_head=dict(
             type='DBHead',
             in_channels=256,
             loss=dict(type='DBLoss', alpha=5.0, beta=10.0, bbce_loss=True),
             postprocessor=dict(type='DBPostprocessor', text_repr_type='poly')),

     Other changes were made according to the documentation.

  3. What dataset did you use? total_text

Environment

I don't think the problem has anything to do with the environment

Error traceback

**VisibleDeprecationWarning:**/home/yj/桌面/codes/mmocr-0.4.0/mmocr/models/textdet/postprocess/utils.py:39: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  expanded = np.array(offset.Execute(distance))

**AssertionError:**  Evaluateing data/totaltext/instances_test.json with 300 images now
Traceback (most recent call last):
  File "tools/test.py", line 311, in <module>
    main()
  File "tools/test.py", line 307, in main
    print(dataset.evaluate(outputs, **eval_kwargs))
  File "/home/yj/anaconda3/envs/pytorch/lib/python3.7/site-packages/mmdet/datasets/dataset_wrappers.py", line 104, in evaluate
    results_per_dataset, logger=logger, **kwargs)
  File "/home/yj/桌面/codes/mmocr-0.4.0/mmocr/datasets/icdar_dataset.py", line 157, in evaluate
    rank_list=rank_list)
  File "/home/yj/桌面/codes/mmocr-0.4.0/mmocr/core/evaluation/hmean.py", line 108, in eval_hmean
    gts, gts_ignore = get_gt_masks(ann_infos)
  File "/home/yj/桌面/codes/mmocr-0.4.0/mmocr/core/evaluation/hmean.py", line 65, in get_gt_masks
    assert len(mask[0]) >= 8 and len(mask[0]) % 2 == 0
AssertionError

Bug fix

For VisibleDeprecationWarning, I changed expanded = np.array(offset.Execute(distance)) to expanded = np.array(offset.Execute(distance), dtype=object)

For AssertionError, I replaced assert len(mask[0]) >= 8 and len(mask[0]) % 2 == 0 with the following code:

if len(mask[0]) < 8:
  continue
assert len(mask[0]) >= 8 and len(mask[0]) % 2 == 0

With these changes, the model trains and evaluates normally. Is such a change reasonable, and will it hurt the model's performance?

ahsdx commented 2 years ago

Looking forward to your reply, thank you!

gaotongxiao commented 2 years ago

It's a good workaround though it throws out some ground truth instances and may slightly affect the final model's performance, depending on the number of these "invalid masks". However, I feel like assert len(mask[0]) >= 8 can be relaxed to assert len(mask[0]) >= 6, and similarly https://github.com/open-mmlab/mmocr/blob/75d32504e002f7da3c38c04babd80182be836339/mmocr/core/evaluation/utils.py#L133 can be changed to

assert (points.size % 2 == 0) and (points.size >= 6) 

so that all training data can be retained.
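To make the difference between the two options concrete, here is a minimal sketch, not the actual mmocr code. The helper `split_masks` and the sample masks are hypothetical; the only thing taken from the thread is the condition on `len(mask[0])`. With a minimum of 8 coordinates a triangle annotation is dropped, while a minimum of 6 keeps it.

```python
def split_masks(masks, min_coords=6):
    """Split polygon masks into (kept, dropped).

    Each mask is a list whose first element is a flat coordinate list
    [x1, y1, x2, y2, ...]. `split_masks` is a hypothetical helper that
    only mirrors the length and parity check discussed above.
    """
    kept, dropped = [], []
    for mask in masks:
        coords = mask[0]
        if len(coords) >= min_coords and len(coords) % 2 == 0:
            kept.append(mask)
        else:
            dropped.append(mask)
    return kept, dropped


# Made-up sample annotations for illustration only.
masks = [
    [[0, 0, 10, 0, 10, 5, 0, 5]],  # quadrilateral: 8 coordinates
    [[0, 0, 10, 0, 5, 8]],         # triangle: 6 coordinates
    [[0, 0, 10, 0]],               # degenerate: 4 coordinates
]

kept_8, dropped_8 = split_masks(masks, min_coords=8)
kept_6, dropped_6 = split_masks(masks, min_coords=6)
print(len(kept_8), len(dropped_8))
print(len(kept_6), len(dropped_6))
```

Anything with fewer than three points cannot form a polygon, so it is still rejected under the relaxed threshold.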

gaotongxiao commented 2 years ago

BTW, thanks for the good catch! Please let us know whether it works for you or not. We want to evaluate the effect of the relaxation and may implement it in our later releases.

ahsdx commented 2 years ago

Thank you for your answer. I'll try it and tell you the experimental results.

ahsdx commented 2 years ago

I did some experiments. Here are the results. When using

if len(mask[0]) < 8:
    continue
assert len(mask[0]) >= 8 and len(mask[0]) % 2 == 0

and leaving https://github.com/open-mmlab/mmocr/blob/75d32504e002f7da3c38c04babd80182be836339/mmocr/core/evaluation/utils.py#L133 unchanged, the result is {'0_hmean-iou:recall': 0.7208022021234762, '0_hmean-iou:precision': 0.8328032712403453, '0_hmean-iou:hmean': 0.7727655986509274}

When using

assert len(mask[0]) >= 6 and len(mask[0]) % 2 == 0

and

assert (points.size % 2 == 0) and (points.size >= 6)

the result is {'0_hmean-iou:recall': 0.7199528672427337, '0_hmean-iou:precision': 0.8328032712403453, '0_hmean-iou:hmean': 0.7722772277227723}

There is almost no difference between the two changes: precision is identical, and recall and hmean differ by less than 0.001.
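As a sanity check on these numbers, hmean is the harmonic mean of precision and recall (the usual F1 score). The sketch below recomputes it from the reported precision and recall; the function name `hmean` and the run labels are only illustrative.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the two evaluation results above.
runs = {
    "skip masks with < 8 coords": (0.8328032712403453, 0.7208022021234762),
    "relax threshold to 6 coords": (0.8328032712403453, 0.7199528672427337),
}

for name, (precision, recall) in runs.items():
    print(f"{name}: hmean = {hmean(precision, recall):.4f}")
```

Both recomputed values agree with the reported hmean figures to four decimal places (0.7728 and 0.7723).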

For better results, I changed the unclip_ratio value, as shown in the table below.

unclip_ratio | hmean (totaltext, DBNet_r50)
-- | --
1.5 | 75.2%
1.9 | 77.2%
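For context on why unclip_ratio matters: in DB-style postprocessing, each shrunk prediction polygon is expanded outward by an offset distance. The sketch below assumes the formula from the DB paper, distance = area × unclip_ratio / perimeter; the helper names and the sample box are hypothetical, and the real postprocessor applies the offset with a polygon-clipping library rather than computing only the distance.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of edge lengths of a closed polygon."""
    n = len(points)
    return sum(
        math.hypot(points[(i + 1) % n][0] - points[i][0],
                   points[(i + 1) % n][1] - points[i][1])
        for i in range(n)
    )


def unclip_distance(points, unclip_ratio):
    """Offset distance, assuming distance = area * unclip_ratio / perimeter."""
    return polygon_area(points) * unclip_ratio / polygon_perimeter(points)


# Hypothetical 100 x 20 text box: area 2000, perimeter 240.
box = [(0, 0), (100, 0), (100, 20), (0, 20)]
for ratio in (1.5, 1.9):
    print(ratio, round(unclip_distance(box, ratio), 2))
```

Under this assumption a larger unclip_ratio grows every predicted polygon further, which changes how well predictions overlap the ground truth at evaluation time.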

The hmean on totaltext here is much lower than that in the paper (77.2% vs 84.7%). The same problem also appears on ctw1500 (76.1% vs 83.4%). Here are all the changes I made:

  1. learning_rate = 0.007 / 2
  2. batch_size = 8
  3. text_repr_type='poly'
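The halved learning rate is consistent with the linear scaling rule, where the learning rate is scaled in proportion to the batch size. The sketch below assumes the default config was tuned for a total batch size of 16 with lr = 0.007; that reference batch size is an assumption, not something stated in this thread, and `scale_lr` is a hypothetical helper.

```python
def scale_lr(base_lr, base_batch_size, batch_size):
    """Linear scaling rule: learning rate proportional to batch size."""
    return base_lr * batch_size / base_batch_size


# Assumed reference setting: lr 0.007 at batch size 16.
lr = scale_lr(base_lr=0.007, base_batch_size=16, batch_size=8)
print(lr)
```

If the reference batch size were different, the appropriate learning rate for batch size 8 would change accordingly.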

I can't figure out what the problem is, and I would appreciate any advice.

ming-eng commented 2 years ago

What can I do about the poor results on total-text?

ahsdx commented 2 years ago

My results are also poor, much lower than in the paper, and I haven't found the cause yet.

ming-eng commented 2 years ago

OK, thanks for your reply. What is the best accuracy you have reached on total-text with dbnet? On ic15 with resnet50 I get 80.1.

ming-eng commented 2 years ago

I trained fcenet on a single GPU. The loss was normal for the first ten or so epochs, and then it suddenly became nan.

ahsdx commented 2 years ago

Sorry, I haven't tried training fcenet on totaltext; I have only tried DB.

ahsdx commented 2 years ago

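Compared with yours, mine is around 78 (batchsize=8, lr=0.0035, pretrained on synthtext, single-GPU training). How did you train yours? Could you share the details? I'd like to see whether the training strategy is the cause.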

ming-eng commented 2 years ago

I used the default config file, with no pretraining. I only trained for about 600 epochs, and it seemed to saturate.

ming-eng commented 2 years ago

image

ming-eng commented 2 years ago

image

ming-eng commented 2 years ago

image

ahsdx commented 2 years ago

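OK, thanks. My training behaves the same way: it saturates very quickly, even a bit faster than yours, possibly because my batchsize is smaller than yours. Why are the results so poor on curved-text datasets such as totaltext and ctw1500? I asked the maintainers, and they haven't resolved it either.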

ming-eng commented 2 years ago

What baffles me is how some people reach 83 without pretraining. It feels like a joke.

ming-eng commented 2 years ago

That result used non-deformable convolution and ImageNet pretraining, without loading SynthText-pretrained weights.

ahsdx commented 2 years ago

Someone can reach 83? Really? I haven't seen that.

ming-eng commented 2 years ago

It's true. I don't know which code they used; I saw it in a paper. You can search for it.

ahsdx commented 2 years ago

OK, thanks.

ming-eng commented 2 years ago

You're welcome. I have run several dbnet codebases by now.

ming-eng commented 2 years ago

Have you run ctw1500?

ahsdx commented 2 years ago

Yes. The results are basically the same as on totaltext.

ming-eng commented 2 years ago

Could it be a dataset problem, such as a conversion error?

ahsdx commented 2 years ago

No, the dataset conversion is fine. The trained results are about the same as on totaltext, and my training also converged very early.

ming-eng commented 2 years ago

Here is the link I used, take a look: https://drive.google.com/file/d/1YbohYSs4T6yyVMEYCpr18fzKiUWzYVOe/view

ming-eng commented 2 years ago

Have you tried running pannet on ctw1500?

ahsdx commented 2 years ago

No, I have only run db.

ming-eng commented 2 years ago

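I think you could test with pannet. I am currently testing on ic15. The ctw1500 dataset seems to have some problems: during conversion, the labels and other details look different.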

ming-eng commented 2 years ago

image

This file seems to be different from the dataset I downloaded.

ming-eng commented 2 years ago

image

ghost commented 2 years ago

image

Hello, could you share the totaltext dataset? The dataset I got from the official conversion procedure causes errors when training DBNet.

ahsdx commented 2 years ago

This is the totaltext label file I generated. As for the dataset images, just download them by following the tutorial.

ghost commented 2 years ago

Hi, can you email me the TotalText dataset? I can't find the label file. Thank you again. (2109589324@qq.com)

YanHao22 commented 1 year ago

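2022-09-30 12:21:03,745 - mmocr - INFO - Epoch(val) [20][300] 0_hmean-iou:recall: 0.6971, 0_hmean-iou:precision: 0.8369, 0_hmean-iou:hmean: 0.7606

I ran total-text with the dbnet script provided in mmocr. I downloaded the res18 pretrained model linked on GitHub and trained for another 20 epochs. The final f-score (that is the same thing as hmean, right?) is 0.7606, which feels far from the 83% in the paper, and training is also slow. Is there anything to watch out for in training, such as parameter settings?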

ahsdx commented 1 year ago

OK, thank you.

ahsdx commented 1 year ago

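I'm not sure about that. I downloaded the dataset and labels as the official instructions describe and didn't pay attention to the format.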
