Training loss does not decrease - Githubissues

MhLiao / DB

A PyTorch implementation of "Real-time Scene Text Detection with Differentiable Binarization".

2.09k stars 479 forks

Open 09876qwrte opened 4 years ago

09876qwrte commented 4 years ago

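Hello, I followed your specifications to train on my own dataset. I changed some parameters as you required and kept the hyperparameters the same as yours. During training I found that the loss did not decrease. What could be causing this? Thanks.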

09876qwrte commented 4 years ago

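Also, I tested the model obtained after three epochs on my training set and found that text detection recall is very low. The overall results are poor, far worse than PSENet. What could be the reason?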

oceanogeology commented 4 years ago

I ran into the same situation as the poster above: the loss decreases very little, and both precision and recall are fairly low.

MhLiao commented 4 years ago

@09876qwrte How many images are in your training set? Did you fine-tune from the pre-trained model?

09876qwrte commented 4 years ago

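@MhLiao About 50,000 images in total. I fine-tuned from the pre-trained model you provided (the SynthText model you released). After training, the results are very poor and text recall is very low.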

cqray1990 commented 4 years ago

@09876qwrte What if your data does not belong to the same distribution? Why did you load a pre-trained model such as SynthText?

09876qwrte commented 4 years ago

@leigaoxiang The author trained this way too; I am simply staying consistent with the author.

cqray1990 commented 4 years ago

Which .yaml file did you use, and is its pre-trained model totaltext_resnet18?

wuzuowuyou commented 4 years ago

Training on IC15 does not converge either. At "epoch: 6, loss: 4.869051, lr: 0.006969" the loss will not go down, and the test txt files are all empty.
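To tell a genuine plateau from ordinary noise, it can help to look at the whole loss curve rather than a single log line. The sketch below parses lines shaped like the one quoted above and compares the mean loss of the last window against the window before it. The log format, the function names `parse_log` and `is_plateau`, and the thresholds are assumptions for illustration, not part of the DB repository.

```python
import re

# Matches lines shaped like "epoch: 6, loss: 4.869051, lr: 0.006969".
# The exact log format is an assumption based on the line quoted above.
FLOAT = r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
LOG_RE = re.compile(
    r"epoch:\s*(\d+),\s*loss:\s*(" + FLOAT + r"),\s*lr:\s*(" + FLOAT + r")"
)


def parse_log(lines):
    """Return (epoch, loss, lr) tuples for every line that matches LOG_RE."""
    records = []
    for line in lines:
        m = LOG_RE.search(line)
        if m:
            records.append((int(m.group(1)), float(m.group(2)), float(m.group(3))))
    return records


def is_plateau(losses, window=50, min_rel_drop=0.02):
    """True if the mean of the last `window` losses is not at least
    `min_rel_drop` (relative) below the mean of the window before it."""
    if len(losses) < 2 * window:
        return False  # not enough data to judge
    prev = sum(losses[-2 * window:-window]) / window
    last = sum(losses[-window:]) / window
    return (prev - last) < min_rel_drop * abs(prev)
```

A typical use would be `is_plateau([loss for _, loss, _ in parse_log(open(path))])`, where `path` points at a training log; the window size should be tuned to how often the trainer logs.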

MhLiao commented 4 years ago

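@wuzuowuyou I think you can troubleshoot along the following lines:

Whether the gt is correct

Whether training on a single image can fit, i.e. use the same image for both training and testing to see whether the model can fit it

Check whether it is a CUDA and PyTorch version problem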

sporterman commented 4 years ago

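@MhLiao Hello, how can I adjust the spacing tolerance when detecting text? Sometimes I want characters with a larger gap between them to be enclosed in a single box, but this model boxes them separately. Is there a parameter I can adjust?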

Wangweilai1 commented 4 years ago

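I ran into the same problem. The training dataset is ICDAR2015 and the gt is correct; the model cannot fit even when trained on a single image. The environment is CUDA 10.1 and PyTorch 1.4.0. The loss hovers around 4 and does not decrease.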
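The single-image test mentioned here can be made repeatable with a small harness. The sketch below is framework-agnostic: `step_fn` is a hypothetical callable that runs one optimization step on the same fixed image each time and returns the loss as a float. The toy step function only stands in for a real trainer so the harness can be run on its own; neither function comes from the DB code base.

```python
def overfit_check(step_fn, steps=200, min_drop_ratio=0.5):
    """Call step_fn() `steps` times on one fixed batch and compare losses.

    step_fn is a hypothetical callable: one optimizer step on the same
    single image, returning the loss. Returns (passed, first_loss, last_loss).
    """
    first = last = None
    for _ in range(steps):
        loss = float(step_fn())
        if loss != loss:  # NaN never equals itself
            return False, first, loss
        if first is None:
            first = loss
        last = loss
    if first is None:  # steps == 0, nothing was run
        return False, None, None
    passed = last <= first * (1.0 - min_drop_ratio)
    return passed, first, last


def make_toy_step(lr=0.1):
    """Toy stand-in for a trainer: gradient descent on (w - 3)^2."""
    state = {"w": 0.0}

    def step():
        loss = (state["w"] - 3.0) ** 2
        state["w"] -= lr * 2.0 * (state["w"] - 3.0)  # gradient step
        return loss

    return step


passed, first, last = overfit_check(make_toy_step(), steps=50)
print(passed, first, last)
```

If a model with this much capacity cannot drive the loss down on one image, the cause is more likely in the data pipeline, labels, or loss wiring than in the hyperparameters.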

Debugerss commented 4 years ago

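I ran into the same problem too. My loss converges to 1.0, but nothing is detected at test time. I trained on 1,300 images. With the model provided on GitHub I can detect something, although the results are not good, but with my own trained model nothing is detected at all.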

whereitogo commented 4 years ago

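I ran into the same problem too. My loss converges to 1.0, but nothing is detected at test time. I trained on 1,300 images. With the model provided on GitHub I can detect something, although the results are not good, but with my own trained model nothing is detected at all.

Yes, my loss is also at 4.9xx, and basically no text is detected! Have you solved it?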

Angel113110 commented 4 years ago

Same question.

Kevin-GJ commented 3 years ago

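Same question. The loss stops converging very quickly and keeps oscillating within a range. The data has been checked and cleaned, and I ran comparison experiments: other algorithms such as PixelLink train to convergence and give good inference results. DBNet has not shown the performance it should; I would appreciate a thorough explanation.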

wangxiaofanw commented 1 year ago

I ran into the same problem. The loss oscillates around 1.0 and cannot go any lower. Looking for a solution.

lipeng1109 commented 1 year ago

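@wuzuowuyou I think you can troubleshoot along the following lines:

Whether the gt is correct

Whether training on a single image can fit, i.e. use the same image for both training and testing to see whether the model can fit it

Check whether it is a CUDA and PyTorch version problem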
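The first check above (whether the gt is correct) can be partly automated. The sketch below assumes ICDAR2015-style annotation lines of the form `x1,y1,x2,y2,x3,y3,x4,y4,transcription`, possibly starting with a UTF-8 BOM; custom datasets may use a different layout. The function names and the `min_area` threshold are illustrative assumptions, not part of the DB repository.

```python
def parse_gt_line(line):
    """Parse one line shaped like 'x1,y1,x2,y2,x3,y3,x4,y4,text'."""
    parts = line.strip().lstrip("\ufeff").split(",")
    if len(parts) < 9:
        raise ValueError(f"expected 8 coordinates and a transcription: {line!r}")
    coords = [int(float(v)) for v in parts[:8]]  # raises ValueError if non-numeric
    points = list(zip(coords[0::2], coords[1::2]))
    text = ",".join(parts[8:])  # the transcription may itself contain commas
    return points, text


def polygon_area(points):
    """Signed polygon area via the shoelace formula."""
    area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        area += x1 * y2 - x2 * y1
    return area / 2.0


def check_gt(lines, width, height, min_area=1.0):
    """Return (line_number, problem) pairs for suspicious annotations."""
    problems = []
    for i, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            points, _text = parse_gt_line(line)
        except ValueError as exc:
            problems.append((i, str(exc)))
            continue
        if abs(polygon_area(points)) < min_area:
            problems.append((i, "degenerate polygon"))
        if any(not (0 <= x <= width and 0 <= y <= height) for x, y in points):
            problems.append((i, "point outside image"))
    return problems
```

This only catches structural problems such as malformed lines, zero-area boxes, and coordinates outside the image. It cannot tell whether a box actually covers the text, so drawing a few polygons onto their images is still worthwhile.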

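Hello, the model I trained has fairly low recall, stuck at around 0.4. I also found that testing on pytorch=1.10 and pytorch=1.12 gives Hmean values that differ by more than 5 points. Is this algorithm really that sensitive to the torch version? My model was trained in a torch=1.12 environment.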
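For context on these numbers, Hmean is the harmonic mean of precision and recall, so a recall stuck near 0.4 bounds it regardless of precision. The short sketch below works this out; the precision values are made-up inputs for illustration, not measurements from this thread.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (the F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Illustrative precision values only; recall is fixed at the 0.4 reported above.
for p in (0.6, 0.8, 1.0):
    print(f"precision={p:.1f} recall=0.4 hmean={hmean(p, 0.4):.4f}")
```

Even with perfect precision, recall of 0.4 caps Hmean at 4/7 (about 0.57), so when comparing torch versions it helps to log precision and recall separately and see which one moved.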