Visualizing the training loss curve

todesti2 commented 1 year ago

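Hello! I'm using the AutoDL cloud server platform and would like to know how to tell whether my model is overfitting during training. I have been trying to plot the training and validation loss curves for a long time without success. Looking forward to your reply!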

lyuwenyu commented 1 year ago

Looking at the evaluation metrics on the validation set also works.

todesti2 commented 1 year ago

Do you mean looking at things like the loss values printed during training?

lyuwenyu commented 1 year ago

No, I mean looking at the mAP on the validation set.

todesti2 commented 1 year ago

Thank you for your reply, but it seems the validation mAP alone can't tell me whether the model is overfitting ┭┮﹏┭┮

lyuwenyu commented 1 year ago

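About your question: a simple approach is to run the test set through the model in train mode to get the loss (replacing the data augmentation with the one used in eval mode).

About overfitting: as for whether to judge by the test-set loss or the test-set mAP, I think the test-set mAP is the more useful of the two.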
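As a rough illustration of reading the two signals together, the sketch below takes per-epoch training loss and validation (or test) mAP and reports the epoch at which mAP peaked, provided the training loss kept falling afterwards. That pattern is a common sign of overfitting. The function names, the `patience` parameter, and the output path are hypothetical and are not part of RT-DETR or PaddleDetection; the numbers have to be collected from your own training logs.

```python
def find_overfit_epoch(train_loss, val_map, patience=2):
    """Return the index of the best-mAP epoch if overfitting is suspected, else None.

    Overfitting is suspected when mAP has not improved for at least
    `patience` epochs after its peak while training loss kept decreasing.
    """
    if len(train_loss) != len(val_map):
        raise ValueError("train_loss and val_map must have the same length")
    if not val_map:
        return None
    # Index of the first epoch that reached the highest mAP.
    best = max(range(len(val_map)), key=lambda i: val_map[i])
    epochs_since_best = len(val_map) - 1 - best
    if epochs_since_best < patience:
        return None
    # Is the training loss lower now than it was at the mAP peak?
    loss_still_falling = train_loss[-1] < train_loss[best]
    return best if loss_still_falling else None


def plot_curves(train_loss, val_map, path="curves.png"):
    """Save training loss and mAP on twin y-axes (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed on a cloud server
    import matplotlib.pyplot as plt

    epochs = range(1, len(train_loss) + 1)
    fig, ax_loss = plt.subplots()
    ax_loss.plot(epochs, train_loss, color="tab:blue")
    ax_loss.set_xlabel("epoch")
    ax_loss.set_ylabel("train loss")
    ax_map = ax_loss.twinx()  # second y-axis sharing the same x-axis
    ax_map.plot(epochs, val_map, color="tab:orange")
    ax_map.set_ylabel("mAP")
    fig.savefig(path)
    plt.close(fig)
```

For example, `find_overfit_epoch([5.0, 4.0, 3.2, 2.7, 2.3, 2.0], [0.20, 0.28, 0.33, 0.32, 0.31, 0.30])` returns `2`, the zero-based index of the epoch where mAP peaked.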

todesti2 commented 1 year ago

About overfitting:

index created!
0%| | 0/1610 [00:00<?, ?it/s]
W0802 14:45:45.821064 1818 gpu_resources.cc:275] WARNING: device: . The installed Paddle is compiled with CUDNN 8.2, but CUDNN version in your machine is 8.1, which may cause serious incompatible bug. Please recompile or reinstall Paddle with compatible CUDNN version.
100%|██████████| 1610/1610 [01:14<00:00, 21.65it/s]
[08/02 14:47:07] ppdet.metrics.metrics INFO: The bbox result is saved to bbox.json.
[08/02 14:47:07] ppdet.metrics.metrics INFO: The bbox result is saved to output/bbox.json and do not evaluate the mAP.

Hello, no mAP was output when I ran inference on the test set.

todesti2 commented 1 year ago

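About your suggestion: hello, what does "train mode" mean in "run the test set through the model in train mode to get the loss"? Sorry, I find it a bit hard to follow. trainer.py contains many statements along the lines of model=train, and I can't quite sort them out.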

lyuwenyu commented 1 year ago

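The current code does not support this, because at the moment the loss is only output during training. My suggestion above simply means swapping the training data for the test data and running it once.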
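The idea of running held-out data through the training forward pass could be sketched roughly as follows. This is a framework-agnostic sketch, not code from this repository: `average_loss` is a hypothetical helper, and it assumes that calling the model on a batch in train mode returns a dict of losses with a total under the key `"loss"`, which should be checked against your own version of the code.

```python
import math


def average_loss(model, batches, loss_key="loss"):
    """Average model(batch)[loss_key] over all batches; no optimizer step is taken."""
    total, count = 0.0, 0
    for batch in batches:
        outputs = model(batch)  # assumed to return a dict of losses in train mode
        total += float(outputs[loss_key])
        count += 1
    # With no batches there is nothing to average.
    return total / count if count else math.nan
```

With Paddle, this would typically be called inside `with paddle.no_grad():` after `model.train()`, on a data loader built from the test images with eval-style preprocessing. Note that normalization layers may still update their running statistics in train mode, so it is safer to do this on a copy of the weights.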

lyuwenyu commented 1 year ago

Just evaluate bbox.json with tools/eval.py.

todesti2 commented 1 year ago

Hello, I just tried evaluating the bbox.json produced by inference on the test set, but it raised an error:

Loading and preparing results...
Traceback (most recent call last):
  File "tools/eval.py", line 199, in <module>
    main()
  File "tools/eval.py", line 195, in main
    run(FLAGS, cfg)
  File "tools/eval.py", line 130, in run
    json_eval_results(
  File "/root/RT-DETR/rtdetr_paddle/ppdet/metrics/coco_utils.py", line 186, in json_eval_results
    cocoapi_eval(v_json, coco_eval_style[i], anno_file=anno_file)
  File "/root/RT-DETR/rtdetr_paddle/ppdet/metrics/coco_utils.py", line 107, in cocoapi_eval
    coco_dt = coco_gt.loadRes(jsonfile)
  File "/root/miniconda3/envs/paddlepaddle/lib/python3.8/site-packages/pycocotools/coco.py", line 327, in loadRes
    assert set(annsImgIds) == (set(annsImgIds) & set(self.getImgIds())),
AssertionError: Results do not correspond to current coco set

I'm using the VisDrone dataset. After modifying some of the configuration I get reasonably good results, but I don't know why this dataset-mismatch error appears. The command I ran was:

python -u tools/eval.py -c configs/rtdetr/rtdetr_hgnetv2_x_6x_coco.yml --json_eval
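The assertion in the traceback fires when the results file contains an `image_id` that is absent from the annotation file being evaluated against. One way to check this is to compare the two ID sets directly. The sketch below uses only the standard COCO JSON layout (a list of detections with `image_id`, and an annotation file with an `images` list); the function name and file paths are hypothetical.

```python
import json


def find_unknown_image_ids(result_json, anno_json):
    """Return sorted image ids present in the results but missing from the annotations."""
    with open(result_json) as f:
        results = json.load(f)  # COCO results: a list of detection dicts
    with open(anno_json) as f:
        anno = json.load(f)     # COCO annotations: a dict with an "images" list
    result_ids = {det["image_id"] for det in results}
    anno_ids = {img["id"] for img in anno["images"]}
    return sorted(result_ids - anno_ids)
```

A non-empty return value would suggest that bbox.json was produced from a different image set than the annotation file the eval config points to, for instance a test split evaluated against validation annotations.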

lyuwenyu commented 1 year ago

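For missing features or usage questions, I suggest asking in paddledet (PaddleDetection). Here we give priority to algorithm-related discussion.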

todesti2 commented 1 year ago

OK! Thank you!

lyuwenyu / RT-DETR

Visualizing the training loss curve #20