yjh0410 / YOWOv2

The second generation of YOWO action detector.
MIT License

Validation mAP stays at 0 after training for 10 epochs on the AVA dataset #9

Open YRVGFO9588 opened 1 year ago

Batman-97 commented 1 year ago

Yes, I have the same problem. I am trying to debug the code and figure out where the issue is.

I think the problem is in the bounding-box class.

YRVGFO9588 commented 1 year ago

> Yes, I have the same problem. I am trying to debug the code and figure out where the issue is.
>
> I think the problem is in the bounding-box class.

Have you found the cause yet?

yjh0410 commented 1 year ago

@YRVGFO9588 @Batman-97 I cannot reproduce this problem in my test environment; I always obtain valid mAP values. The 0 mAP you are both seeing is most likely caused by the format of your dataset.

YRVGFO9588 commented 1 year ago

> @YRVGFO9588 @Batman-97 I cannot reproduce this problem in my test environment; I always obtain valid mAP values. The 0 mAP you are both seeing is most likely caused by the format of your dataset.

During training I found that the classification loss never comes down; it stays around 2.3, while the bounding-box loss is around 0.15. This is after 50 epochs of training. Would it be convenient to add you on WeChat for some guidance?

yjh0410 commented 1 year ago

@YRVGFO9588 A cls loss in the 1.2–2.0 range and a reg_loss in the 0.2–0.4 range are perfectly normal.

YRVGFO9588 commented 1 year ago

> @YRVGFO9588 @Batman-97 I cannot reproduce this problem in my test environment; I always obtain valid mAP values. The 0 mAP you are both seeing is most likely caused by the format of your dataset.

Or could you provide an example of the expected dataset format? It doesn't need to be the full dataset; it could be reduced to just a single video.

YRVGFO9588 commented 1 year ago

> @YRVGFO9588 A cls loss in the 1.2–2.0 range and a reg_loss in the 0.2–0.4 range are perfectly normal.

How many epochs of training do these loss values correspond to?

yjh0410 commented 1 year ago

@YRVGFO9588 The data is all on a server and pulling it is troublesome, so it is not convenient for me to provide it; the AVA dataset preparation steps are already given in the README. Your loss values look fine, and you don't need to train for as long as 50 epochs. I suggest first taking the pretrained YOWOv2 model provided by this project and reproducing its performance metrics on the test set, to make sure your data format is prepared correctly.
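For reference, such an evaluation run looks like the command used later in this thread (the weight path is a placeholder, and the -ct confidence threshold is discussed further down):

    python eval.py --cuda -d ava_v2.2 -v yowo_v2_medium -bs 16 --weight path/to/yowo_v2_medium_ava.pth -ct 0.005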

Batman-97 commented 1 year ago

@yjh0410 Hi,

What is this class for?

class BoundingBox:
    def __init__(self,
                 imageName,
                 classId,
                 x,
                 y,
                 w,
                 h,
                 typeCoordinates=None,
                 imgSize=None,
                 bbType=None,
                 classConfidence=None,
                 format=None):

I added a constant (epsilon = 1e-5) to the program (YOWOv2/evaluator/cal_frame_mAP.py). Because TP and FP were zero, I was getting an error during the initial training steps.

    # compute precision, recall and average precision
    acc_FP = np.cumsum(FP)
    acc_TP = np.cumsum(TP)
    rec = acc_TP / (npos + epsilon)
    prec = np.divide(acc_TP, (acc_FP + acc_TP + epsilon))
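
For context, here is a minimal, self-contained sketch of the same precision/recall/AP computation with the epsilon guard. The variable names (TP, FP, npos, epsilon) follow the snippet above; the 11-point interpolation at the end is just one common convention and not necessarily what cal_frame_mAP.py uses:

    import numpy as np

    def average_precision(TP, FP, npos, epsilon=1e-5):
        # TP, FP: 0/1 flags per detection, sorted by descending confidence.
        # npos  : number of ground-truth boxes for this class.
        # epsilon guards the divisions when npos or TP + FP is zero.
        acc_FP = np.cumsum(FP)
        acc_TP = np.cumsum(TP)
        rec = acc_TP / (npos + epsilon)
        prec = np.divide(acc_TP, (acc_FP + acc_TP + epsilon))

        # 11-point interpolated AP (illustrative choice of interpolation).
        ap = 0.0
        for t in np.linspace(0.0, 1.0, 11):
            p = prec[rec >= t].max() if np.any(rec >= t) else 0.0
            ap += p / 11.0
        return rec, prec, ap

    # Example: 4 detections (3 true positives, 1 false positive), 3 GT boxes.
    rec, prec, ap = average_precision(np.array([1, 1, 0, 1]), np.array([0, 0, 1, 0]), npos=3)

The epsilon keeps the divisions well defined even when npos or acc_TP + acc_FP is zero, which matches the workaround described above.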

Batman-97 commented 1 year ago

@yjh0410 Hi,

I used the ucf24 dataset format, but I have two action classes, and my dataset has both actions in a single frame. Training runs fine: the total loss started at 11.3 and by epoch 5 it is down to 6.54. What should I conclude from this? Also, at every epoch the validation average precision for one class is 0, while the other class has some value, but it is very low, around 5%.

Thanks in advance.

YCA-eng commented 1 year ago

@YRVGFO9588 Hi, have you managed to download the AVA dataset?

vron8632 commented 5 months ago

With my own dataset the mAP is always 0. Is my dataset built incorrectly? I built it following the ava2.2 format. Could you give me some paid guidance over QQ (mine is 393974615)?

tapohongchen commented 5 months ago

I ran into the same problem: everything looks normal during training, but at test time the mAP is 0. I went to debug and compare, and found that the reg_pred values at test time are generally below 0.1, roughly 10 times smaller than during training, so I suspect the error is there. However, the code reaching that point is the same in both cases, and I don't know why this happens. The first screenshot shows reg_pred during training, the second during testing.
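A minimal sketch of this kind of comparison, assuming you attach a forward hook to the regression head and run the same input in train and eval mode (the Conv2d below is only a stand-in; the real module path inside YOWOv2 will differ):

    import torch
    import torch.nn as nn

    # Stand-in for the regression head; hook the actual head in the real model.
    reg_head = nn.Conv2d(16, 4, kernel_size=1)

    def stats_hook(tag):
        def hook(module, inputs, output):
            out = output.detach()
            print(f"{tag}: mean={out.mean().item():+.4f}  absmax={out.abs().max().item():.4f}")
        return hook

    x = torch.randn(1, 16, 7, 7)

    for tag, set_mode in (("train", reg_head.train), ("eval", reg_head.eval)):
        set_mode()
        handle = reg_head.register_forward_hook(stats_hook(tag))
        reg_head(x)
        handle.remove()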

AKASH2907 commented 1 month ago

Was this problem resolved? I'm facing the same issues as @tapohongchen. Even when using the pretrained weights, we get these f-mAP scores: mAP@0.5 - 0.021 (medium), mAP@0.5 - 0.014 (nano).

It's off by one decimal place. Is this normal, or is there some error?

tapohongchen commented 1 month ago

> Was this problem resolved? I'm facing the same issues as @tapohongchen. Even when using the pretrained weights, we get these f-mAP scores: mAP@0.5 - 0.021 (medium), mAP@0.5 - 0.014 (nano).
>
> It's off by one decimal place. Is this normal, or is there some error?

Yep, I found that the cause is 'conf_thresh'. I went to debug and found that 0.1 is too high; I replaced it with 0.005 (you can tune the exact value of this hyperparameter yourself by debugging).
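For illustration, here is a minimal sketch of what the confidence threshold does before evaluation; the names (boxes, scores, conf_thresh) are generic placeholders rather than the exact variables in YOWOv2:

    import numpy as np

    def filter_by_confidence(boxes, scores, conf_thresh):
        # Keep only detections whose score reaches the threshold.
        keep = scores >= conf_thresh
        return boxes[keep], scores[keep]

    # Per-class action scores on AVA are often well below 0.1, so a 0.1
    # threshold can discard nearly every detection before evaluation.
    scores = np.array([0.08, 0.04, 0.012, 0.006])
    boxes = np.random.rand(4, 4)

    for t in (0.1, 0.005):
        kept_boxes, kept_scores = filter_by_confidence(boxes, scores, t)
        print(f"conf_thresh={t}: kept {len(kept_scores)} of {len(scores)} detections")

With a threshold of 0.1 almost nothing survives to be scored, which is consistent with the near-zero mAP reported above.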

sirsh07 commented 1 month ago

> Was this problem resolved? I'm facing the same issues as @tapohongchen. Even when using the pretrained weights, we get these f-mAP scores: mAP@0.5 - 0.021 (medium), mAP@0.5 - 0.014 (nano). It's off by one decimal place. Is this normal, or is there some error?
>
> Yep, I found that the cause is 'conf_thresh'. I went to debug and found that 0.1 is too high; I replaced it with 0.005 (you can tune the exact value of this hyperparameter yourself by debugging).

When running evaluation with the pretrained AVA weights, I obtained the results below. I tried using a confidence threshold of 0.005, but the output remained the same. Any idea what might be causing this?

(smenv_v2) sirsh:~/YOWOv2$ python eval.py --cuda -d ava_v2.2 -v yowo_v2_medium -bs 16 --weight /home/sirsh/semi_stn/newyowo/YOWOv2/weights/yowo_v2_medium_ava.pth -ct 0.005
use cuda
==============================
Dataset Config: AVA_V2.2 
==============================
Model Config: YOWO_V2_MEDIUM 
==============================
Build YOWO_V2_MEDIUM ...
==============================
2D Backbone: YOLO_FREE_LARGE
--pretrained: False
==============================
FPN: pafpn_elan
==============================
Head: Decoupled Head
==============================
Head: Decoupled Head
==============================
Head: Decoupled Head
==============================
3D Backbone: SHUFFLENETV2
--pretrained: False
==============================
Head: Decoupled Head
==============================
Head: Decoupled Head
==============================
Head: Decoupled Head
Finished loading model!
Finished loading image paths from: /home/c3-0/datasets/ava_v2/AVA_Dataset/frame_lists/val.csv
Finished loading image paths from: /home/c3-0/datasets/ava_v2/AVA_Dataset/frame_lists/val.csv
Finished loading annotations from: /home/c3-0/datasets/ava_v2/AVA_Dataset/annotations/ava_v2.2/ava_val_v2.2.csv
Number of unique boxes: 21082
Number of annotations: 54667
=== AVA dataset summary ===
Train: False
Number of videos: 58
Number of frames: 1567751
Number of key frames: 11360
Number of boxes: 21082.
/home/sirsh/smenv_v2/lib/python3.8/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
[0 / 710]
[100 / 710]
[200 / 710]
[300 / 710]
[400 / 710]
[500 / 710]
[600 / 710]
[700 / 710]
Evaluating with 11360 unique GT frames.
Evaluating with 11360 unique detection frames
{ 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/answer phone': 0.03360534550284231,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/bend/bow (at the waist)': 0.017021509374934744,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/carry/hold (an object)': 0.04933759872247631,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/climb (e.g., a mountain)': 0.0,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/close (e.g., a door, a box)': 0.013263055950852866,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/crouch/kneel': 0.03126368869019352,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/cut': 0.0007249965921216991,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/dance': 0.3304812494122996,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/dress/put on clothing': 0.0011319480385372834,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/drink': 0.0008230892191382596,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/drive (e.g., a car, a truck)': 0.017637193070371985,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/eat': 0.016180535013938183,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/enter': 0.002648826549849559,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/fall down': 0.0022827852996272695,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/fight/hit (a person)': 0.0037453699404711794,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/get up': 0.0013529197247960404,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/give/serve (an object) to (a person)': 0.0017127284922080824,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/grab (a person)': 0.0018730445914046383,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/hand clap': 0.004119022057361893,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/hand shake': 0.0011285156957915805,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/hand wave': 0.0003226532537776459,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/hit (an object)': 0.0008206562018924411,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/hug (a person)': 0.0034877073563019494,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/jump/leap': 8.153077746258498e-05,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/kiss (a person)': 0.003446317879569093,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/lie/sleep': 0.01772371239758181,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/lift (a person)': 0.0017785269632455637,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/lift/pick up': 0.004818065828211607,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/listen (e.g., to music)': 0.0022780665205551043,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/listen to (a person)': 0.0707498616132757,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/martial art': 0.00042850102858999877,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/open (e.g., a window, a car door)': 0.004485382651005732,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/play musical instrument': 0.006072168551350378,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/point to (an object)': 7.014590347923681e-06,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/pull (an object)': 0.00037779099498460916,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/push (an object)': 0.007631318187969781,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/push (another person)': 0.00020091573784744378,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/put down': 0.010051323849379418,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/read': 0.008129381489540068,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/ride (e.g., a bike, a car, a horse)': 0.011424029952804372,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/run/jog': 0.07117408643656717,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/sail boat': 0.0,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/shoot': 2.389828888251601e-05,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/sing to (e.g., self, a person, a group)': 0.05397957771045906,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/sit': 0.06780862204771411,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/smoke': 0.001653182868698183,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/stand': 0.1273598059575706,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/swim': 0.0028074876861856495,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/take (an object) from (a person)': 0.0005990651636672349,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/take a photo': 0.00023649302975024212,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/talk to (e.g., self, a person, a group)': 0.11492799024171699,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/text on/look at a cellphone': 0.0007356242927204215,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/throw': 0.00028059525493451975,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/touch (an object)': 0.009549086770481151,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/turn (e.g., a screwdriver)': 0.007597487258504208,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/walk': 0.046912638484303854,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/watch (a person)': 0.08333139695834825,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/watch (e.g., TV)': 0.009243562086648862,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/work on a computer': 0.0009004171031601323,
  'PascalBoxes_PerformanceByCategory/AP@0.5IOU/write': 0.0024004279921045722,
  'PascalBoxes_Precision/mAP@0.5IOU': 0.0214361632232888}
Save eval results in results/ava_v2.2/ava_detections.json
AVA eval done in 98.005740 seconds.
mAP: 0.0214361632232888