HAMA-DL-dev opened 1 year ago
Hi, as far as I've experienced, mmaction uses [x1, y1, x2, y2] format in most of its implementations.
Sorry for not reading the description there. I read it just after uploading this issue and modified my annotation data accordingly. However, nothing changed after training with the modified annotations.
The task I use tail-light detection for is spatio-temporal action detection, so I ran inference with a custom-trained model following this documentation.
The demo requires configuring '--det-config' and '--det-checkpoint', and I left these options at their defaults. But my model focuses on detecting cars, specifically their tail lights. So I suspect the bad results are caused by incorrect '--det-config' and '--det-checkpoint' settings. Do I need a separate custom training run for these two configurations via mmdetection?
@HAMA-DL-dev, I think in order to do spatio-temporal inference on your custom dataset (car tail lights), you will need two models (two config files): one for detection from mmdetection and another from mmaction2.
'--det-config' should point to your mmdetection config file and '--det-checkpoint' to the mmdetection checkpoint file.
From what I can deduce from the demo_spatiotemporal_det.py code, action detection only happens on frames where the detector actually detected something.
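The gating that eliethesaiyan describes (the action classifier only runs when the detector returns boxes above the score threshold) could be sketched like this. This is a simplified illustration, not the actual demo code; `detect`, `classify_actions`, and `run_stdet` are hypothetical names:

```python
def run_stdet(frames, detect, classify_actions, det_score_thr=0.8):
    """Sketch of a two-stage spatio-temporal detection loop:
    a detector proposes boxes first, and the action classifier
    runs only on frames with surviving detections."""
    results = []
    for frame in frames:
        # Stage 1: object/human detection, filtered by --det-score-thr
        boxes = [b for b in detect(frame) if b["score"] >= det_score_thr]
        if not boxes:
            # No detections -> no action predictions for this frame
            results.append([])
            continue
        # Stage 2: action classification on the detected boxes
        results.append(classify_actions(frame, boxes))
    return results
```

This is why a detector trained on the wrong classes (e.g., the default human detector instead of a car/tail-light detector) yields empty or misplaced boxes: stage 2 never gets useful proposals to classify.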
@eliethesaiyan
Thanks for your advice. I tried training mmdetection on my custom dataset. The results look better than before. Though there are still some problems (e.g., overfitting), I can fix these by modifying the dataset. Once the custom training fully succeeds, I will share the custom training process and the purpose of the proposal value (since I saw your comment asking about the usage of the proposal file).
Check List
I have read related issues, such as 'Worse results after train on custom classes?', but could not get the expected help.
My custom dataset and configuration
I uploaded these in an issue a day earlier.
Result
Yesterday, I succeeded in running inference on a walking-pedestrian video with the SlowFast and SlowOnly configs based on the AVA dataset, but failed to run inference on a tail-light video with the model pre-trained on my custom dataset.
To summarize, the location of the predicted bounding box was wrong, and another inference output video had no bounding box at all, meaning no object was detected. The CSV file seems normal, but I would like your advice on the result below.
Part of the CSV
Trial
Using the command (note the flag is '--config', not '--configs'):

```shell
python demo/demo_spatiotemporal_det.py --video {inference_video.mp4} \
    --config configs/{my_config} --checkpoint {my_checkpoint.pth} \
    --det-score-thr 0.8 --action-score-thr 0.5 \
    --label-map ${my_data_annotaions}/label_map.txt \
    --predict-stepsize 8 --output-stepsize 4 --output-fps 6
```
Questions and my suspects
I did not use the '--validate' option when training the model because of an error: AttributeError at ${mmaction_envs}/lib/python3.7/site-packages/torch/utils/data/dataset.py", line 83, in __getattr__. Could this have resulted in bad training output?
Is the bounding-box value format [x1, y1, x2, y2] or [x_center, y_center, w, h]? My proposal file contains bounding-box information: not only the bbox coordinates but also a confidence value, which I obtained using YOLOv5 with the '--save-txt' and '--save-conf' options. YOLOv5 writes a text file of bounding boxes in the latter format, [x_center, y_center, w, h].
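Since YOLOv5's '--save-txt' output is normalized [x_center, y_center, w, h] while the [x1, y1, x2, y2] format is absolute pixel coordinates, a conversion step is needed when building the proposal file. A minimal sketch (the helper name `yolo_to_xyxy` is hypothetical, not part of either library):

```python
import numpy as np

def yolo_to_xyxy(boxes, img_w, img_h):
    """Convert normalized YOLO [x_center, y_center, w, h] rows to
    absolute [x1, y1, x2, y2] pixel coordinates."""
    boxes = np.asarray(boxes, dtype=float)
    # Scale normalized values back to pixel units
    cx = boxes[:, 0] * img_w
    cy = boxes[:, 1] * img_h
    w = boxes[:, 2] * img_w
    h = boxes[:, 3] * img_h
    # Shift from center/size to corner coordinates
    x1 = cx - w / 2
    y1 = cy - h / 2
    x2 = cx + w / 2
    y2 = cy + h / 2
    return np.stack([x1, y1, x2, y2], axis=1)
```

For example, on a 100x200 image, the normalized box [0.5, 0.5, 0.5, 0.5] becomes [25, 50, 75, 150]. Any confidence value from '--save-conf' can simply be appended as a fifth column after the conversion.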