happyjin opened this issue 5 years ago
Hi @happyjin
12 epochs as specified in https://github.com/open-mmlab/mmaction/blob/master/configs/ava/ava_fast_rcnn_nl_r50_c4_1x_kinetics_pretrain_crop.py.
You can use the default setting file to get the score. If you see any difference, let me know in this issue.
The LR should be proportional to your effective number of samples. In your case, try halving your LR and see if it works out.
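The linear-scaling rule suggested above can be sketched as follows. This is an illustrative helper, not code from the repository; the base batch size of 16 assumes the default 8-GPU setup with 2 videos per GPU mentioned elsewhere in this thread:

```python
def scale_lr(base_lr, base_batch, gpus, videos_per_gpu):
    """Scale the learning rate linearly with the effective batch size."""
    effective_batch = gpus * videos_per_gpu
    return base_lr * effective_batch / base_batch

# On 4 GPUs with 2 videos each, the effective batch is half of the
# default 16, so the LR from the config should be halved as well.
print(scale_lr(0.02, base_batch=16, gpus=4, videos_per_gpu=2))  # → 0.01
```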
Hi @zhaoyue-zephyrus, thanks for the amazing repository. I tried the default settings, but with 4 GPUs. I trained for ~2.5 days (12 epochs) with the learning rate halved. Unfortunately, I get only 0.04 mAP.
Things I changed: halved the learning rate, doubled the warmup iterations, doubled the warmup steps.
My guess (I have yet to experiment) is that the number of epochs needs to be doubled.
@TheShadow29 Sorry I haven't got enough facilities on hand to reproduce that. I will figure it out a little bit later.
@zhaoyue-zephyrus I tried with more epochs (another 10 epochs); the result is still the same, mAP ~0.03.
I also noticed the following in the log just before the scores are reported:
2019-07-17 10:43:49,392 - INFO - The following classes have no ground truth examples: [ 2 16 18 19 21 23 25 31 32 33 35 39 40 42 44 50 53 55 71 75]
I am not sure how to interpret this.
Also, do you have the log file from training the AVA model? Comparing the logs might reveal something.
Again, cheers for creating such an amazing repository and thank you for your patience.
@TheShadow29
Hi, the line `[ 2 16 18 19 21 23 25 31 32 33 35 39 40 42 44 50 53 55 71 75]`
means that AVA is evaluating 60 classes out of the whole 80; the listed classes have no ground-truth examples and are skipped. You don't need to worry about it.
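For intuition, here is a minimal sketch of how an evaluator can average AP only over classes that have ground-truth examples. This is a simplified illustration, not the actual AVA evaluation code; the class IDs and AP values are made up:

```python
import math

def mean_ap(per_class_ap):
    """Average AP over classes, skipping those with no ground truth (NaN)."""
    valid = [ap for ap in per_class_ap.values() if not math.isnan(ap)]
    return sum(valid) / len(valid)

# Classes 2 and 16 have no ground-truth boxes in the eval split, so
# they are reported and excluded from the mean, not counted as 0.
aps = {1: 0.30, 2: float('nan'), 15: 0.10, 16: float('nan')}
print(mean_ap(aps))  # → 0.2
```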
Before we figure out the training issue, could you please first run the testing code to check if you could reproduce the reported results? I will try to reproduce your configuration soon.
@TheShadow29 BTW, what GPU are you using? Did you halve `videos_per_gpu` as well?
Yes, I used the testing code, and with your pretrained model from the model zoo I get 21 mAP:

`PascalBoxes_Precision/mAP@0.5IOU = 0.21313359468022483`

I am using 4 GPUs, each a 1080Ti, with `videos_per_gpu=2` and `workers_per_gpu=2`. The config file is unchanged for the model and data parts. Here is the optimizer config:
```python
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=1e-6)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=1000,
    warmup_ratio=1.0 / 4,
    step=[16, 22])
checkpoint_config = dict(interval=1)
# yapf:disable
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
# runtime settings
total_epochs = 24
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/ava_fast_rcnn_nl_r50_c4_1x_f32s2_kinetics_pretrain_crop_multiscale'
load_from = None
resume_from = None
workflow = [('train', 1)]
```
I performed a diff between the original config and my current config; the differences are `lr=0.02`, `warmup_iters=1000`, `step=[16, 22]`, and `total_epochs=24`.
How can I achieve 21.3 mAP? I use 8 GPUs, CUDA 10.0, Python 3.7, PyTorch 1.1.0. Thank you.
Have you solved your problem? I cannot reproduce the 21.3 mAP results. Thanks.
> How can I achieve 21.3 mAP? I use 8 GPUs, CUDA 10.0, Python 3.7, PyTorch 1.1.0. Thank you.
hi, have you ever reproduced the 21.3 mAP results?
> How can I achieve 21.3 mAP? I use 8 GPUs, CUDA 10.0, Python 3.7, PyTorch 1.1.0. Thank you.

> hi, have you ever reproduced the 21.3 mAP results?
It does not get over 16.35 mAP. Have you trained it?
> How can I achieve 21.3 mAP? I use 8 GPUs, CUDA 10.0, Python 3.7, PyTorch 1.1.0. Thank you.

> hi, have you ever reproduced the 21.3 mAP results?
It does not get over 0.16 mAP. Have you trained it?
I only get 0.3~0.4 mAP. When I use another network as the backbone, I can get about 12 mAP.
Hi, thanks for your contribution of mmaction, which is an awesome open-source project on GitHub! I have a few questions about the performance of the AVA model in the model zoo. My questions are: