Closed zhujiagang closed 5 years ago
Hi, thanks for your questions! Yes, I think using downsampled frames could affect performance. We train with the short side as large as 320 pixels, and test with the short side at 256 pixels.
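As an aside, "short side 256" preprocessing means scaling each frame so its shorter edge reaches the target length while preserving the aspect ratio. A minimal sketch (the function name and exact rounding are illustrative assumptions, not taken from the repository):

```python
# Hedged sketch: compute output dimensions for short-side resizing.
# Names here are illustrative; the repo's actual resize code may differ.

def short_side_resize_dims(height, width, target_short=256):
    """Return (new_height, new_width) with the short side scaled to target_short."""
    if height <= width:
        scale = target_short / height
        return target_short, int(round(width * scale))
    scale = target_short / width
    return int(round(height * scale)), target_short

# A 240p frame (240x426) upscaled so the short side becomes 256:
print(short_side_resize_dims(240, 426))  # (256, 454)
```

This also illustrates why extracting frames at short side 240 loses information: the 240p frames must be upscaled back to 256, so fine detail present in the original video is gone.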
Does this always happen when testing the 2nd crop? It might be worth rerunning to see whether it's stochastic or transient. It might also be worth reducing the number of data-loading processes (https://github.com/facebookresearch/video-long-term-feature-banks/blob/master/lib/datasets/dataloader.py#L75). Let me know how it goes.
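For context, the knob being suggested caps how many worker processes the data loader spawns. A generic sketch of the pattern, assuming a plain `multiprocessing.Pool` stands in for the repo's loader (the real code in `lib/datasets/dataloader.py` is structured differently):

```python
# Hedged illustration: capping the number of data-loading worker processes.
# `load_clip` is a placeholder, not a function from the repository.
import multiprocessing as mp

NUM_WORKERS = 1  # reduce from a larger default if the OS runs out of processes

def load_clip(index):
    # stands in for decoding and cropping one video clip
    return index * 2

if __name__ == "__main__":
    with mp.Pool(processes=NUM_WORKERS) as pool:
        results = pool.map(load_clip, range(4))
    print(results)  # [0, 2, 4, 6]
```

Fewer workers means slower loading but fewer simultaneous processes and threads, which matters when the crash is a resource-exhaustion error rather than a logic bug.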
Thank you for your suggestions. I am sure it is not stochastic. Following your advice, I reduced the number of processes to 1. Now the code runs well until the 18th (2x3x3) crop, without any speed reduction.
[INFO: checkpoints.py: 401]: Broadcasting gpu_0/lfb_nl2_out_b to
[INFO: checkpoints.py: 406]: |-> gpu_1/lfb_nl2_out_b
[INFO: checkpoints.py: 406]: |-> gpu_2/lfb_nl2_out_b
[INFO: checkpoints.py: 406]: |-> gpu_3/lfb_nl2_out_b
[INFO: checkpoints.py: 406]: |-> gpu_4/lfb_nl2_out_b
[INFO: checkpoints.py: 406]: |-> gpu_5/lfb_nl2_out_b
[INFO: checkpoints.py: 406]: |-> gpu_6/lfb_nl2_out_b
[INFO: checkpoints.py: 406]: |-> gpu_7/lfb_nl2_out_b
[INFO: checkpoints.py: 401]: Broadcasting gpu_0/pred_w to
[INFO: checkpoints.py: 406]: |-> gpu_1/pred_w
[INFO: checkpoints.py: 406]: |-> gpu_2/pred_w
[INFO: checkpoints.py: 406]: |-> gpu_3/pred_w
[INFO: checkpoints.py: 406]: |-> gpu_4/pred_w
[INFO: checkpoints.py: 406]: |-> gpu_5/pred_w
[INFO: checkpoints.py: 406]: |-> gpu_6/pred_w
[INFO: checkpoints.py: 406]: |-> gpu_7/pred_w
[INFO: checkpoints.py: 401]: Broadcasting gpu_0/pred_b to
[INFO: checkpoints.py: 406]: |-> gpu_1/pred_b
[INFO: checkpoints.py: 406]: |-> gpu_2/pred_b
[INFO: checkpoints.py: 406]: |-> gpu_3/pred_b
[INFO: checkpoints.py: 406]: |-> gpu_4/pred_b
[INFO: checkpoints.py: 406]: |-> gpu_5/pred_b
[INFO: checkpoints.py: 406]: |-> gpu_6/pred_b
[INFO: checkpoints.py: 406]: |-> gpu_7/pred_b
[I net_async_base.h:205] Using specified CPU pool size: 32; device id: -1
[I net_async_base.h:210] Created new CPU pool, size: 32; device id: -1
terminate called after throwing an instance of 'std::system_error'
what(): Resource temporarily unavailable
/running_package/video-long-term-feature-banks/job.sh: line 46: 1250 Aborted (core dumped) ${CMD}
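A note on the error above: `std::system_error` with "Resource temporarily unavailable" (EAGAIN) during thread or process creation typically means a per-user process/thread limit was hit, which is consistent with it only appearing after many crops' worth of workers have been spawned. A small sketch for inspecting the relevant limit (the fix would be raising it, e.g. `ulimit -u` in the shell, or spawning fewer workers):

```python
# Hedged sketch: read the soft/hard limits on user processes (RLIMIT_NPROC).
# Hitting the soft limit causes EAGAIN, i.e. "Resource temporarily unavailable".
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print("max user processes: soft=%s hard=%s" % (soft, hard))
```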
I can see the mAP of the previous 17 crops. Perhaps it is because our workstations are different. For now I have decided to use 2 scales (2x2x3=12 crops) to get the final result.
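To make the crop counts concrete: assuming the test crops factor as (flips) x (scales) x (spatial positions), which is an assumption matching the 2x3x3 and 2x2x3 figures in this thread rather than the codebase's exact factorization:

```python
# Hedged sketch: enumerate test-crop combinations as a Cartesian product.
# The factor ordering (flips, scales, positions) is an assumption.
from itertools import product

def num_crops(flips, scales, positions):
    return len(list(product(range(flips), range(scales), range(positions))))

print(num_crops(2, 3, 3))  # 18 crops: full multi-crop testing
print(num_crops(2, 2, 3))  # 12 crops: dropping one scale as a workaround
```

Dropping one scale removes 6 of the 18 crops, so the workaround trades a small amount of test-time augmentation for avoiding the crash.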
Glad that you found a workaround! Let me know if you have further questions.
Thanks for sharing your excellent work and code! I've run the code with the provided trained model (ava_r101_lfb_nl_3l) to evaluate on the AVA 2.1 validation set. For single-crop testing, my mAP is 25.1 vs. your 26.9, perhaps because I use downsampled videos to extract frames (short side 240). When I tried multi-crop testing by setting AVA.TEST_MULTI_CROP to True, as you suggested, I got the following errors. The trained model reaches an mAP of 23.6 on the first crop (AVA results wrote to detections_final_224_shift0_0.850.csv); the errors seem to occur when the second crop is combined.