facebookresearch / video-long-term-feature-banks

Long-Term Feature Banks for Detailed Video Understanding
Apache License 2.0

Some questions about AVA val results #29

Closed luckcodingdog closed 4 years ago

luckcodingdog commented 5 years ago

Hi, thanks for your work! I tested the model 'ava_r50_baseline | R50-I3D-NL | 3D CNN | 22.2 | 102760666' on the AVA val set and got an mAP of 20.35 instead of 22.2. I also tested the model 'ava_r101_baseline | R101-I3D-NL | 3D CNN | 23.2 | 102760714' on the AVA val set and got an mAP of 21.6 instead of 23.2. I used the scripts in dataset_tools/ava/ to cut the videos and extract the frames.
The relevant part of my config is listed below:
NUM_GPUS: 4 #8
BATCH_SIZE: 4 #16

The result of ava_r101_baseline is listed below:
[INFO: ava_eval_helper.py: 149]: Evaluating with 50250 unique GT frames.
[INFO: ava_eval_helper.py: 150]: Evaluating with 52391 unique detection frames
{
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/answer phone': 0.6257528901429095,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/bend/bow (at the waist)': 0.3166801822509334,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/carry/hold (an object)': 0.5029330536823506,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/climb (e.g., a mountain)': 0.054609780655019785,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/close (e.g., a door, a box)': 0.10202876673111581,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/crouch/kneel': 0.16645598290916847,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/cut': 0.02002199190449563,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/dance': 0.4825744659510926,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/dress/put on clothing': 0.0468733190313381,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/drink': 0.20589438219829737,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/drive (e.g., a car, a truck)': 0.4084554714434966,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/eat': 0.28239599752680655,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/enter': 0.03786439933681678,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/fall down': 0.07982217877141026,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/fight/hit (a person)': 0.4266387587466585,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/get up': 0.13630186122501567,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/give/serve (an object) to (a person)': 0.06472897845356354,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/grab (a person)': 0.052462974836113486,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/hand clap': 0.23765609955795167,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/hand shake': 0.06387645466559444,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/hand wave': 0.051350562345756426,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/hit (an object)': 0.0032628862264896977,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/hug (a person)': 0.09914906395074286,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/jump/leap': 0.102188041024713,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/kiss (a person)': 0.15806481699191882,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/lie/sleep': 0.36017987226002973,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/lift (a person)': 0.023263182014367438,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/lift/pick up': 0.013348416675665237,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/listen (e.g., to music)': 0.017687284432060597,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/listen to (a person)': 0.6170145606794128,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/martial art': 0.444075762117532,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/open (e.g., a window, a car door)': 0.1507777171362535,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/play musical instrument': 0.22318718703291573,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/point to (an object)': 0.0007438893706970384,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/pull (an object)': 0.006841743074581377,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/push (an object)': 0.012987789694658898,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/push (another person)': 0.04153379696481397,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/put down': 0.01995781637473908,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/read': 0.25817442548955505,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/ride (e.g., a bike, a car, a horse)': 0.33074209266063204,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/run/jog': 0.44980315743974686,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/sail boat': 0.1497165029933184,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/shoot': 0.06722527109538293,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/sing to (e.g., self, a person, a group)': 0.12125091329420612,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/sit': 0.7584989029598016,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/smoke': 0.12543466794804858,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/stand': 0.7940518328351375,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/swim': 0.49234393984648445,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/take (an object) from (a person)': 0.043395584350965834,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/take a photo': 0.010481729869303543,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/talk to (e.g., self, a person, a group)': 0.742146023130601,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/text on/look at a cellphone': 0.010950082540092417,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/throw': 0.011885422545550833,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/touch (an object)': 0.28067212821225074,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/turn (e.g., a screwdriver)': 0.009756634157056627,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/walk': 0.6806207798946917,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/watch (a person)': 0.6481009768155408,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/watch (e.g., TV)': 0.2730737534531456,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/work on a computer': 0.01887325654356591,
'PascalBoxes_PerformanceByCategory/AP@0.5IOU/write': 0.03536596249448319,
'PascalBoxes_Precision/mAP@0.5IOU': 0.21620344031595096
}

Could you give me some advice on how to reproduce your reported results? Thank you!

chaoyuaw commented 5 years ago

Hi @luckcodingdog, Thanks for the questions. The following two discussions might be related:
https://github.com/facebookresearch/video-long-term-feature-banks/issues/15#issuecomment-529210371
https://github.com/facebookresearch/video-long-term-feature-banks/issues/23
Could you give these solutions a try and let me know if they fix the problem for you? Thanks!

luckcodingdog commented 5 years ago

Thanks for your quick reply. I have uninstalled ffmpeg 4.0, installed ffmpeg 2.8.15, and extracted the frames again. I found several differences between the two versions in image size and frame count. For example, 1j20qq1JyX4 has 27031 frames with ffmpeg 4.0 and 27032 frames with ffmpeg 2.8.15. Also, 1j20qq1JyX4_000001.jpg is 16.1 KB with ffmpeg 4.0 and 19.8 KB with ffmpeg 2.8.15. Sadly, both versions give the same mAP of 21.6. Next, I will download the videos again and follow your scripts.
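To find every video whose frame count differs between two extractions, not just 1j20qq1JyX4, something like the following might help. It is only a sketch: it assumes each extraction root holds one subdirectory per video ID with that video's .jpg frames inside, and the function names are made up for illustration.

```python
from pathlib import Path


def count_frames(frames_root):
    """Map each video-id subdirectory of frames_root to its number of .jpg files.

    Assumes a layout like <frames_root>/<video_id>/<video_id>_000001.jpg.
    """
    root = Path(frames_root)
    return {
        video_dir.name: sum(1 for _ in video_dir.glob("*.jpg"))
        for video_dir in sorted(root.iterdir())
        if video_dir.is_dir()
    }


def diff_frame_counts(counts_a, counts_b):
    """Return {video_id: (count_a, count_b)} for every video whose counts differ.

    A video missing from one side is reported with a count of 0 on that side.
    """
    diffs = {}
    for video_id in sorted(set(counts_a) | set(counts_b)):
        a = counts_a.get(video_id, 0)
        b = counts_b.get(video_id, 0)
        if a != b:
            diffs[video_id] = (a, b)
    return diffs
```

For example, `diff_frame_counts(count_frames("frames_ffmpeg4"), count_frames("frames_ffmpeg2"))` would list the affected videos, where both directory names are placeholders.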

chaoyuaw commented 5 years ago

Thanks for letting me know. How about the issue described in https://github.com/facebookresearch/video-long-term-feature-banks/issues/23 ? I think it's worth checking the "Number of unique boxes", "Number of annotations", etc. printed in the log and comparing them with the numbers in our logs provided at https://github.com/facebookresearch/video-long-term-feature-banks#results

Thanks!

luckcodingdog commented 4 years ago

Hi chaoyuaw, I still can't reproduce your reported result.

I have compared my log file with your 102760714.log. The frame counts in yours and mine are the same:
yours:
2019-03-09 13:33:07.125 [INFO: ava_eval_helper.py: 128]: Evaluating with 50250 unique GT frames.
2019-03-09 13:33:07.125 [INFO: ava_eval_helper.py: 129]: Evaluating with 52391 unique detection frames
mine:
[INFO: ava_eval_helper.py: 149]: Evaluating with 50250 unique GT frames.
[INFO: ava_eval_helper.py: 150]: Evaluating with 52391 unique detection frames
I also checked that 'ava_val_v2.1.csv' has 237140 lines and 'ava_val_predicted_boxes.csv' has 155780 lines. Do these line counts match yours?
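The same comparison could be scripted so it is easy to repeat after each re-extraction. This is a rough sketch that only relies on the "Evaluating with N unique ... frames" wording quoted in this thread; the function name is hypothetical, and if a log contains several evaluations the last one wins.

```python
import re

# Matches lines such as "Evaluating with 50250 unique GT frames."
_FRAME_COUNT = re.compile(r"Evaluating with (\d+) unique (GT|detection) frames")


def eval_frame_counts(log_text):
    """Return e.g. {'GT': 50250, 'detection': 52391} parsed from evaluation log text."""
    return {kind: int(count) for count, kind in _FRAME_COUNT.findall(log_text)}
```

Reading two log files and comparing `eval_frame_counts(a) == eval_frame_counts(b)` then gives a quick yes/no answer. Matching counts only show that the same frames were evaluated, not that the frame contents are identical.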

thanks a lot!

chaoyuaw commented 4 years ago

Hi @luckcodingdog,

The number of lines in my files is the same as yours. Are you able to share your full log file? I can try to see if I can identify anything. Also, what's the mAP you got now? Still 21.6?

Thanks!

luckcodingdog commented 4 years ago

Hi @chaoyuaw, thanks for your patience. My mAP is now 21.7, and I have uploaded my log file and some val images to https://github.com/luckcodingdog/LFB-logs

chaoyuaw commented 4 years ago

Hi @luckcodingdog, I haven't had a chance to inspect the log carefully, but from a quick check I noticed that your frames have more blocky artifacts than mine. It might be worth checking the ffmpeg extraction step (e.g., whether you passed "-r 30 -q:v 1", etc.).
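For reference, an extraction call with those flags could be assembled as below. This is a sketch rather than the repository's actual script: the output naming pattern is inferred from the file name 1j20qq1JyX4_000001.jpg mentioned earlier in the thread, and the function names and paths are hypothetical. `-r 30` asks for 30 output frames per second and `-q:v 1` requests the highest JPEG quality.

```python
import subprocess
from pathlib import Path


def build_extract_command(video_path, out_dir, fps=30, quality=1):
    """Build an ffmpeg argument list that writes JPEG frames for one video."""
    video_path = Path(video_path)
    # Assumed naming: <video_id>_000001.jpg, <video_id>_000002.jpg, ...
    pattern = Path(out_dir) / f"{video_path.stem}_%06d.jpg"
    return [
        "ffmpeg",
        "-i", str(video_path),
        "-r", str(fps),          # output frame rate
        "-q:v", str(quality),    # lower value means higher JPEG quality
        str(pattern),
    ]


def extract_frames(video_path, out_dir):
    """Create out_dir and run ffmpeg, raising CalledProcessError on failure."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_extract_command(video_path, out_dir), check=True)
```

Passing `check=True` makes a failed ffmpeg run raise an exception instead of silently leaving a directory with missing frames.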

luckcodingdog commented 4 years ago

Thanks for your reply. For data preprocessing, I used your scripts for downloading, cutting, and extracting, and my ffmpeg version is 2.8.15. Next, I will look more closely at the paper and the code. If you have any ideas, or if I solve it, let's keep each other posted.

chaoyuaw commented 4 years ago

Hi @luckcodingdog , I'm wondering if you were able to solve the issue, and if you did, if you are able to share your solution. Thanks!

luckcodingdog commented 4 years ago

Hi @chaoyuaw, I haven't solved the issue yet, and I have no idea what causes it. If I solve it, I will share my solution.

chaoyuaw commented 4 years ago

Hi @luckcodingdog , Thanks for sharing the update!

I'll inspect your log carefully again soon and see if I can find anything.

luckcodingdog commented 4 years ago

Hi @chaoyuaw, I ran into a problem when using the 'cut-ava_videos' script with '.webm' or '.mp4' videos. A command like the one below:
ffmpeg -ss 900 -t 901 -i 0f39OWEqJ24.mp4 0f39OWEqJ24_15min.mp4
fails with this error:
[aac @ 0x1793660] The encoder 'aac' is experimental but experimental codecs are not enabled, add '-strict -2' if you want to use it.
So I worked around it with:
ffmpeg -ss 900 -t 901 -i 0f39OWEqJ24.mp4 -strict -2 0f39OWEqJ24_15min.mp4
Could this be what makes my final result differ from yours?

chaoyuaw commented 4 years ago

Hi @luckcodingdog ,

Thank you for sharing the information. So after applying the change, how does it affect the mAP you see?

luckcodingdog commented 4 years ago

Hi @chaoyuaw, after applying the change, the mAP is 21.7, not 23.2. Have you run into the same problem?

luckcodingdog commented 4 years ago

Hi @chaoyuaw, I am sorry to bother you again. When I use the 'cut-ava_videos' script with '.mp4' videos, a command like the one below:
ffmpeg -ss 900 -t 901 -i 0f39OWEqJ24.mp4 0f39OWEqJ24_15min.mp4
fails with this error:
[aac @ 0x1793660] The encoder 'aac' is experimental but experimental codecs are not enabled, add '-strict -2' if you want to use it.
Also, with '.webm' videos I get the error "Encoder (codec vp8) not found for output stream #0:0". I wonder whether these errors make my result lower than yours. Could you also share your conda environment list? I wonder whether a version difference causes this. Thanks a lot in advance!

chaoyuaw commented 4 years ago

Hi @luckcodingdog,

With the error [aac @ 0x1793660]..., does it mean that ffmpeg failed to generate 0f39OWEqJ24_15min.mp4? Or did you still get a correct-looking output, just were unsure about the message?

If you try "-strict -2" as suggested by the message, does it make any difference? Namely, ffmpeg -ss 900 -t 901 -i 0f39OWEqJ24.mp4 -strict -2 0f39OWEqJ24_15min.mp4 (I was reading https://stackoverflow.com/questions/32931685/the-encoder-aac-is-experimental-but-experimental-codecs-are-not-enabled)

As for the second error, according to https://superuser.com/questions/976162/troupble-with-ffmpeg-mp4-to-webm it seems that this shouldn't happen if you download a static build of ffmpeg (from https://ffmpeg.org/download.html). If you are not using a static build already, could you try downloading one to see if it helps?

If the error results in empty outputs, I think it's likely to affect performance.
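A quick way to rule this out is to scan the clip directory for missing or empty files. The sketch below assumes all clips sit in one directory and that you can list the expected file names (for example from the video list used by the download step); the function name is hypothetical.

```python
from pathlib import Path


def find_bad_outputs(clip_dir, expected_names, min_bytes=1):
    """Return the expected clip names that are missing or smaller than min_bytes."""
    clip_dir = Path(clip_dir)
    bad = []
    for name in expected_names:
        path = clip_dir / name
        if not path.is_file() or path.stat().st_size < min_bytes:
            bad.append(name)
    return bad
```

Raising `min_bytes` to a few kilobytes also catches clips that ffmpeg started writing but abandoned after an encoder error.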

luckcodingdog commented 4 years ago

Hi @chaoyuaw, I have now reproduced your reported result. The key issue was the ffmpeg build. Previously I used an ffmpeg that I had built myself, which perhaps lacked some of the features a full build has. This time I used a released static build (https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz), and the problem is solved. Thanks a lot for your patience!
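For anyone who wants to check a self-built ffmpeg before re-extracting everything, one option is to look at the encoders it reports. The sketch below parses the output of `ffmpeg -hide_banner -encoders`, assuming the usual table layout in which a dashed separator line precedes rows of the form "flags name description". The encoder names to check (here `aac` and `libvpx`, guessed from the two error messages above) are an assumption, and the function names are hypothetical.

```python
import subprocess


def parse_encoder_names(encoders_text):
    """Extract encoder names from the text printed by `ffmpeg -encoders`."""
    names = set()
    in_table = False
    for line in encoders_text.splitlines():
        if line.strip().startswith("------"):
            in_table = True  # rows after the separator are actual encoders
            continue
        parts = line.split()
        if in_table and len(parts) >= 2:
            names.add(parts[1])  # parts[0] is the capability flags column
    return names


def missing_encoders(required, ffmpeg_bin="ffmpeg"):
    """Return the required encoder names that this ffmpeg binary does not list."""
    result = subprocess.run(
        [ffmpeg_bin, "-hide_banner", "-encoders"],
        check=True,
        capture_output=True,
        text=True,
    )
    return sorted(set(required) - parse_encoder_names(result.stdout))
```

Calling `missing_encoders(["aac", "libvpx"])` returns an empty list when both encoders are available. A non-empty list would suggest the build is missing something the cutting step may need, although it does not by itself explain a quality difference in the extracted frames.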

chaoyuaw commented 4 years ago

Hi @luckcodingdog , That's great! Really glad that you solved the issue, and thanks for letting me know!!

tonysy commented 4 years ago

@luckcodingdog Hi, I hit the same performance drop with an ffmpeg (version 3.1) that I built myself, and it bothered me for many days. Your solution helped me get the expected performance on the SlowFast project. I think the authors could add the ffmpeg information to the README, since it really matters. Thanks again.

luckcodingdog commented 4 years ago

@tonysy You're welcome!