StevRamos / video_summarization

A computing solution based on deep learning that allows the efficient generation of keyshot type spotlights from videos.
MIT License

What does rate mean in the shapes of features_rgb, features_flow and features_3D? #6

Open Mirli1234 opened 10 months ago

Mirli1234 commented 10 months ago

Hi Stev,

Your work on summarizing videos using different features was very helpful to me. I was inspired a lot by your work and now I have some doubts about the feature extraction process.

What does rate mean in the shapes of features_rgb, features_flow and features_3D? https://github.com/StevRamos/video_summarization/blob/051632fd9e5ad94dd4a2b2bb31ea928f7269c1ac/README.md?plain=1#L86

I hope you can answer this question for me. Thank you so much!

StevRamos commented 10 months ago

Hey Jake! It's cool that this work is helping you. I left this project for a while for work reasons, but I'll continue working on it this year, so I hope I can get better results.

To answer your question, I'm not really sure why I put the rate there. All features should have the same number of subsampled frames, but I guess there was a separate rate for these, something like how many sub-frames to extract per frame. Unfortunately I'm away from my computer right now and can't check the actual shape of those features. It might be a mistake I made when documenting the README. Let me know whether that's the actual shape of those features; if it is, I'll need to go back to my laptop and investigate again. Sorry for not remembering this correctly!
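For anyone checking this on their own extracted features, here is a minimal sketch of the point that every feature type should have one row per subsampled frame. The values (`n_frames`, `rate`, the 1024-dimensional features) and the helper names are assumptions for illustration only; they are not taken from the repository or its README.

```python
import numpy as np

def subsample_indices(n_frames: int, rate: int) -> np.ndarray:
    """Indices of the frames kept when taking every `rate`-th frame."""
    return np.arange(0, n_frames, rate)

def same_length(*arrays: np.ndarray) -> bool:
    """True if all arrays have the same number of rows (subsampled frames)."""
    return len({a.shape[0] for a in arrays}) == 1

# Hypothetical values, chosen only for this example.
n_frames, rate = 4500, 15
picks = subsample_indices(n_frames, rate)

# Stand-in feature arrays: one row per subsampled frame.
features_rgb = np.zeros((len(picks), 1024), dtype=np.float32)
features_flow = np.zeros((len(picks), 1024), dtype=np.float32)

print(picks.shape, features_rgb.shape)
print(same_length(features_rgb, features_flow))
```

If the real arrays carry an extra axis beyond `(n_subsampled_frames, feature_dim)`, printing their `.shape` next to `len(picks)` should show whether that axis really depends on a rate.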

Mirli1234 commented 10 months ago

Thank you very much for your reply and for sharing your thoughts on feature extraction. This has been very helpful to me in correctly extracting features using different pre-trained models.

I have another question, about why the shot segmentation results you obtained with the KTS algorithm differ from Kaiyang Zhou's. I personally think this might be because you ran shot segmentation on down-sampled frame features, while the original paper seems to use features that have not been down-sampled. Thanks again!
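As a rough illustration of one way down-sampling could shift shot boundaries, the sketch below maps change points found on subsampled features back to original frame indices. It does not run KTS; the `rate`, the boundary positions, and the helper `to_original_frames` are all hypothetical, and boundary quantization is only one possible source of the difference.

```python
import numpy as np

def to_original_frames(change_points_sub: np.ndarray, rate: int) -> np.ndarray:
    """Map change points in subsampled index space to original frame indices."""
    return change_points_sub * rate

rate = 15  # hypothetical subsampling rate

# Hypothetical "true" shot boundaries, in original frame indices.
true_boundaries = np.array([100, 437, 1210])

# Best case for a segmenter that only sees every `rate`-th frame:
# each boundary snaps to the nearest subsampled index.
nearest_sub = np.round(true_boundaries / rate).astype(int)
recovered = to_original_frames(nearest_sub, rate)

# Even in this best case, boundaries can be off by up to about rate / 2 frames.
error = np.abs(recovered - true_boundaries)
print(recovered, error)
```

Segmenting the full-rate features would not have this quantization, which may explain part of the gap, though the feature statistics KTS sees also change with down-sampling.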

StevRamos commented 10 months ago

Thank you, Jake! That makes sense to me. I'll be working on this project again this year, so I'm happy to have these suggestions!