-
@titu1994 Thank you for your code! I want to use the non-local resnet model for video classification, as in the paper. The author emphasizes that the convolution operation should be 3-d convolution, b…
-
Is there any way to add a choice within a single video track? For example, in one video track, an object is sitting from 0-3 s and standing from 3-5 s, with NO change to the tracking ID. Can I add classification to thi…
-
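This paper offers two approaches to using frame-level information: feature pooling and recurrent neural networks. The former does not use temporal information, while the latter does.
(There is little need to try 3D convolution; Karpathy points out that this structure is just marginally better than a single-frame baseline.)…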
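The contrast between the two approaches can be illustrated with a toy sketch. It is not the paper's implementation: the function names `max_pool` and `recurrent_summary`, the scalar weights, and the feature values are all hypothetical, chosen only to show that pooling ignores frame order while a recurrence depends on it.

```python
import math

def max_pool(frames):
    """Element-wise max over per-frame feature vectors (order-invariant)."""
    return [max(values) for values in zip(*frames)]

def recurrent_summary(frames, w_in=0.5, w_rec=0.9):
    """Toy recurrence with assumed scalar weights:
    h_t = tanh(w_in * x_t + w_rec * h_{t-1}), applied per feature dimension."""
    hidden = [0.0] * len(frames[0])
    for frame in frames:
        hidden = [math.tanh(w_in * x + w_rec * h) for x, h in zip(frame, hidden)]
    return hidden

# Hypothetical 2-dimensional features for three consecutive frames.
frames = [[0.1, 0.9], [0.4, 0.2], [0.8, 0.3]]
reversed_frames = frames[::-1]

# Pooling gives the same vector whichever way the frames are ordered.
print(max_pool(frames) == max_pool(reversed_frames))

# The recurrent summary changes when the frame order is reversed.
print(recurrent_summary(frames) == recurrent_summary(reversed_frames))
```

Reversing the frames leaves the pooled vector unchanged but alters the final recurrent state, which is one way to see why only the recurrent route carries temporal information.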
-
https://arxiv.org/abs/1706.06905
-
### To-Do
- [x] Tag classification test _**with one image**_ per video (this image is from the BDD100K image dataset)
- [ ] Tag classification test _**with multiple images**_ per video (These images ar…
-
Thank you for your work on multimodal prompt learning for missing modalities.
I have a video dataset which is not for sentiment analysis or emotion recognition but I want to use your architecture f…
-
I have a question regarding the video preprocessing step for simba. Should videos be preprocessed as shown in the simba tutorial, then run through SLEAP, and then also be used in simba? In other words…
-
### Search before asking
- [X] I have searched the HUB [issues](https://github.com/ultralytics/hub/issues) and [discussions](https://github.com/ultralytics/hub/discussions) and found no similar quest…
-
The current gender classification task is not operating in real-time, which limits its applicability in scenarios that require immediate feedback. This lack of real-time processing can be frustrating,…