OpenGVLab VideoMAEv2 issues

OpenGVLab / VideoMAEv2

[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking

https://arxiv.org/abs/2303.16727

MIT License

524 stars 63 forks

issues

There seem to be some strange things in the evaluation of the model.

#23 leexinhao closed 1 year ago
7
Do you have the finetuned checkpoints for UCF101?

#22 yerx closed 1 year ago
2
The hyperparameter settings in the script seem to be inconsistent with those in the paper

#21 leexinhao closed 1 year ago
4
(Feature request) Batched feature extraction

#20 christian-matroid closed 8 months ago
18
[Doc] Release TAD Features

#19 congee524 closed 1 year ago
1
Request for more pretraining scripts and results

#18 LinB203 closed 1 year ago
2
[Doc] update bibtex of cvpr

#17 congee524 closed 1 year ago
0
[Feature] Support pretraining with PyTorch 2.0

#16 congee524 closed 1 year ago
0
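Hello! I have another question. When I freeze some modules so that their parameters are not updated and run the V2 script vit_b_k400_ft.sh, one epoch takes 1 hour 20 minutes with a batch size of 4, about 1 hour with a batch size of 8, and also about 1 hour with a batch size of 32. Is this normal? When the batch size increases 4x, the time per step also increases 4x, so the total time per epoch barely changes. Yet whether the batch size is 4, 8, or 32, GPU utilization appears to be at full (the "GPU-Util Compute M." column). Is it normal that multiplying the batch size does not proportionally reduce the training time? Currently, batch sizes 4 and 8 both complete all ten epochs, but with 32 I get the error RuntimeError: DataLoader worker (pid 34621) is killed by signal: Killed.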

#15 DragonWang-cell closed 1 year ago
7
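Hello! I ran the V2 script vit_b_k400_ft.sh, and the final test (final_test) takes 20 hours, as shown below. I then ran VideoMAE's final_test and found that it takes about as long, but I remember testing previously took only about two hours. What is going on? Am I misremembering? I spent an afternoon modifying the V2 version and it still behaves this way, and I can't find the cause.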

#14 DragonWang-cell closed 1 year ago
2
Hello! Could you provide the fine-tuning script for the distilled ViT-base model, or provide the regular ViT-base model? Thank you very much! My email is 2256380854@qq.com

#13 DragonWang-cell closed 1 year ago
9
[Fix] Bug due to deprecation

#12 congee524 closed 1 year ago
0
[Doc] Testing Support at MMAction2

#11 congee524 closed 1 year ago
0
Hello! Can you publish the basic model of ViT-base? Or is it just a ViT-base distillation model?

#10 DragonWang-cell closed 1 year ago
1
[Doc] fix_typo_in_large_pretrain_script

#9 congee524 closed 1 year ago
0
Why doesn't videomaev2-large need to repeat videos? Is it a typo?

#8 LinB203 closed 1 year ago
1
Question about multiple view fusion?

#7 Xiaolong-RRL closed 1 year ago
2
[Feature] add tad feature extract script

#6 congee524 closed 1 year ago
0
The feature details of THUMOS14 dataset.

#5 dingfengshi closed 1 year ago
2
[Doc] update finaction badge

#4 congee524 closed 1 year ago
0
Could you please provide the weights of VideoMAEv2 pre-trained on Kinetics-400?

#3 Value-Jack closed 1 year ago
2
[Doc] release giant model weights

#2 congee524 closed 1 year ago
0
Reproducing TAD

#1 gyusik19 closed 1 year ago
6