Open typ1012 opened 1 year ago
Hi, the features we released are from VideoMAE, and the results you reproduced are correct. Concatenating them with UniformerV2 features gives the better performance shown in the paper; the UniformerV2 features will be released soon.
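For anyone who wants to try the concatenation once the second set of features is available, here is a minimal sketch. It assumes both feature files hold arrays of shape (T, C) for the same video; the helper name concat_features, the truncation to the shorter length, and the second feature width used in the example are my assumptions, not the authors' pipeline.

```python
import numpy as np


def concat_features(feat_a, feat_b):
    """Concatenate two (T, C) feature arrays along the channel axis.

    Assumption: both arrays describe the same video. If their temporal
    lengths differ, both are truncated to the shorter one.
    """
    if feat_a.ndim != 2 or feat_b.ndim != 2:
        raise ValueError("expected 2-D arrays of shape (T, C)")
    t = min(feat_a.shape[0], feat_b.shape[0])
    return np.concatenate([feat_a[:t], feat_b[:t]], axis=1)


if __name__ == "__main__":
    videomae = np.zeros((8, 1280), dtype=np.float32)  # 1280 channels, as reported in this thread
    other = np.zeros((8, 64), dtype=np.float32)       # placeholder width, not the real UniformerV2 size
    print(concat_features(videomae, other).shape)     # (8, 1344)
```

Whatever the combined width turns out to be, input_dim in the config would need to match it.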
Hi @typ1012 @Richard-61, I tried to run Thumos14, but the mAP is only 47 (I only changed input_dim from 1408 to 1280). Did you make any other modifications to reproduce the 69.11 mAP?
Hello, thanks for your good work! I encountered some problems when reproducing the performance on the Temporal-Action-Localization task: Thumos14: 69.11 average mAP (lower than 71.58). The input_dim of the feature is not right (the channel dimension of the provided MAE feature is 1280, but it is 1408 in thumos.yaml). Anet1.3: 38.56 average mAP.
Hi, I would also like to know how you reproduced the 69.11 mAP; I can only get poor results.
@tensorboy @typ1012 have you solved your problem?
I fixed the batch_nms bug in the code.
So, did you reproduce the 71.58 result?
Running from petrel_client.client import Client fails with ModuleNotFoundError: No module named 'petrel_client'. Could you please tell me how to import this module?
That is a module to load videos on our servers. It may not be applicable in your case. You can remove it and update the corresponding video loading functions. We will fix it soon. @Richard-61
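Until that fix lands, one way to work around the missing module is to make the import optional and fall back to the local filesystem. This is only a sketch: read_video_bytes is a hypothetical helper, not a function from the repository, and the client.get(path) call is an assumption about how the server-side client is used.

```python
try:
    # Only available on the authors' servers; optional everywhere else.
    from petrel_client.client import Client
except ImportError:
    Client = None


def read_video_bytes(path, client=None):
    """Return the raw bytes of the file at `path`.

    If a client object is passed, its get(path) method is used
    (assumed interface). Otherwise the file is read from local disk.
    """
    if client is not None:
        return client.get(path)
    with open(path, "rb") as f:
        return f.read()
```

The video loading functions in the repository would then call this helper with client=None on a local machine.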
Could you please explain the config in /InternVideo-main/Downstream/Temporal-Action-Localization/configs/thumos.yaml?
I don't understand the meaning of the 1408 in input_dim: 1408.
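Judging from the original report in this thread, input_dim is the channel dimension the model expects from the input features (1280 for the released MAE features, 1408 in thumos.yaml). A quick way to check a feature array against the config value is sketched below; it assumes the features are stored with channels on the last axis, and the function names are hypothetical.

```python
import numpy as np


def feature_dim(features):
    """Return the channel dimension, assuming channels are on the last axis."""
    return int(features.shape[-1])


def matches_input_dim(features, input_dim):
    """True if the features have the width that the config's input_dim expects."""
    return feature_dim(features) == input_dim


if __name__ == "__main__":
    feats = np.zeros((100, 1280), dtype=np.float32)  # stand-in for one loaded feature file
    print(feature_dim(feats))                        # 1280
    print(matches_input_dim(feats, 1408))            # False
```

If the check fails, set input_dim to the value that feature_dim reports for your feature files.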
Hello @typ1012, I have been trying to reproduce the Anet1.3 score you reported, but have not been able to get better than 32.23. I had to use my own clone of the ActionFormer repository to get that far, as InternVideo's downstream copy of the repository has many issues. Could you share the steps you used to produce the reported Anet1.3 score? Thank you!