microsoft / VideoX

VideoX: a collection of video cross-modal models

About Activitynet dataset #24

Closed MinamiKotoka closed 3 years ago

MinamiKotoka commented 4 years ago

Hi, as #9 said, I was able to download the extracted ActivityNet features. But I want to know how to use the file pac_activitynet_v1-3.hdf5, because I want to use this dataset with the Charades-STA model. Thank you very much if you can help me.

Sy-Zhang commented 4 years ago

The Charades-STA model cannot be evaluated on ActivityNet Captions, since we use different features (VGG vs. C3D) on these two datasets. You can extract features on both datasets with the same backbone and train a model with our code.

MinamiKotoka commented 4 years ago

Thanks. What I want to do is show that the Charades-STA model does not work on the ActivityNet dataset, and measure the exact difference from the ActivityNet model. So I want to test the Charades-STA dataset on the ActivityNet model, or run other tests like this. However, the dimension of the hidden layers of the ActivityNet model is 4096, while the dimension of the Charades-STA features is 500. When I checked http://activity-net.org/challenges/2016/download.html I found this: "We reduce the dimensionality of the activations from the second fully-connected layer (fc7) of our visual encoder from 4096 to 500 dimensions using PCA." So I want to know how to use the file pac_activitynet_v1-3.hdf5 to reduce the dimension of the ActivityNet features from 4096 to 500. Do you have any suggestions? Thank you very much.

Sy-Zhang commented 4 years ago

You can refer to this for reversing PCA. However, even if the dimensions are the same, you cannot use our pretrained model to compare them, since the input features come from different backbones (C3D vs. VGG). I suggest you train our model with the C3D features extracted here and then compare them. It seems that they use the same settings to extract C3D features on both datasets.
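
For readers following the PCA discussion, here is a minimal NumPy sketch of projecting features to a lower dimension and mapping them back. It runs on small synthetic data rather than the real 4096-dimensional features, and the helper names (`fit_pca`, `pca_reduce`, `pca_inverse`) are hypothetical. In practice the mean and components would come from the PCA file mentioned above, whose internal layout is not described in this thread, so the sketch fits them itself.

```python
import numpy as np

def fit_pca(features, n_components):
    """Fit PCA via SVD. Returns (mean, components), components: (n_components, dim)."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def pca_reduce(features, mean, components):
    """Project features onto the principal directions (e.g. 4096 -> 500)."""
    return (features - mean) @ components.T

def pca_inverse(reduced, mean, components):
    """Map reduced features back to the original space ("reversing" PCA)."""
    return reduced @ components + mean

# Synthetic stand-in for real features: 64-dim vectors with 8 underlying factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 8))
mixing = rng.normal(size=(8, 64))
features = latent @ mixing + rng.normal(size=64)  # constant offset per dimension

mean, components = fit_pca(features, n_components=8)
reduced = pca_reduce(features, mean, components)
restored = pca_inverse(reduced, mean, components)
print(reduced.shape, restored.shape)
```

The synthetic data here has only 8 underlying factors, so 8 components reconstruct it almost exactly. With real features, going from 4096 to 500 dimensions discards variance, so the inverse mapping only approximates the original vectors.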

MinamiKotoka commented 4 years ago

Thank you for your suggestions! I'll try to use C3D to extract Charades-STA features. I'm very grateful to you!

onlyonewater commented 4 years ago

Hi @MinamiKotoka, I think the ActivityNet model can't be used on the Charades dataset. In 2D-TAN, Charades and ActivityNet don't share the same network structure: num_clips is 16 for Charades but 64 for ActivityNet, so the model does not transfer even if the hidden_dim of the extracted features is the same.
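
To illustrate why the num_clips setting matters, here is a rough pure-Python sketch, not the repository's code. It average-pools the same per-frame feature sequence into 16 and into 64 clips. The function name `average_pool_clips` and the toy features are hypothetical, and the count of (start, end) clip pairs is only an illustration of how the number of candidate moments grows with num_clips.

```python
def average_pool_clips(features, num_clips):
    """Average-pool a list of per-frame feature vectors into num_clips segments."""
    n = len(features)
    dim = len(features[0])
    pooled = []
    for i in range(num_clips):
        start = i * n // num_clips
        # Guarantee at least one frame per clip, even when n < num_clips.
        end = max((i + 1) * n // num_clips, start + 1)
        segment = features[start:end]
        pooled.append([sum(v[d] for v in segment) / len(segment) for d in range(dim)])
    return pooled

def count_candidate_moments(num_clips):
    """Number of (start, end) clip pairs with start <= end."""
    return num_clips * (num_clips + 1) // 2

# Toy sequence: 128 frames with 2-dim features (real features are much larger).
frames = [[float(t), 1.0] for t in range(128)]

charades_like = average_pool_clips(frames, 16)      # num_clips = 16
activitynet_like = average_pool_clips(frames, 64)   # num_clips = 64

print(len(charades_like), count_candidate_moments(16))
print(len(activitynet_like), count_candidate_moments(64))
```

Even with identical feature dimensions, the two settings produce sequences of different lengths and a different number of candidate moments, so layers sized for one setting do not line up with the other.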