Hi,
First of all, that model is not trained on the 1K-A split; it is trained on the full split, which has 7k videos for training and 3k videos for testing. That specific model will therefore give lower numbers when tested on the 1K-A split. To test on the 1K-A split you need a different configuration: take the gpt2-xl-finetuned-adam config file and apply the modifications specified in the jsfusion config file. The final config should contain the following:
"eval_mode": "fixed_num_epochs",
"data_loader": {
"type": "ExpertDataLoader",
"args":{
"dataset_name": "MSRVTT",
"split_name": "jsfusion"
}
},
"eval_settings": false
Cheers!
Thanks for the quick response, appreciate it!
On the model, to clarify: I grabbed the model linked under this section https://github.com/albanie/collaborative-experts#cvpr-2020-pentathlon-challenge, specifically the row for the CE/1K-A/t2v task, which is this one http://www.robots.ox.ac.uk/~vgg/research/collaborative-experts/data/models/msrvtt-train-jsfusion-ce/2b66fed2/seed-0/2020-01-07_15-30-39/trained_model.pth. Is this the correct model to use?
The first thing I tried was using the corresponding config file http://www.robots.ox.ac.uk/~vgg/research/collaborative-experts/data/models/msrvtt-train-jsfusion-ce/2b66fed2/seed-0/2020-01-07_15-30-39/config.json
Initially, I ran

```
python test.py --config data/1KA/config.json --resume data/1KA/trained_model.pth --eval_from_training_config
```

which gave this error:
```
/PREFIX/code/collaborative-experts
log config: logger/logger_config.json exists: True
Traceback (most recent call last):
  File "test.py", line 231, in <module>
    evaluation(eval_config)
  File "test.py", line 114, in evaluation
    merge(eval_conf._config, config["eval_settings"], strategy=Strategy.REPLACE)
  File "/PREFIX/miniconda3/envs/fast/lib/python3.7/site-packages/mergedeep/mergedeep.py", line 100, in merge
    return reduce(partial(_deepmerge, strategy=strategy), sources, destination)
  File "/PREFIX/miniconda3/envs/fast/lib/python3.7/site-packages/mergedeep/mergedeep.py", line 75, in _deepmerge
    for key in src:
TypeError: 'bool' object is not iterable
```
I debugged this a bit and figured out that `eval_settings` is expected to be iterable, hence I changed it to `{}`. Does that seem correct? Once that is fixed, I get the error I mentioned above. Am I correct in thinking that this is the model I'd want to use to reproduce the paper results for the 1K split?
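For reference, here is a minimal standalone repro of the failure, using just the mergedeep package from the traceback (my own sketch, not repo code):

```python
from mergedeep import merge, Strategy

dest = {"eval_mode": "fixed_num_epochs"}

# merging an empty dict is a harmless no-op: nothing to iterate, dest is unchanged
merge(dest, {}, strategy=Strategy.REPLACE)

# merging a bool raises, because _deepmerge iterates over the source's keys
# ("for key in src") -- exactly the TypeError in the traceback above
merge(dest, False, strategy=Strategy.REPLACE)
```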
If I understand your suggestion correctly, what I'd actually want to do is use the config file you mentioned and modify it with those eval settings. The model that config refers to seems to be the one in the main table under "full"; would this model give lower numbers due to being trained on 7K, or give the paper numbers? If it gives lower numbers, is there a way to fix the earlier error? Thanks!
Hi,
For the 1k-A split, the model is trained for a fixed number of epochs, and for evaluation you don't need to specify the `--eval_from_training_config` argument (for 1k-A the testing step is equivalent to the validation step, which is why the argument should not be specified). The `--eval_from_training_config` argument is needed for all other datasets and splits, just not for 1k-A.
So, if you use this model http://www.robots.ox.ac.uk/~vgg/research/collaborative-experts/data/models/msrvtt-train-jsfusion-ce/2b66fed2/seed-0/2020-01-07_15-30-39/trained_model.pth and this config http://www.robots.ox.ac.uk/~vgg/research/collaborative-experts/data/models/msrvtt-train-jsfusion-ce/2b66fed2/seed-0/2020-01-07_15-30-39/config.json with the command

```
python test.py --config data/1KA/config.json --resume data/1KA/trained_model.pth
```

you will get the results presented under https://github.com/albanie/collaborative-experts#cvpr-2020-pentathlon-challenge.
The gpt2-xl model is trained on the full split, so it will produce lower results than a model trained on the 1k-A split (though it should still produce higher numbers than the CE model under https://github.com/albanie/collaborative-experts#cvpr-2020-pentathlon-challenge). You can still evaluate it by modifying the config as specified in my previous comment (and running without the `--eval_from_training_config` argument).
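Putting the steps together, an end-to-end sketch of the gpt2-xl route might look like this (the local paths are hypothetical; the "apply the jsfusion changes" step is the config modification described in my earlier comment):

```bash
# hypothetical local layout; the config URL is the gpt2-xl-finetuned-adam one from the README
mkdir -p data/gpt2-xl
wget -O data/gpt2-xl/config.json \
    https://www.robots.ox.ac.uk/~vgg/research/teachtext/data-hq/models/msrvtt-train-gpt2-xl-finetuned-adam/244af891/seed-0/2020-10-01_12-22-00/config.json
# ...download the matching trained_model.pth alongside it, apply the jsfusion
# split changes to the config, then run:
python test.py --config data/gpt2-xl/config.json --resume data/gpt2-xl/trained_model.pth
```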
Cheers, Ioana
Thanks for clarifying. I'm still having issues running the model due to the missing-file error from above, which I've also put below. Any thoughts on what to download to fix that?

```
$ python test.py --config data/1KA/config.json --resume data/1KA/trained_model.pth
FileNotFoundError: [Errno 2] No such file or directory: 'data/MSRVTT/high-quality/structured-symlinks/aggregated_i3d_25fps_256px_stride25_offset0_inner_stride1/i3d-avg.pickle'
```
Hi,
What features have you downloaded? If you want to test this model https://www.robots.ox.ac.uk/~vgg/research/collaborative-experts/data/models/msrvtt-train-jsfusion-ce/2b66fed2/seed-0/2020-01-07_15-30-39/trained_model.pth, then you should download the following features: http://www.robots.ox.ac.uk/~vgg/research/collaborative-experts/data/features-v2/MSRVTT-experts.tar.gz. If you are interested in the gpt2-xl models, then you should use other configs (which do not require i3d features).
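As a rough sketch, the download and extraction could look like this (the destination directory is an assumption based on the path in your FileNotFoundError; adjust it to match the layout the repo expects):

```bash
# assumed layout: features expected under data/MSRVTT, per the error message path
mkdir -p data/MSRVTT
wget http://www.robots.ox.ac.uk/~vgg/research/collaborative-experts/data/features-v2/MSRVTT-experts.tar.gz
tar -xzf MSRVTT-experts.tar.gz -C data/MSRVTT
```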
Cheers, Ioana
Hi,
I will close this issue. Feel free to reopen it if necessary!
Hi, I'm working on reproducing the MSR-VTT 1K results and running into this error. Is there a command I need to issue in order to download the missing data? I tried both manually downloading the data and using

```
python misc/sync_experts.py --dataset MSRVTT
```

The model and config are both from the row of the README table corresponding to the MSR-VTT 1K-A results, with the config being https://www.robots.ox.ac.uk/~vgg/research/teachtext/data-hq/models/msrvtt-train-gpt2-xl-finetuned-adam/244af891/seed-0/2020-10-01_12-22-00/config.json after changing `"eval_settings": false` to `"eval_settings": {}` to avoid an error when merging configuration files. Thanks!