xiadingZ / video-caption.pytorch

pytorch implementation of video captioning
MIT License
401 stars 130 forks

How to evaluate Meteor value? #7

Closed VisheshTanwar-IITR closed 6 years ago

VisheshTanwar-IITR commented 6 years ago

Hey

I trained the model on my own dataset (videos + captions) and got three files: model_5.pth, model_score.txt and opt_info.json.

How can I compute METEOR, BLEU, etc. for my training data? Also, train.py only trains the model; how can I split the data into training and validation sets, as we do in Keras?

Thanks

xiadingZ commented 6 years ago

Use eval.py; for the options, see my example code and the source code. opt_info.json holds the options needed to rebuild the model, and model_5.pth is the saved model.

VisheshTanwar-IITR commented 6 years ago

@xiadingZ

I ran the command "python eval.py --recover_opt data/save7/opt_info.json --saved_model data/save7/model_300.pth --batch_size 10", but I got the following output and error:

vocab size is 71070
number of train videos: 3500
number of val videos: 0
number of test videos: 0
load feats from data/feats/incep_v4
max sequence length in data is 28
init COCO-EVAL scorer
Traceback (most recent call last):
  File "eval.py", line 117, in <module>
    main(opt)
  File "eval.py", line 92, in main
    test(model, crit, dataset, dataset.get_vocab(), opt)
  File "eval.py", line 38, in test
    for data in loader:
  File "/home/vishesh/anaconda2/envs/video_caption_p35/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 258, in __next__
    indices = next(self.sample_iter)  # may raise StopIteration
  File "/home/vishesh/anaconda2/envs/video_caption_p35/lib/python3.5/site-packages/torch/utils/data/sampler.py", line 119, in __iter__
    for idx in self.sampler:
  File "/home/vishesh/anaconda2/envs/video_caption_p35/lib/python3.5/site-packages/torch/utils/data/sampler.py", line 50, in __iter__
    return iter(torch.randperm(len(self.data_source)).long())
RuntimeError: invalid argument 1: must be strictly positive at /home/vishesh/Pictures/video_caption/pytorch/aten/src/TH/generic/THTensorMath.c:2822

Please Help..
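
A note on reading this traceback: the last frame calls torch.randperm(len(self.data_source)), and the log above reports "number of test videos: 0", so the sampler is most likely being asked to shuffle an empty dataset. Below is a minimal sketch of a guard that fails earlier with a clearer message. It assumes the split sizes are available as a plain dict; the function name check_split and the dict layout are hypothetical and not part of the repository.

```python
def check_split(split_sizes, mode):
    """Return the number of videos in `mode`, or raise if the split is empty.

    `split_sizes` is a hypothetical dict such as
    {"train": 3500, "val": 0, "test": 0}, mirroring the counts in the log.
    """
    count = split_sizes.get(mode, 0)
    if count <= 0:
        raise ValueError(
            "split %r has no videos; a shuffling sampler cannot iterate "
            "over an empty dataset" % mode
        )
    return count


# Counts taken from the log above.
sizes = {"train": 3500, "val": 0, "test": 0}
print(check_split(sizes, "train"))  # 3500
```

Calling check_split(sizes, "test") with these counts raises ValueError instead of failing deep inside the sampler.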

xiadingZ commented 6 years ago

Did you change the options to fit your case? Can you show the command you ran and the file paths you got from train.py?

VisheshTanwar-IITR commented 6 years ago

python train.py --epochs 9001 --batch_size 100 --checkpoint_path data/save7 --feats_dir data/feats/incep_v4 --rnn_dropout_p 0.1 --dim_hidden 1024 --dim_word 512 --dim_vid 1536 --model S2VTAttModel

This is opt_info.json:

{"rnn_dropout_p": 0.1, "input_dropout_p": 0.2, "dim_vid": 1536, "save_checkpoint_every": 50, "input_json": "data/sentences.json", "cached_tokens": "msr-all-idxs", "model": "S2VTAttModel", "epochs": 9001, "optim_alpha": 0.9, "caption_json": "data/caption.json", "weight_decay": 0.0005, "gpu": "0", "dim_hidden": 1024, "learning_rate_decay_every": 200, "learning_rate": 0.0004, "self_crit_after": -1, "num_layers": 2, "learning_rate_decay_rate": 0.8, "bidirectional": false, "grad_clip": 0.1, "batch_size": 100, "feats_dir": "data/feats/incep_v4", "checkpoint_path": "data/save7", "optim_beta": 0.999, "dim_word": 512, "info_json": "data/info.json", "optim_epsilon": 1e-08, "beam_size": 2, "max_len": 28}
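
Since evaluation rebuilds its options from this file, one quick check is whether the paths it records still exist on disk. The sketch below assumes that option names ending in "_json" or "_dir" hold file-system paths, as they appear to in the JSON above; missing_paths is a hypothetical helper, not part of the repository.

```python
import json
import os


def missing_paths(opt):
    """Return the option names whose recorded path does not exist on disk.

    Assumption: options ending in "_json" or "_dir" are paths.
    """
    path_keys = [k for k in sorted(opt) if k.endswith(("_json", "_dir"))]
    return [k for k in path_keys if not os.path.exists(str(opt[k]))]


# A trimmed copy of the options shown above.
opt = json.loads(
    '{"input_json": "data/sentences.json", "caption_json": "data/caption.json",'
    ' "info_json": "data/info.json", "feats_dir": "data/feats/incep_v4",'
    ' "batch_size": 100}'
)
print(missing_paths(opt))
```

The result depends on the directory the script is run from, so it is best run from the same working directory as eval.py.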

xiadingZ commented 6 years ago

This seems to be a PyTorch issue. Can you compile it from the latest master? I can't access my server to debug right now. The log shows the bug occurs in "for data in loader". My guess is that the DataLoader API has changed, but I use the DataLoader the same way in train.py and that succeeds, which is strange.

VisheshTanwar-IITR commented 6 years ago

Yes, I installed PyTorch from the latest source master. Can you please share your PyTorch source with me?

xiadingZ commented 6 years ago

All my code is on the server and I can't access it these days. Can you fix the bug and open a pull request? I think the bug is due to the DataLoader; see dataloader.py and the official documentation.

VisheshTanwar-IITR commented 6 years ago

I am using Python 3.5. Okay, I will try.

Thanks

xiadingZ commented 6 years ago

I have changed the code to work with PyTorch 0.3.1, which can be installed using conda or pip, and it runs without errors on my server.

SheldonTsui commented 6 years ago

@xiadingZ I hit the same problem when testing. I ran the command 'python eval.py --recover_opt data/save7/opt_info.json --saved_model data/save7/model_1600.pth --batch_size 40 --gpu 0,1,2,3,4,5,6,7', but I got the following error:

vocab size is 16860
number of train videos: 10000
number of val videos: 0
number of test videos: 0
load feats from data/feats/incepv4
max sequence length in data is 28
init COCO-EVAL scorer
Traceback (most recent call last):
  File "eval.py", line 122, in <module>
    main(opt)
  File "eval.py", line 91, in main
    test(model, crit, dataset, dataset.get_vocab(), opt)
  File "eval.py", line 38, in test
    for data in loader:
  File "/home/xuxudong/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 258, in __next__
    indices = next(self.sample_iter)  # may raise StopIteration
  File "/home/xuxudong/anaconda3/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 119, in __iter__
    for idx in self.sampler:
  File "/home/xuxudong/anaconda3/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 50, in __iter__
    return iter(torch.randperm(len(self.data_source)).long())
RuntimeError: invalid argument 1: must be strictly positive at /opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/TH/generic/THTensorMath.c:2247

Why is the number of test videos ZERO?
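
One way to answer this is to count the videos per split in the info file before running evaluation. The sketch below is only an illustration: it assumes the file stores the splits under a "videos" key that maps split names to lists of video ids, which may not match the real layout, and split_counts is a hypothetical helper.

```python
import json


def split_counts(info):
    """Count videos per split.

    Assumption: `info` looks like {"videos": {"train": [...], "val": [...],
    "test": [...]}}; the real file may be organised differently.
    """
    videos = info.get("videos", {})
    return {name: len(videos.get(name, [])) for name in ("train", "val", "test")}


# Toy stand-in for an info file in which every video landed in "train".
info = json.loads('{"videos": {"train": [0, 1, 2], "val": [], "test": []}}')
print(split_counts(info))  # {'train': 3, 'val': 0, 'test': 0}
```

If the test count is zero here, the preprocessing step never assigned any video to the test split, which would match the log above.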

xiadingZ commented 6 years ago

I don't have this problem. Show me your PyTorch version and the command you ran.

SheldonTsui commented 6 years ago

0.3.1.post2

xiadingZ commented 6 years ago

I have just updated the README. You can try again.

lixiangpengcs commented 6 years ago

I met the same error as @SheldonTsui:

vocab size is 7278
number of train videos: 6513
number of val videos: 497
number of test videos: 0
load feats from ./resnet152/
max sequence length in data is 28
init COCO-EVAL scorer
Traceback (most recent call last):
  File "eval.py", line 122, in <module>
    main(opt)
  File "eval.py", line 91, in main
    test(model, crit, dataset, dataset.get_vocab(), opt)
  File "eval.py", line 38, in test
    for data in loader:
  File "/home/env/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 187, in __next__
    indices = next(self.sample_iter)  # may raise StopIteration
  File "/home/lixiangpeng/env/anaconda3/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 119, in __iter__
    for idx in self.sampler:
  File "/home/env/anaconda3/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 50, in __iter__
    return iter(torch.randperm(len(self.data_source)).long())
RuntimeError: invalid argument 1: must be strictly positive at /opt/conda/conda-bld/pytorch_1513368888240/work/torch/lib/TH/generic/THTensorMath.c:2184

xiadingZ commented 6 years ago

I split the 2017 train dataset into train/val/test (8000/1000/1000), as stated in the README, but you are not using my dataset. Or did I upload the wrong train dataset? You can look at my code to see how I process the dataset.
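
For readers using their own data, the 8000/1000/1000 split described here can be sketched as below. This is not the repository's preprocessing code: split_ids is a hypothetical helper, and it simply cuts the id list into consecutive chunks without shuffling.

```python
def split_ids(video_ids, n_train, n_val, n_test):
    """Split a list of ids into consecutive train/val/test chunks."""
    if n_train + n_val + n_test > len(video_ids):
        raise ValueError("requested split is larger than the dataset")
    train = video_ids[:n_train]
    val = video_ids[n_train:n_train + n_val]
    test = video_ids[n_train + n_val:n_train + n_val + n_test]
    return {"train": train, "val": val, "test": test}


# The 8000/1000/1000 split mentioned above, applied to 10000 ids.
splits = split_ids(list(range(10000)), 8000, 1000, 1000)
print({name: len(ids) for name, ids in splits.items()})
# {'train': 8000, 'val': 1000, 'test': 1000}
```

Whatever sizes are chosen, the test split has to be non-empty for evaluation to have anything to iterate over.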

lixiangpengcs commented 6 years ago

OK. I used the MSR-VTT dataset that I downloaded previously, which is split 6513/497/2990 for train, val and test. I thought it would not cause errors. I will use your dataset this time, but I do think a different dataset split should not affect whether this code runs.

I see now that the problem is that you didn't split your train set into three parts in your code.