Closed VisheshTanwar-IITR closed 6 years ago
Use eval.py; for the options, you can look at my example code and source code. opt_info.json contains the options needed to rebuild the model, and model_5.pth is the saved model.
@xiadingZ
I ran the command "python eval.py --recover_opt data/save7/opt_info.json --saved_model data/save7/model_300.pth --batch_size 10", but I got the following error:
vocab size is 71070
number of train videos: 3500
number of val videos: 0
number of test videos: 0
load feats from data/feats/incep_v4
max sequence length in data is 28
init COCO-EVAL scorer
Traceback (most recent call last):
File "eval.py", line 117, in
RuntimeError: invalid argument 1: must be strictly positive at /home/vishesh/Pictures/video_caption/pytorch/aten/src/TH/generic/THTensorMath.c:2822
Please Help..
Did you change the options to fit your case? Can you show the command you ran and the file paths you got from train.py?
python train.py --epochs 9001 --batch_size 100 --checkpoint_path data/save7 --feats_dir data/feats/incep_v4 --rnn_dropout_p 0.1 --dim_hidden 1024 --dim_word 512 --dim_vid 1536 --model S2VTAttModel
This is opt_info.json:
{
  "rnn_dropout_p": 0.1,
  "input_dropout_p": 0.2,
  "dim_vid": 1536,
  "save_checkpoint_every": 50,
  "input_json": "data/sentences.json",
  "cached_tokens": "msr-all-idxs",
  "model": "S2VTAttModel",
  "epochs": 9001,
  "optim_alpha": 0.9,
  "caption_json": "data/caption.json",
  "weight_decay": 0.0005,
  "gpu": "0",
  "dim_hidden": 1024,
  "learning_rate_decay_every": 200,
  "learning_rate": 0.0004,
  "self_crit_after": -1,
  "num_layers": 2,
  "learning_rate_decay_rate": 0.8,
  "bidirectional": false,
  "grad_clip": 0.1,
  "batch_size": 100,
  "feats_dir": "data/feats/incep_v4",
  "checkpoint_path": "data/save7",
  "optim_beta": 0.999,
  "dim_word": 512,
  "info_json": "data/info.json",
  "optim_epsilon": 1e-08,
  "beam_size": 2,
  "max_len": 28
}
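As a side note, --recover_opt simply reloads this JSON so eval.py can rebuild the same model configuration that train.py used. A minimal sketch of that idea, using only the standard library (load_opt_info is a hypothetical helper for illustration, not the repo's own function):

```python
import json
import os
import tempfile

def load_opt_info(path):
    # Reload the options that train.py saved so the model can be
    # rebuilt for evaluation (hypothetical helper, not the repo's code).
    with open(path) as f:
        return json.load(f)

# Demo with a temporary file standing in for data/save7/opt_info.json.
sample = {"model": "S2VTAttModel", "dim_hidden": 1024, "dim_word": 512}
path = os.path.join(tempfile.mkdtemp(), "opt_info.json")
with open(path, "w") as f:
    json.dump(sample, f)

opt = load_opt_info(path)
print(opt["model"])  # S2VTAttModel
```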
This seems to be a PyTorch issue; can you compile it from the latest master? I can't access my server to debug right now. The log shows the bug occurs in "for data in loader". I guess the DataLoader API has changed, but I use the DataLoader the same way in train.py and that succeeds, which is strange.
Yes, I installed PyTorch from the latest source master. Can you please share your PyTorch source master with me?
All my code is on the server and I can't access the server these days. Can you fix the bug and submit a pull request? I think the bug is due to the dataloader; you can look at dataloader.py and the official documentation.
I am using Python 3.5. Okay, I'll try.
Thanks
I have changed the code to work with PyTorch 0.3.1, which can be installed using conda or pip, and it runs with no errors on my server.
@xiadingZ I ran into the same problem when testing. I ran the command: 'python eval.py --recover_opt data/save7/opt_info.json --saved_model data/save7/model_1600.pth --batch_size 40 --gpu 0,1,2,3,4,5,6,7', but I got the following error:
vocab size is 16860
number of train videos: 10000
number of val videos: 0
number of test videos: 0
load feats from data/feats/incepv4
max sequence length in data is 28
init COCO-EVAL scorer
Traceback (most recent call last):
File "eval.py", line 122, in
Why is the number of test videos ZERO?
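For what it's worth, the crash is consistent with an empty split: with zero test videos the DataLoader ends up sampling from a zero-length dataset, which is what triggers the "must be strictly positive" error in older PyTorch. A tiny guard (a sketch for illustration, not the repo's code) makes the failure mode explicit instead of opaque:

```python
def check_split(mode, num_videos):
    # Fail early with a readable message instead of letting an empty
    # split reach the DataLoader, where older PyTorch versions crash
    # with "invalid argument 1: must be strictly positive".
    if num_videos <= 0:
        raise ValueError(
            "number of %s videos is %d; check the train/val/test split "
            "before running eval.py" % (mode, num_videos))
    return num_videos

check_split("train", 10000)   # passes
# check_split("test", 0)      # would raise ValueError
```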
I have no such problem. Show me your PyTorch version and the command you ran.
0.3.1.post2
I have just updated the README. You can try again.
I met the same error as @SheldonTsui
vocab size is 7278
number of train videos: 6513
number of val videos: 497
number of test videos: 0
load feats from ./resnet152/
max sequence length in data is 28
init COCO-EVAL scorer
Traceback (most recent call last):
File "eval.py", line 122, in
I'm using the 2017 train dataset, split into train/val/test (8000/1000/1000); I have stated this in the README. But you aren't using my dataset. Or did I upload the wrong train dataset? You can have a look at my code to see how I process the dataset.
OK, I used the MSR-VTT dataset that I downloaded previously, which is split as 6513/497/2990 for train, valid, and test. I thought it would not cause errors. I will use your dataset this time, but I do think that different splits of the dataset should not affect how this code runs.
I understand now: the problem is that you didn't split the train set into three parts in your code.
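So the data really does need to be divided three ways before training. A minimal sketch of an 8000/1000/1000 split in plain Python (split_videos is a hypothetical helper for illustration; the repo does this in its own preprocessing):

```python
import random

def split_videos(video_ids, n_train, n_val, seed=0):
    # Shuffle deterministically, then slice: the first n_train ids are
    # train, the next n_val are val, and everything left is test.
    ids = list(video_ids)
    random.Random(seed).shuffle(ids)
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }

splits = split_videos(range(10000), 8000, 1000)
print({k: len(v) for k, v in splits.items()})
# {'train': 8000, 'val': 1000, 'test': 1000}
```

With a split like this, the test set is non-empty and eval.py has something to score.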
Hey
I have trained the model on my own dataset (videos + captions), and I have obtained three files: model_5.pth, model_score.txt, and opt_info.json.
How can I evaluate METEOR, BLEU, etc. on my training data? Also, train.py only trains the model; how can I divide the data into validation and training sets as we do in Keras?
Thanks
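On the scoring part of the question: eval.py hands the generated captions to the COCO-EVAL scorer it initializes at startup, which computes METEOR, BLEU, CIDEr, etc. Just to illustrate what such a metric measures, here is a toy clipped unigram precision (the BLEU-1 ingredient, without the brevity penalty) in plain Python; this is only an illustration, not the coco-caption implementation:

```python
from collections import Counter

def bleu1_precision(candidate, references):
    # Clipped unigram precision: each candidate word is credited at
    # most as many times as it appears in the best-matching reference.
    cand_counts = Counter(candidate.split())
    max_ref = Counter()
    for ref in references:
        for word, count in Counter(ref.split()).items():
            max_ref[word] = max(max_ref[word], count)
    clipped = sum(min(count, max_ref[word])
                  for word, count in cand_counts.items())
    return clipped / sum(cand_counts.values())

print(bleu1_precision("a man is playing guitar",
                      ["a man plays a guitar"]))  # 0.6
```

Here 3 of the candidate's 5 words ("a", "man", "guitar") appear in the reference, giving 3/5 = 0.6.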