ruotianluo / ImageCaptioning.pytorch

I decided to sync up this repo with self-critical.pytorch. (The old master is archived in the old-master branch.)
MIT License

I met an error in the training... #33

Open weiweili123 opened 6 years ago

weiweili123 commented 6 years ago

```
xw@xw:~/ImageCaptioning.pytorch-master$ python train.py --id st --caption_model show_tell --input_json data/cocotalk.json --input_fc_dir data/cocotalk_fc --input_att_dir data/cocotalk_att --input_label_h5 data/cocotalk_label.h5 --batch_size 10 --learning_rate 5e-4 --learning_rate_decay_start 0 --scheduled_sampling_start 0 --checkpoint_path log_st --save_checkpoint_every 6000 --val_images_use 5000 --max_epochs 25
...
evaluating validation preformance... 4989/5000 (2.672655)
image 324313: a man is sitting on a bed with a laptop
image 46616: a man is riding a skateboard on a ramp
image 285832: a living room with a couch and a table
image 496718: a man is holding a cell phone while standing in a park
image 398209: a living room with a couch and a table
image 568041: a living room with a couch and a table
image 206596: a man is playing tennis on a tennis court
image 451949: a man is holding a skateboard in a park
image 203138: a man in a suit and tie is holding a cell phone
image 296759: a close up of a person holding a hot dog
evaluating validation preformance... -1/5000 (2.669259)
Traceback (most recent call last):
  File "train.py", line 204, in <module>
    train(opt)
  File "train.py", line 152, in train
    for k,v in lang_stats.items():
AttributeError: 'NoneType' object has no attribute 'items'
Terminating BlobFetcher
```

ruotianluo commented 6 years ago

This is because I didn't consider the situation when language_eval is false. You can comment that line out.
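A minimal sketch of the guard (the helper name `log_lang_stats` is hypothetical, not from the repo), assuming the evaluation returns `None` when `--language_eval 0`, as the traceback suggests:

```python
def log_lang_stats(lang_stats):
    """Collect (metric, value) pairs; lang_stats is None when language_eval is off."""
    logged = []
    if lang_stats is not None:  # guard against the NoneType error in the traceback
        for k, v in lang_stats.items():
            logged.append((k, v))
    return logged

log_lang_stats(None)             # no crash, returns []
log_lang_stats({'CIDEr': 0.95})  # returns [('CIDEr', 0.95)]
```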

weiweili123 commented 6 years ago

Thanks! It works.

makaaay commented 6 years ago

Sorry, I met the same problem, but I don't know how to fix it.

ruotianluo commented 6 years ago

@makaaay check my reply above.

makaaay commented 6 years ago

emmmm, comment out `assert args.language_eval == 0 or args.language_eval == 1` in opts.py? Sorry, I am a new learner in Shanghai.

ruotianluo commented 6 years ago

Comment out lines 152 and 153 in train.py.

makaaay commented 6 years ago

yeah~ it works now! My mate helped me with lines 150-161:

```python
if tf is not None and isinstance(lang_stats, dict):
    add_summary_value(tf_summary_writer, 'validation loss', val_loss, iteration)
    for k, v in lang_stats.items():
        add_summary_value(tf_summary_writer, k, v, iteration)
    tf_summary_writer.flush()
val_result_history[iteration] = {'loss': val_loss, 'lang_stats': lang_stats, 'predictions': predictions}

# Save model if is improving on validation result
if opt.language_eval == 1 and isinstance(lang_stats, dict):
    current_score = lang_stats['CIDEr']
else:
    current_score = - val_loss
```

It also works!

Thank you very much! Happy new year in the US, and have a bright future in your PhD.

TalenWang-AIcv commented 5 years ago

@weiweili123 Hi, it seems that you have successfully run @ruotianluo's code, but I got the following error. Can you give me some help or advice? Thank you very much.

```
DataLoader loading json file: ./data/cocotalk.json
vocab size is 9487
DataLoader loading h5 file: ./data/cocotalk_fc ./data/cocotalk_att ./data/cocotalk_label.h5
max sequence length in data is 20
read 123287 image features
assigned 113287 images to split train
assigned 5000 images to split val
assigned 5000 images to split test
F:\Miniconda\lib\site-packages\torch\nn\modules\rnn.py:54: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.5 and num_layers=1
  "num_layers={}".format(dropout, num_layers))
Traceback (most recent call last):
  File "F:/Research/ImageCaptioning.pytorch-master/train.py", line 211, in <module>
    train(opt)
  File "F:/Research/ImageCaptioning.pytorch-master/train.py", line 105, in train
    data = loader.get_batch('train')
  File "F:\Research\ImageCaptioning.pytorch-master\dataloader.py", line 135, in get_batch
    ix, tmp_wrapped = self._prefetch_process[split].get()
  File "F:\Research\ImageCaptioning.pytorch-master\dataloader.py", line 263, in get
    self.reset()
  File "F:\Research\ImageCaptioning.pytorch-master\dataloader.py", line 242, in reset
    collate_fn=lambda x: x[0]))
  File "F:\Miniconda\lib\site-packages\torch\utils\data\dataloader.py", line 193, in __iter__
    return _DataLoaderIter(self)
  File "F:\Miniconda\lib\site-packages\torch\utils\data\dataloader.py", line 469, in __init__
    w.start()
  File "F:\Miniconda\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "F:\Miniconda\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "F:\Miniconda\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "F:\Miniconda\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "F:\Miniconda\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'BlobFetcher.reset.<locals>.<lambda>'
Terminating BlobFetcher
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "F:\Miniconda\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "F:\Miniconda\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
```
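For reference, the traceback points at `collate_fn=lambda x: x[0]` in `BlobFetcher.reset()`: on Windows, DataLoader workers are started with `spawn`, which pickles the DataLoader (including its `collate_fn`), and a lambda defined inside a method cannot be pickled. A sketch of one common workaround (the function name `collate_first` is mine, not from the repo):

```python
# A module-level function is picklable by reference, unlike a lambda
# defined inside BlobFetcher.reset(); it does the same thing as
# the collate_fn=lambda x: x[0] shown in the traceback.
def collate_first(batch):
    return batch[0]

# In dataloader.py, replace
#     collate_fn=lambda x: x[0]
# with
#     collate_fn=collate_first
```

Setting `num_workers=0` on the DataLoader also sidesteps the issue, since no worker processes (and therefore no pickling of the loader) are involved.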