RubickH / Image-Captioning-with-MAD-and-SAP

Code for paper "Image Captioning with End-to-End Attribute Detection and Subsequent Attributes Prediction". IEEE Transactions on Image Processing 2020
https://ieeexplore.ieee.org/document/8976408

self critical sequence training #2

Open srikanthmalla opened 3 years ago

srikanthmalla commented 3 years ago

Hi @RubickH, I installed all the prerequisites and was also able to run eval.py without any issues using your provided checkpoint. But when I run training with the command below, it stops at the self-critical step after epoch 49, I think. Could you please help me fix this issue?

------------Command-----------------

python train.py --self_critical_after 50 --id MADSAP0509 --caption_model lstm_MAD_SAP --save_checkpoint_every 500 --batch_size 600 --num_gpu 3 --gpu_id 0,1,2 --beam 0

------------ERROR Output-----------------------

iter 9440 (epoch 49), SAP_loss = 2.289, word_loss = 2.080, MAD_loss = 0.261 time/batch = 2.197
current_lr is 6.7108864e-05
Read and process data: 1.49842405319
iter 9441 (epoch 49), SAP_loss = 2.213, word_loss = 2.030, MAD_loss = 0.252 time/batch = 2.823
initializing CIDEr scorer...
initlizing CIDEr scorers in 0.000029s
current_lr is 5.36870912e-05
Read and process data: 0.961675882339
GPU time is : 1.09868407249s
Traceback (most recent call last):
  File "train.py", line 399, in <module>
    train(opt)
  File "train.py", line 244, in train
    reward = get_self_critical_reward(gen_result,greedy_res,data['gts'])
  File "train.py", line 72, in get_self_critical_reward
    _, cider_scores = CiderD_scorer.compute_score(gts, res)
  File "cider-master/pyciderevalcap/ciderD/ciderD.py", line 48, in compute_score
    (score, scores) = cider_scorer.compute_score(self._df)
  File "cider-master/pyciderevalcap/ciderD/ciderD_scorer.py", line 199, in compute_score
    score = self.compute_cider(df_mode)
  File "cider-master/pyciderevalcap/ciderD/ciderD_scorer.py", line 173, in compute_cider
    vec, norm, length = counts2vec(test)
  File "cider-master/pyciderevalcap/ciderD/ciderD_scorer.py", line 122, in counts2vec
    df = np.log(max(1.0, self.document_frequency[ngram]))
KeyError: ('7', '9459')
Terminating BlobFetcher
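For reference, a quick way to see whether the failing n-gram is really missing from the cached tokens is to load the pickle directly. The sketch below assumes the cache follows the cider-master layout (a pickled dict with 'document_frequency' and 'ref_len' keys) and uses a hypothetical filename, so adjust both to your setup.

```python
# Minimal diagnostic sketch (assumptions: the cached-token pickle stores a
# 'document_frequency' dict keyed by word-index n-gram tuples plus a 'ref_len'
# entry, as in the cider-master preprocessing; the filename below is hypothetical).
import pickle

with open('data/coco-train-new-idxs.p', 'rb') as f:  # adjust to your --output_pkl path
    cache = pickle.load(f)

df = cache['document_frequency']
print('ref_len:', cache['ref_len'])
print('cached n-grams:', len(df))

# The n-gram that raised the KeyError in the traceback above:
missing = ('7', '9459')
print(missing, 'in cache:', missing in df)
```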

Thank you, Srikanth

RubickH commented 3 years ago

Did you generate the cached tokens before running SCST? You can find more details about SCST on Ruotian Luo's webpage.

srikanthmalla commented 3 years ago

Hi @RubickH, I generated the cache using python scripts/prepro_ngrams.py --input_json data/dataset_coco.json --dict_json data/cocotalk_attr.json --output_pkl data/coco-train-new --split train, but the error is the same. I am guessing it is an issue with your integration of Cider-D. You might have changed something in Cider-D or its input arguments but forgot to commit it. Could you please check whether your local version matches the git version, and that the repository is updated to work with self-critical sequence training (as it is also part of your paper)?

Thank you, Srikanth

RubickH commented 3 years ago

My modifications add two loss functions but do not change the original SCST loss. Are you using Python 3.x? The Python version may matter. You can try computing the Cider-D score in 'corpus' mode to verify whether the error comes from the cached tokens.
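For example, a minimal sketch of that check, based on the CiderD wrapper shipped in cider-master; the constructor argument and the cache prefix are assumptions and may differ in your checkout:

```python
# Hedged sketch: switching the scorer between 'corpus' mode and cached tokens,
# assuming the CiderD wrapper from cider-master (df defaults to 'corpus').
from pyciderevalcap.ciderD.ciderD import CiderD

# 'corpus' mode: document frequencies are computed from the references passed
# to compute_score, so no precomputed cache is needed.
scorer_corpus = CiderD(df='corpus')

# cached-token mode: document frequencies are loaded from the pickle produced
# by prepro_ngrams.py ('coco-train-new' matches the --output_pkl prefix used
# above, but the exact name your train.py expects may differ).
scorer_cached = CiderD(df='coco-train-new')

# Both are used the same way:
# score, scores = scorer_corpus.compute_score(gts, res)
```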

srikanthmalla commented 3 years ago

> You can try computing the Cider-D score in 'corpus' mode to verify whether the error comes from the cached tokens.

@RubickH, it looks like this works. But how do I fix the cached tokens issue? I am just following your README and am new to this repo.

Also, what is the difference between the corpus mode and the cached tokens? Does the corpus mode create the ground-truth n-grams for the mini-batch on the fly, while the cached tokens are pre-computed for the whole train set?
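To make my question concrete, here is a toy sketch of how I understand the two modes (this is not the cider-master code, just an illustration):

```python
# Toy illustration (not the cider-master code): 'corpus' mode derives n-gram
# document frequencies from the references handed to the scorer at call time,
# while cached tokens load frequencies precomputed over the whole train split.
from collections import Counter

def doc_freq(references, n=2):
    """Count how many reference captions contain each n-gram."""
    df = Counter()
    for ref in references:
        toks = ref.split()
        df.update({tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)})
    return df

batch_refs = ['a dog runs on grass', 'a dog plays in a park']   # on-the-fly ('corpus')
train_refs = batch_refs + ['a cat sits on a mat'] * 1000        # precomputed (cached tokens)

print(doc_freq(batch_refs)[('a', 'dog')])   # frequency seen by 'corpus' mode: 2
print(doc_freq(train_refs)[('a', 'dog')])   # same count in the full-split cache, but out of 1002 docs
```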

Best Regards, Srikanth

srikanthmalla commented 3 years ago

Hi @RubickH, I was able to fix it using Ruotian Luo's repo: https://github.com/ruotianluo/self-critical.pytorch/blob/master/scripts/prepro_ngrams.py

I thought I had made a mistake by not cloning your repo recursively (to get the correct version of cider-master). So I tried again with recursive cloning and regenerated the cached tokens, but it looks like the cached tokens produced by your script have an issue. You can close this issue, as it works with Ruotian's n-gram preprocessing.