Open srikanthmalla opened 3 years ago
Did you generate the cached tokens before running SCST? You can find more details about SCST on Ruotian Luo's webpage.
Hi @RubickH ,
I generated the cache using
python scripts/prepro_ngrams.py --input_json data/dataset_coco.json --dict_json data/cocotalk_attr.json --output_pkl data/coco-train-new --split train
but the error is the same. I am guessing it is an issue with your integration of Cider-D. You might have changed something in Cider-D or its input arguments but forgot to commit it. Could you please check whether your local version is the same and the git version is updated to work with self-critical sequence training (as it is also part of your paper)?
Thank you, Srikanth
My modifications add two loss functions but do not change the original SCST loss. Are you using Python 3.x? The Python version may matter. You can try computing the Cider-D score in 'corpus' mode to verify whether the error comes from the cached tokens.
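For example, something along these lines should work (a rough sketch: the import path and input format follow ruotianluo's cider-master, and the image id and captions are made up):

```python
# Rough sketch: score a tiny batch with CiderD in 'corpus' mode, where document
# frequencies are computed from the references themselves rather than loaded
# from a cached n-gram pickle. Import path follows ruotianluo's cider-master.
from pyciderevalcap.ciderD.ciderD import CiderD

scorer = CiderD(df='corpus')  # no cached token file is needed in this mode

# res: list of {'image_id', 'caption'} dicts (one hypothesis per image),
# gts: dict mapping image_id -> list of reference captions.
res = [{'image_id': 0, 'caption': ['a man riding a horse on the beach']}]
gts = {0: ['a man rides a horse along the beach',
           'a person riding a horse near the ocean']}

score, scores = scorer.compute_score(gts, res)
print(score, scores)
```

If this runs cleanly but the cached-token mode still raises the KeyError, the problem most likely lies in the cached n-gram file rather than in the SCST code itself.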
@RubickH, it looks like this works. But how do I fix the cached tokens issue? I am just following your README and am new to this repo.
Also, what is the difference between 'corpus' mode and the cached tokens? Does 'corpus' mode build the ground-truth n-grams for the mini-batch on the fly, while the cached tokens are pre-computed over the whole training set?
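(For reference, this is roughly how I inspected the cached pickle while debugging; the file name and the 'document_frequency' / 'ref_len' keys are assumptions based on ruotianluo's prepro_ngrams.py output, so adjust them to whatever --output_pkl actually produced.)

```python
# Sketch: peek inside the cached n-gram pickle written by prepro_ngrams.py.
import pickle

with open('data/coco-train-new-idxs.p', 'rb') as f:  # assumed output file name
    cache = pickle.load(f)

df = cache['document_frequency']  # n-gram tuple of word indices -> doc count
print('ref_len:', cache['ref_len'])
print('number of cached n-grams:', len(df))
print('sample keys:', list(df.keys())[:5])
# The KeyError above means an n-gram such as ('7', '9459') produced during SCST
# is missing from this dict, i.e. the cache was built with a mismatched vocab.
```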
Best Regards, Srikanth
Hi @RubickH, I was able to fix it using Ruotian Luo's repo: https://github.com/ruotianluo/self-critical.pytorch/blob/master/scripts/prepro_ngrams.py
I thought I had made a mistake by not cloning your repo recursively (to get the correct version of cider-master). So I tried again with a recursive clone and regenerated the cached tokens, but it looks like the cached tokens produced by your script are the problem. You can close this issue, since it works with Ruotian's n-gram preprocessing.
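For anyone else hitting this, the invocation of that script looks roughly like the following (the paths mirror my earlier command and are assumptions, so adapt them to your own data layout):

```bash
# Assumed invocation of ruotianluo's prepro_ngrams.py; adjust paths as needed.
python scripts/prepro_ngrams.py --input_json data/dataset_coco.json \
    --dict_json data/cocotalk_attr.json --output_pkl data/coco-train-new --split train
```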
Hi @RubickH, I installed all the prerequisites and was also able to run eval.py (without any issues) using your provided checkpoint. But when I run training with the command below, it stops at the self-critical step, after epoch 49 I think. Could you please help me fix this issue?
------------Command-----------------
python train.py --self_critical_after 50 --id MADSAP0509 --caption_model lstm_MAD_SAP --save_checkpoint_every 500 --batch_size 600 --num_gpu 3 --gpu_id 0,1,2 --beam 0
------------ERROR Output-----------------------
iter 9440 (epoch 49), SAP_loss = 2.289, word_loss = 2.080, MAD_loss = 0.261 time/batch = 2.197
current_lr is 6.7108864e-05
Read and process data: 1.49842405319
iter 9441 (epoch 49), SAP_loss = 2.213, word_loss = 2.030, MAD_loss = 0.252 time/batch = 2.823
initializing CIDEr scorer...
initlizing CIDEr scorers in 0.000029s
current_lr is 5.36870912e-05
Read and process data: 0.961675882339
GPU time is : 1.09868407249s
Traceback (most recent call last):
  File "train.py", line 399, in <module>
    train(opt)
  File "train.py", line 244, in train
    reward = get_self_critical_reward(gen_result,greedy_res,data['gts'])
  File "train.py", line 72, in get_self_critical_reward
    _, cider_scores = CiderD_scorer.compute_score(gts, res)
  File "cider-master/pyciderevalcap/ciderD/ciderD.py", line 48, in compute_score
    (score, scores) = cider_scorer.compute_score(self._df)
  File "cider-master/pyciderevalcap/ciderD/ciderD_scorer.py", line 199, in compute_score
    score = self.compute_cider(df_mode)
  File "cider-master/pyciderevalcap/ciderD/ciderD_scorer.py", line 173, in compute_cider
    vec, norm, length = counts2vec(test)
  File "cider-master/pyciderevalcap/ciderD/ciderD_scorer.py", line 122, in counts2vec
    df = np.log(max(1.0, self.document_frequency[ngram]))
KeyError: ('7', '9459')
Terminating BlobFetcher
Thank you, Srikanth