j-min closed this issue 3 years ago
Could you please launch the experiments with just one GPU and see whether you hit the same issue? I always use one GPU, which seems to train fine for a while.
I succeeded in training the CLIP-RN50 model with a single GPU. Below is the evaluation result on the Karpathy test split. Could you please confirm that you saw similar results? I'd like to make sure I ran the script correctly, since the paper reports only SCST results and no scores for MLE-based training.
{'Bleu_1': 0.7468949383241016,
'Bleu_2': 0.5810452442634662,
'Bleu_3': 0.44466729659825666,
'Bleu_4': 0.34033720376175897,
'METEOR': 0.27332929031601105,
'ROUGE_L': 0.5538997707453996,
'CIDEr': 1.1062112636066934,
'SPICE': 0.20551119536241158,
'WMD': 0.5629711371394192,
'perplexity': 0.5224986911565065,
'entropy': 1.3382156711161137,
'SPICE_Relation': 0.05422546691107958,
'SPICE_Cardinality': 0.07816728167281671,
'SPICE_Attribute': 0.11556938213280309,
'SPICE_Size': 0.05356541268950027,
'SPICE_Color': 0.14668237505316156,
'SPICE_Object': 0.36858569451532297,
'bad_count_rate': 0.001}
Btw, there are some bugs in tools/eval.py. I needed to
1) comment out the line "from captioning.data.dataloaderraw import *", since there's no dataloaderraw.py in captioning.data.
2) create a vis/ directory to avoid an error in the last lines:
if opt.dump_json == 1:
    # dump the json
    json.dump(split_predictions, open('vis/vis.json', 'w'))
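The second workaround can also be handled in code rather than by hand. A minimal sketch, assuming the output path stays vis/vis.json as in the snippet above; dump_predictions is a hypothetical helper name for illustration, not a function in the repo:

```python
import json
import os


def dump_predictions(split_predictions, path="vis/vis.json"):
    """Write predictions as JSON, creating the parent directory if needed.

    Hypothetical helper for illustration; not part of tools/eval.py.
    """
    parent = os.path.dirname(path)
    if parent:
        # exist_ok=True makes this a no-op when the directory already exists
        os.makedirs(parent, exist_ok=True)
    # The context manager closes the file even if json.dump raises
    with open(path, "w") as f:
        json.dump(split_predictions, f)
```

Calling dump_predictions(split_predictions) in place of the bare json.dump line should then work whether or not vis/ already exists.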
I also tried to run SCST with CIDEr using the command in the readme, but hit this error:
Traceback (most recent call last):
  File "tools/train.py", line 293, in <module>
    train(opt)
  File "tools/train.py", line 154, in train
    init_scorer(opt.cached_tokens)
  File "/scratch-space/CLIP-ViL/CLIP-ViL-Direct/caption/captioning/utils/rewards.py", line 27, in init_scorer
    CiderD_scorer = CiderD_scorer or CiderD(df=cached_tokens)
  File "cider/pyciderevalcap/ciderD/ciderD.py", line 28, in __init__
    self.cider_scorer = CiderScorer(n=self._n, df_mode=self._df)
  File "cider/pyciderevalcap/ciderD/ciderD_scorer.py", line 80, in __init__
    pkl_file = cPickle.load(open(os.path.join('data', df_mode + '.p'),'rb'), **(dict(encoding='latin1') if six.PY3 else {}))
FileNotFoundError: [Errno 2] No such file or directory: 'data/coco-train-idxs.p'
It seems like coco-train-idxs.p is related to the cached_tokens argument in opts.py. Do we need this file before starting SCST?
parser.add_argument('--cached_tokens', type=str, default='coco-train-idxs',
                    help='Cached token file for calculating cider score during self critical training.')
I found coco-train-idxs.p on the ImageCaptioning.pytorch author's Gdrive. Could you please confirm if I can use this file?
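The traceback shows the scorer opening os.path.join('data', df_mode + '.p'), so a check along these lines before launching SCST would surface a missing file early. This is only a sketch under that assumption; cached_tokens_path and check_cached_tokens are hypothetical names, not part of the repo:

```python
import os


def cached_tokens_path(cached_tokens, data_dir="data"):
    """Mirror the path built in the traceback: <data_dir>/<cached_tokens>.p."""
    return os.path.join(data_dir, cached_tokens + ".p")


def check_cached_tokens(cached_tokens, data_dir="data"):
    """Raise early with a readable message if the cached token file is absent.

    Hypothetical pre-flight check; it does not validate the file contents.
    """
    path = cached_tokens_path(cached_tokens, data_dir)
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"{path} not found; the CIDEr scorer loads it when SCST starts"
        )
    return path
```

For example, check_cached_tokens('coco-train-idxs') could be run from the directory where training is launched, since the path in the traceback is relative.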
Sure. Sorry, I should have put coco-train-idxs.p in the original repo. Your MLE results seem close to what we have.
Thanks for your participation; I added more clarification to the README file.
@j-min Have you successfully reproduced the image captioning results with CLIP features based on R50?
The paper reports a B@4 of 38.6, but I only get 32.0, as follows.
Hi @liujiaheng, is the result for the 5000-image Karpathy split, and does it use the two-phase training?
@sIncerass Can you provide the phase 1 training results on the 5000-image Karpathy split, based on R50 with CLIP features?
{'Bleu_1': 0.7514197530864043, 'Bleu_2': 0.5850870427589987, 'Bleu_3': 0.43979903834664746, 'Bleu_4': 0.3263438183989057, 'METEOR': 0.27168378207273514, 'ROUGE_L': 0.553913453445984, 'CIDEr': 1.092061834034407, 'SPICE': 0.20618343143439724, 'WMD': 0.5600404048324764, 'perplexity': 0.6155193274050951, 'entropy': 1.6811169483423234, 'SPICE_Relation': 0.06140779121218033, 'SPICE_Cardinality': 0.13671586715867157, 'SPICE_Attribute': 0.09548424840112993, 'SPICE_Size': 0.045554931686318544, 'SPICE_Color': 0.10765492310436131, 'SPICE_Object': 0.378444525407513, 'bad_count_rate': 0.0036} These are the results I got from my reproduction.
Hi @liujiaheng, here is the detailed evaluation result I got:
{'Bleu_1': 0.8007510865437192, 'Bleu_2': 0.6449325885322664, 'Bleu_3': 0.5008726056265775, 'Bleu_4': 0.381525622275518, 'METEOR': 0.2866526250312671, 'ROUGE_L': 0.583230803123437, 'CIDEr': 1.2582568357814914, 'SPICE': 0.2246589023419364, 'WMD': 0.26410401436038744, 'perplexity': 0.07665397078062525, 'entropy': 0.17617339862971568, 'SPICE_Relation': 0.06592518994248969, 'SPICE_Cardinality': 0.19065190651906516, 'SPICE_Attribute': 0.11274409396317843, 'SPICE_Size': 0.037413438143365146, 'SPICE_Color': 0.13960399775006516, 'SPICE_Object': 0.40403633034320335, 'bad_count_rate': 0.0006}
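The paper's B@4 of 38.6 is on a 0-100 scale, while the dictionaries in this thread hold raw values such as 0.3815, so scaling by 100 makes runs easier to compare. A small sketch; to_paper_scale is a hypothetical helper, the metric selection is illustrative, and the two sample dictionaries copy the Bleu_4 and CIDEr values from the two results posted just above:

```python
def to_paper_scale(scores, keys=("Bleu_4", "METEOR", "ROUGE_L", "CIDEr", "SPICE")):
    """Scale raw metric values by 100 and round to one decimal place."""
    return {k: round(100 * scores[k], 1) for k in keys if k in scores}


# Values copied from the two evaluation results in this thread
run_a = {"Bleu_4": 0.3263438183989057, "CIDEr": 1.092061834034407}
run_b = {"Bleu_4": 0.381525622275518, "CIDEr": 1.2582568357814914}

a, b = to_paper_scale(run_a), to_paper_scale(run_b)
for k in a:
    # Print each metric side by side with the gap between the two runs
    print(f"{k}: {a[k]} vs {b[k]} (gap {b[k] - a[k]:.1f})")
```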
Hi, I followed the data preparation and ran the training script for the default CLIP-RN50 model in the readme. However, the training job crashes with the log below. Could you please check if the current example training script is runnable?