stevehuanghe / image_captioning

Image captioning models in PyTorch
Apache License 2.0

Sorry, there are similar problems after redownloading the code. #3

Open 17000432 opened 5 years ago

17000432 commented 5 years ago

Sorry, there are similar problems after re-downloading the code. This time the error is: No such file or directory: './image/train2014/COCO_val2014_000000345411.jpg'. I don't understand why an image from the validation set is expected in the training set folder. May I ask what needs to be modified? Thanks.

If I may also ask: is build_vocab.py missing a KarpathySplit.py file? I can only generate vocab.pkl after I add one myself.

stevehuanghe commented 5 years ago

What script were you trying to run? If you are using the default configuration, the image paths should look like "./data/train2014/COCOxxxxxxxx.jpg", and COCO_val2014_000000345411.jpg should be in the "./data/val2014" folder.
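For reference, a COCO file name encodes its split in the middle token, so under the default `./data/<split>/` layout you can tell which folder any image belongs in from the name alone. A minimal sketch (the `split_of` helper is illustrative, not part of this repo):

```python
import os

def split_of(file_name):
    # 'COCO_val2014_000000345411.jpg' -> 'val2014'
    return file_name.split('_')[1]

fname = 'COCO_val2014_000000345411.jpg'
expected = os.path.join('./data', split_of(fname), fname)
print(expected)                  # ./data/val2014/COCO_val2014_000000345411.jpg
print(os.path.exists(expected))  # should be True if the image is where it belongs
```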

As for your second question, I am sorry that I didn't get it. What is "KarpathySplit.py"?

17000432 commented 5 years ago

I am trying to run train.py (model=ssa). I rearranged the image locations according to the configuration file, but it still reports an error: No such file or directory: './data/train2014/COCO_val2014_000000071597.jpg'. Thanks

This is my KarpathySplit.py:

```python
# coding: utf-8
# Karpathy Split for MS-COCO Dataset
import json
from random import shuffle, seed

seed(123)  # Make it reproducible
num_val = 5000
num_test = 5000

val = json.load(open('./data/annotations/captions_val2014.json', 'r'))
train = json.load(open('./data/annotations/captions_train2014.json', 'r'))

# Merge together
imgs = val['images'] + train['images']
annots = val['annotations'] + train['annotations']
shuffle(imgs)

# Split into val, test, train
dataset = {}
dataset['val'] = imgs[:num_val]
dataset['test'] = imgs[num_val:num_val + num_test]
dataset['train'] = imgs[num_val + num_test:]

# Group by image ids
itoa = {}
for a in annots:
    imgid = a['image_id']
    if imgid not in itoa:
        itoa[imgid] = []
    itoa[imgid].append(a)

json_data = {}
info = train['info']
licenses = train['licenses']
split = ['val', 'test', 'train']
for subset in split:
    json_data[subset] = {'type': 'caption', 'info': info,
                         'licenses': licenses, 'images': [], 'annotations': []}
    for img in dataset[subset]:
        img_id = img['id']
        anns = itoa[img_id]
        json_data[subset]['images'].append(img)
        json_data[subset]['annotations'].extend(anns)
    json.dump(json_data[subset],
              open('./data/annotations/karpathysplit' + subset + '.json', 'w'))
```
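Note that this script shuffles val2014 and train2014 images together before splitting, so the resulting 'train' subset will contain many COCO_val2014_* file names; a loader that always prepends './data/train2014/' to the file name would then fail with exactly the error reported above. A rough check (my own sketch, assuming the output file name produced by the script above):

```python
import json

# Count how many images in the Karpathy 'train' subset actually come from val2014.
# 'karpathysplittrain.json' is the file written by the script above.
split_train = json.load(open('./data/annotations/karpathysplittrain.json', 'r'))
n_from_val = sum(1 for img in split_train['images']
                 if img['file_name'].startswith('COCO_val2014_'))
print('%d of %d training images are val2014 files' %
      (n_from_val, len(split_train['images'])))
```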

17000432 commented 5 years ago

I tried that and still got the error: No such file or directory: './data/train2014/COCO_val2014_000000071597.jpg'. May I ask how to fix it? Thank you very much.

stevehuanghe commented 5 years ago
  1. My code is not yet compatible with the Karpathy split, so it only works with the default COCO train/val split; I will add compatibility later.

  2. Since I am not sure how you changed the image paths, I can't tell how you got that error; maybe it was caused by the incompatible Karpathy split. Please check whether the image "./image/train2014/COCO_val2014_000000071597.jpg" really exists, since it should be in "val2014" if you use the default train/val split. Please make sure all images are in the folders they should be in.
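If you want to verify the default train/val layout in one pass, a quick sweep like the following (my own sketch, not part of the repo) reports any annotation-referenced images missing from their expected folder:

```python
import json
import os

# Check that every image referenced by the default COCO annotation files
# exists under ./data/train2014 or ./data/val2014 respectively.
for subset in ('train2014', 'val2014'):
    anns = json.load(open('./data/annotations/captions_%s.json' % subset, 'r'))
    missing = [img['file_name'] for img in anns['images']
               if not os.path.exists(os.path.join('./data', subset, img['file_name']))]
    print('%s: %d missing images' % (subset, len(missing)))
```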