kelvinxu / arctic-captions


'sample = empty string' when trying to train flickr_30k #16

Closed noureldien closed 8 years ago

noureldien commented 8 years ago

Thanks for sharing the implementation. Does anybody know why, when training, I get an empty string for the 'sample'? Here is the output of the debugger:

Epoch 1
...
Truth 0 : UNK UNK with UNK white shirt aims UNK dart at an UNK target as several other people holding darts look on beside her UNK
Sample ( 0 ) 0 :
...
Truth 1 : UNK man UNK along with three others UNK is standing in snowy and cold weather with only green shoes and black shorts on UNK
Sample ( 0 ) 1 :
...
Truth 2 : bald man in what looks like to be UNK lab with UNK lady watching as he is mixing some liquids in clear cup containers
Sample ( 0 ) 2 :
...
Truth 3 : UNK UNK in UNK big purple hat and UNK long blue coat and UNK man with UNK satchel spending time in the city UNK
Sample ( 0 ) 3 :
...
Truth 4 : UNK man rides UNK skateboard down the railing of UNK staircase in front of UNK closed storefront UNK with three people watching him UNK
Sample ( 0 ) 4 :
...
Truth 5 : UNK group of dancers dressed in red and wearing short white tutus conversing UNK while UNK single dancer is practicing in the background UNK
Sample ( 0 ) 5 :
...
Truth 6 : UNK on UNK baseball field UNK one baseball player is sliding into UNK base while UNK player from the opposing team is jumping UNK
Sample ( 0 ) 6 :
...
Truth 7 : UNK female acrobat with long UNK blond curly hair UNK dangling upside down while suspending herself from long UNK red ribbons of fabric UNK
Sample ( 0 ) 7 :
...
Truth 8 : UNK man with UNK backpack is walking down UNK street while UNK man in an orange shirt pushes UNK cart the opposite direction UNK
Sample ( 0 ) 8 :
...
Truth 9 : UNK football game is going on UNK with player on the field UNK squatting on the sidelines and standing UNK watching the game UNK
Sample ( 0 ) 9 :

I extracted the CNN features of the images as illustrated here: https://github.com/kelvinxu/arctic-captions/issues/1

noureldien commented 8 years ago

Please help. I've spent the whole week (more than 10 hours a day) trying to get this working, without success. I don't know why I'm getting the UNK word; there must be a problem in how the 'dictionary' is created. Also, when training on 'coco' using 'evaluate_coco.py', this is what I get:

...
Truth 0 : UNK UNK and UNK car UNK UNK by UNK UNK UNK car UNK
Sample ( 0 ) 0 :
...
Truth 1 : UNK cup half full UNK coffee UNK UNK UNK UNK UNK backlit UNK
Sample ( 0 ) 1 :
...
Truth 2 : grassy area UNK UNK UNK UNK UNK UNK UNK UNK UNK fire hydrant
Sample ( 0 ) 2 :
...
Truth 3 : UNK UNK UNK driving down UNK UNK UNK UNK UNK UNK another UNK
Sample ( 0 ) 3 :
...
Truth 4 : UNK bathroom UNK green UNK UNK UNK and black and UNK UNK UNK
Sample ( 0 ) 4 :
...
Truth 5 : UNK UNK UNK UNK UNK UNK UNK ball UNK UNK ground UNK UNK
Sample ( 0 ) 5 :
...
Truth 6 : UNK UNK and UNK UNK child UNK UNK UNK bathroom brushing UNK UNK
Sample ( 0 ) 6 :
...
Truth 7 : UNK UNK UNK UNK UNK UNK horse UNK UNK children UNK UNK UNK
Sample ( 0 ) 7 :
...
Truth 8 : black and UNK UNK UNK UNK horse UNK beside UNK UNK UNK UNK
Sample ( 0 ) 8 :
...
Truth 9 : UNK UNK and chairs UNK UNK UNK grass UNK UNK UNK UNK UNK
Sample ( 0 ) 9 :
...
Epoch 1, Update: 100, Cost: 87.59

kelvinxu commented 8 years ago

Please send me your email, I'll email you the dictionary

noureldien commented 8 years ago

@kelvinxu Thanks for your reply. I emailed @ervecherish and got the dictionary.pkl from him, which solved the problem. It turned out that the method illustrated here (https://github.com/asampat3090/arctic-captions/blob/master/make_flickr_data.py) does NOT create the dictionary correctly, and that was causing the problem.

Thank you again for your help and for providing your code on GitHub.
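
For anyone hitting the same symptom, one way to check whether a dictionary matches the captions is to measure how many caption tokens fall back to UNK. The sketch below is only an illustration and is not taken from the repository: the function name `unk_rate` and the `n_words` vocabulary cutoff are assumptions, and it is written for Python 3. A very high rate, especially for common words, would suggest the dictionary was built differently from the captions being fed in.

```python
def unk_rate(captions, worddict, n_words=10000):
    """Return the fraction of caption tokens that would be treated as UNK.

    Assumption (hypothetical, for illustration): a token counts as UNK when
    it is missing from worddict or its id is >= n_words.
    """
    total = 0
    unknown = 0
    for caption in captions:
        for word in caption.split():
            total += 1
            # Missing words get the sentinel n_words, so they count as UNK.
            if worddict.get(word, n_words) >= n_words:
                unknown += 1
    return unknown / total if total else 0.0


# Toy data, not from Flickr30k or COCO.
captions = ["a man", "a dog"]
worddict = {"a": 2, "man": 3}
rate = unk_rate(captions, worddict)
print("UNK rate:", rate)
```

Running the same check on tokenized training captions with the loaded dictionary.pkl gives a quick sanity check before starting a long training run.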

ozancaglayan commented 8 years ago

Hi,

I also get empty samples, i.e. beam search returns samples containing only an <eos> (0). I think I created the dictionary correctly (eos=0, unk=1, and all the other words in descending frequency order). The validation loss is always approximately 55. Any ideas?
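
The dictionary layout described above (0 reserved for eos, 1 for unk, remaining words ordered by descending frequency) could be built roughly as follows. This is a minimal sketch under stated assumptions, not the repository's own preprocessing: the function name `build_dictionary`, the lowercasing, the whitespace tokenization, and the alphabetical tie-break are all illustrative choices, and the code targets Python 3. Whether the training code expects explicit entries for ids 0 and 1 is not confirmed in this thread, so the sketch simply leaves them unused.

```python
from collections import Counter


def build_dictionary(captions):
    """Map each word to an integer id, most frequent word first.

    Ids 0 and 1 are left free for the eos and unk tokens, following the
    convention described in the comment above (an assumption here).
    """
    counts = Counter(
        word for caption in captions for word in caption.lower().split()
    )
    # Descending frequency; ties broken alphabetically so output is stable.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    # Real words start at id 2.
    return {word: idx + 2 for idx, word in enumerate(ordered)}


# Toy captions, not from any dataset.
captions = ["a man rides a skateboard", "a man walks"]
worddict = build_dictionary(captions)
print(worddict)
```

If the captions are tokenized differently when the dictionary is built than when they are loaded for training, frequent words can end up missing from the lookup, so the same tokenization should be used in both places.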

smt-HS commented 8 years ago

Hi @noureldien, I ran into the same problem as you did. If possible, could you email me the dictionary.pkl? zhiwen.tom.tang@gmail.com

Thank you very much!!

xxxyyyzzzz commented 6 years ago

Hi @noureldien, I have the same issue. If possible, could you email me the dictionary.pkl for the coco dataset? janulucky97@gmail.com