Questions about the implementation - Githubissues

grazder / Image-Captioning-Inference


Questions about the implementation #1

Open wentianli opened 3 years ago

wentianli commented 3 years ago

Thanks a lot for your work! It really saved me a lot of time.

I suspect that the input image should be BGR instead of RGB. https://github.com/grazder/Image-Captioning-Inference/blob/9d2d4a08475e24bd06a9e5766bdd13c570791a87/captioning/data/dataloaderraw.py#L45 When tested on car.jpeg, RGB input gives "old truck" whereas BGR input gives "blue truck". The latter has the correct color.
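For readers who want to try the same comparison, here is a minimal sketch of a channel swap with NumPy. The helper name `swap_rgb_bgr` is hypothetical and not part of the repository, and where exactly the swap would go in dataloaderraw.py is an assumption left to the reader.

```python
import numpy as np


def swap_rgb_bgr(image: np.ndarray) -> np.ndarray:
    """Reverse the channel axis of an H x W x 3 image (RGB <-> BGR).

    Hypothetical helper for illustration; not taken from the repository.
    """
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("expected an H x W x 3 array")
    # Reversing the last axis maps (R, G, B) to (B, G, R) and vice versa.
    # ascontiguousarray copies the reversed view into regular memory layout.
    return np.ascontiguousarray(image[:, :, ::-1])


# A single pure-red pixel stored in RGB order.
rgb = np.array([[[255, 0, 0]]], dtype=np.uint8)
bgr = swap_rgb_bgr(rgb)
```

Applying the function twice returns the original array, so it can be used to test either channel order against the same image.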

I ran the code with pytorch==1.4. I found that in order to run properly on gpu, the option here should be "cuda" instead of "gpu". https://github.com/grazder/Image-Captioning-Inference/blob/9d2d4a08475e24bd06a9e5766bdd13c570791a87/Captions.py#L30

Correspondingly, these two lines have to be changed to fc_batch.append(tmp_fc.cpu()) and att_batch.append(tmp_att.cpu()), so that the features are moved back to the CPU before they are collected: https://github.com/grazder/Image-Captioning-Inference/blob/9d2d4a08475e24bd06a9e5766bdd13c570791a87/captioning/data/dataloaderraw.py#L48 https://github.com/grazder/Image-Captioning-Inference/blob/9d2d4a08475e24bd06a9e5766bdd13c570791a87/captioning/data/dataloaderraw.py#L49
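On the "cuda" versus "gpu" point: PyTorch names its GPU device type "cuda", and "gpu" is not an accepted device string. One possible way to guard against this is sketched below. The helper `normalize_device_name` is hypothetical and not part of the repository; it is plain Python so it can be read without PyTorch installed. In real use, the `cuda_available` flag would presumably come from `torch.cuda.is_available()` and the result would be passed to `torch.device`.

```python
def normalize_device_name(requested: str, cuda_available: bool) -> str:
    """Map a user-supplied device option to a PyTorch-style device string.

    Hypothetical helper for illustration; not taken from the repository.
    """
    name = requested.strip().lower()
    # Accept the common mistake "gpu" and translate it to "cuda".
    if name == "gpu":
        name = "cuda"
    # Fall back to the CPU when a CUDA device is requested but unavailable.
    if name.startswith("cuda") and not cuda_available:
        return "cpu"
    return name


# Example: the option value "gpu" on a machine where CUDA is available.
device_name = normalize_device_name("gpu", cuda_available=True)
```

Falling back to "cpu" keeps the CPU-only use case working unchanged, at the cost of hiding a misconfigured GPU setup; raising an error instead would be an equally reasonable choice.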

grazder commented 3 years ago

Thank you for the feedback! Yeah, I made this project for CPU purposes and haven't tried to run on GPU yet. I will test on GPU and try your suggestions soon!