jcjohnson / densecap

Dense image captioning in Torch
MIT License

HDF5 file #73

Open davideaurucci opened 7 years ago

davideaurucci commented 7 years ago

Is it possible to find the HDF5 file somewhere? I would like to run and test the algorithm trained on the Visual Genome dataset that you also used in your paper. I read that preprocessing this dataset produces an HDF5 file larger than 100 GB. Is this the actual model, which must then be trained, or is it a special representation of the data?

Moreover, I would like to know how the pre-trained model that you provide here on GitHub differs: how many images and region descriptions did you use for it? Do you think that using the complete Visual Genome dataset would make a significant difference?