ghost opened this issue 8 years ago
You can unroll your minibatch into a series of training sub-steps: accumulate the gradients (deltas) across the sub-steps, then apply the accumulated gradient as a single update.
@nakosung Would you please give me the implementation details of this? Sorry, but I am very new to LSTM.
You can do sub-stepping with solver.iter_size > 1. Unfortunately, I don't have a public example for a large LSTM.
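For intuition, here is a minimal NumPy sketch of what gradient accumulation (what Caffe's `solver.iter_size` does internally) looks like: several small forward/backward passes sum their gradients, and a single parameter update is applied with the averaged result. The toy loss and data below are placeholders, not anything from Caffe itself.

```python
import numpy as np

def grad(w, x, y):
    # gradient of the toy loss 0.5 * (w*x - y)^2 with respect to w
    return (w * x - y) * x

w = 0.0
lr = 0.1
batch_x = np.array([1.0, 2.0, 3.0, 4.0])
batch_y = np.array([2.0, 4.0, 6.0, 8.0])

iter_size = 2                     # number of sub-steps per weight update
sub = len(batch_x) // iter_size   # examples processed in each sub-step

acc = 0.0
for i in range(iter_size):
    xs = batch_x[i * sub:(i + 1) * sub]
    ys = batch_y[i * sub:(i + 1) * sub]
    acc += grad(w, xs, ys).sum()  # accumulate deltas over sub-steps

# one update with the gradient averaged over the full effective batch
w -= lr * acc / len(batch_x)
```

The result is identical to processing the full minibatch in one pass, which is why `iter_size` lets you simulate a large batch that would not fit in memory.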
I am trying to generate a brief English description of input images. For that purpose, I am using a CNN and an LSTM. So far, I am done with the CNN module, from which I get a 4096-dimensional vector as the output of the fc7 layer of my caffemodel (16-layer VGG net). Also, if I add a Softmax layer on top of the CNN, I am able to get class labels as follows:
e.g. for an image of a person with a mobile phone in his hand sitting on a bed, I get class labels like 'person', 'mobile', 'bed'.
Now, I wish to generate a sentence from these words or by using the 4096-feature vector I get as CNN output.
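One common approach (used by captioning models such as LRCN) is to project the 4096-d fc7 feature into the LSTM's initial hidden state and then greedily decode word ids until an end-of-sentence token. Below is a hypothetical NumPy sketch of that decoding loop; the weights are random placeholders (in practice they come from training), and the vocabulary size, token ids, and dimensions are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = 10    # toy vocabulary; a real captioner uses thousands of words
HIDDEN = 8    # toy hidden size
FEAT = 4096   # dimensionality of the fc7 feature
EOS = 0       # assumed id of the end-of-sentence token
BOS = 1       # assumed id of the begin-of-sentence token

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# random placeholder parameters (learned during training in practice)
W_img = rng.normal(0, 0.1, (HIDDEN, FEAT))      # image feature -> initial hidden
W_x = rng.normal(0, 0.1, (4 * HIDDEN, VOCAB))   # input word (one-hot) -> gates
W_h = rng.normal(0, 0.1, (4 * HIDDEN, HIDDEN))  # recurrent weights
W_out = rng.normal(0, 0.1, (VOCAB, HIDDEN))     # hidden state -> vocab logits

def lstm_step(x, h, c):
    # standard LSTM cell: input, forget, output gates and candidate update
    z = W_x @ x + W_h @ h
    i, f, o, g = np.split(z, 4)
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c

def caption(feat, max_len=10):
    h = np.tanh(W_img @ feat)   # seed the hidden state with the image feature
    c = np.zeros(HIDDEN)
    word, out = BOS, []
    for _ in range(max_len):
        x = np.zeros(VOCAB)
        x[word] = 1.0
        h, c = lstm_step(x, h, c)
        word = int(np.argmax(W_out @ h))  # greedy decoding: pick the top word
        if word == EOS:
            break
        out.append(word)
    return out

ids = caption(rng.normal(0, 1.0, FEAT))  # word ids; map to words via a vocab
```

With trained weights, the returned ids would be looked up in the vocabulary to produce the sentence; here they are meaningless because the parameters are random. The key design point is that the image feature conditions the LSTM once (via the initial state), after which the sentence is generated word by word.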