Closed 7 years ago

@kmyid
Hi, I'm Jack. I recently trained a model of the descriptor network, but it didn't work well. Could you tell me how your dataset for training the descriptor network is generated? And could you tell me what validation error you get when training the descriptor network (mine is about 2.1)?
My process:
That's what I get, for example:
Thanks!!! :)
Hi Jack,
Could you tell me how your dataset for training the descriptor network is generated?
We do a similar process. However, in our case, we use the actual raw SIFT keypoints detected in each image, not the reprojected points. We crop a region of 6 times the scale at the feature point location, which is the same area the SIFT descriptor looks at.
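For illustration, here is a minimal sketch (not the authors' code) of cropping such an orientation-normalized region of 6x the SIFT scale around a keypoint with OpenCV; the function name, the output size, and the exact angle convention are my assumptions:

import cv2
import numpy as np

def crop_patch(image, x, y, scale, angle_deg, out_size=64, support=6.0):
    # Map a (support * scale)-wide, orientation-normalized square around
    # (x, y) onto a fixed out_size x out_size patch.
    radius = support * scale / 2.0            # half-width of the crop, in pixels
    s = out_size / (2.0 * radius)             # resampling factor
    theta = np.deg2rad(angle_deg)
    cos_t, sin_t = s * np.cos(theta), s * np.sin(theta)
    # Forward (source -> patch) similarity transform; warpAffine inverts it internally.
    M = np.array([[cos_t,  sin_t, out_size / 2.0 - (cos_t * x + sin_t * y)],
                  [-sin_t, cos_t, out_size / 2.0 - (-sin_t * x + cos_t * y)]],
                 dtype=np.float32)
    return cv2.warpAffine(image, M, (out_size, out_size), flags=cv2.INTER_LINEAR)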
And could you tell me what validation error you get when training the descriptor network (mine is about 2.1)?
I am not really sure if I can give you a value that can be compared, as we have multiple constants multiplied to balance the positives and negatives. One very important thing is to apply hard mining, depending on the data. And this hard mining should progressively increase as you proceed with learning. Have a look at Eduard's descriptor paper, as it is a paper specifically on this learning strategy.
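As a rough sketch of what progressive hard mining can look like (this is an assumed scheme, not necessarily the exact one used for LIFT): compute per-sample losses, keep only the hardest fraction of the batch, and shrink that fraction as training proceeds.

import theano.tensor as T

def hard_mined_mean(per_sample_loss, keep_ratio):
    # per_sample_loss: 1-D tensor of per-sample losses.
    # keep_ratio: Python float in (0, 1]; anneal it downward during training.
    n = per_sample_loss.shape[0]
    k = T.maximum(1, T.cast(T.floor(keep_ratio * n), 'int64'))
    sorted_loss = T.sort(per_sample_loss)     # ascending: hardest samples last
    return sorted_loss[n - k:].mean()

# e.g. keep the whole batch at first, then mine progressively harder:
# keep_ratio = max(0.25, 1.0 - 0.05 * epoch)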
Hope my answer helps! Kwang
I really appreciate your help, Kwang. Your answer does help a lot! One more question: did you generate the data using VisualSFM's output? I'm new to VisualSFM, and I fail to find the file which stores information about the structure points (including their scale, position, and orientation in the corresponding image pairs).
What I can retrieve so far:
- from *.sift: [x, y, color, scale, orientation]
- from *.nvm: information about the structure (without scale and orientation) as well as the corresponding image IDs.
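For context: the NVM file indeed stores no scale or orientation; for each 3D point it only records which image and which feature index each measurement came from, and that feature index is what points back into the corresponding *.sift file. Below is a minimal sketch of a parser for the point section, assuming a plain NVM_V3 header, a single model, and no spaces in image filenames:

def parse_nvm_points(path):
    # Token-based parser for the NVM_V3 point section.
    with open(path) as f:
        tokens = f.read().split()
    assert tokens[0].startswith('NVM_V3')
    i = 1
    num_cams = int(tokens[i]); i += 1
    i += num_cams * 11           # per camera: name f qw qx qy qz cx cy cz dist 0
    num_points = int(tokens[i]); i += 1
    points = []
    for _ in range(num_points):
        xyz = tuple(map(float, tokens[i:i + 3])); i += 3
        i += 3                   # skip RGB
        num_meas = int(tokens[i]); i += 1
        meas = []
        for _ in range(num_meas):
            img_idx, feat_idx = int(tokens[i]), int(tokens[i + 1])
            meas.append((img_idx, feat_idx))    # feat_idx = row in that image's *.sift
            i += 4               # image_idx feature_idx mx my
        points.append((xyz, meas))
    return points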
And below is the loss function for the keypoint training. Is it the same as what you describe in the paper?
import numpy as np
import theano
import theano.tensor as T
import lasagne

floatX = theano.config.floatX

def class_loss(score_layer, weight, label):
    # Log-softmax of the keypoint score map, followed by a squared hinge.
    # label = +1 for patches 1-3 (they contain the keypoint), -1 for patch 4.
    score = T.log(lasagne.nonlinearities.softmax(
        lasagne.layers.get_output(score_layer, deterministic=False)))
    return np.cast[floatX](weight) * T.nnet.relu(
        np.cast[floatX](1.) - np.cast[floatX](label) * score) ** 2

# Weights 1/6, 1/6, 1/6, 3/6 balance the three positives against the negative.
loss_class = (class_loss(layers[0]["kp-scoremap"], 1. / 6, +1) +
              class_loss(layers[1]["kp-scoremap"], 1. / 6, +1) +
              class_loss(layers[2]["kp-scoremap"], 1. / 6, +1) +
              class_loss(layers[3]["kp-scoremap"], 3. / 6, -1))
loss_class = lasagne.objectives.aggregate(loss_class, mode='mean')

# Pairwise loss between the descriptors of the two corresponding patches.
prediction1 = lasagne.layers.get_output(layers[0]["desc-output"], deterministic=False)
prediction2 = lasagne.layers.get_output(layers[1]["desc-output"], deterministic=False)
loss_pair = T.sum((prediction1 - prediction2) ** 2 + 1e-7, axis=1)
loss_pair = lasagne.objectives.aggregate(loss_pair, mode='mean')

loss = loss_class + loss_pair

params = lasagne.layers.get_all_params(layers[0]["kp-scoremap"], trainable=True)
print("Kp-output params:", params)
updates = lasagne.updates.sgd(loss, params, np.cast[floatX](config.learning_rate))
myNet.train_ori_stochastic = theano.function(inputs=[], outputs=loss,
                                             givens=givens_train, updates=updates)
Thanks again! @kmyid
I think it's better if @etrulls answers this :-)
I fail to find the file which stores information about the structure points (including their scale, position, and orientation in the corresponding image pairs).
And below is the loss function for the keypoint training. Is it the same as what you describe in the paper?
You also need to include the overlap loss, in the pre-training phase at least. As for the class loss, I think it's similar to what we did. You also need a hyperparameter to balance loss_class and loss_pair. This parameter should be data-dependent.
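For concreteness, the balancing could look like the following tiny sketch on top of the code above (lambda_pair is an illustrative name and value, not from the repo):

lambda_pair = np.cast[floatX](0.1)    # data-dependent; tune on validation
loss = loss_class + lambda_pair * loss_pair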
Sorry about the delay, I wasn't receiving issue notifications. Extracting patches from the NVM and SIFT files is quite easy; this does most of the work: https://github.com/jheinly/visual_sfm_support (it's mostly self-explanatory)
You should be able to retrieve the SIFT keypoints used by the reconstruction, and from there you can extract the patches from the original images.
Thanks for your reply, I will check it out immediately :)
Hi, sorry to bother you guys, but I really wonder how to extract the data to train the model described in LIFT. In the paper, you say Roman Forum has 1.6k images and 51k unique points, but the dataset I downloaded has 7k images. Even after VisualSFM's 3D reconstruction, there are still 1.8k images remaining, and 400k unique 3D points across all the nvm files. Given the bad performance of my trained model, I suspect I did something wrong or different from you.
I generated the nvm files in this manner:
1. Start VisualSFM
2. Open multiple images -> choose all images in the Roman dataset (7k in total)
3. Compute missing matches
4. Compute 3D reconstruction
5. Save NView matches
This gave me 22 nvm files for 22 different scenes. I parsed each nvm file, and below is my parsing log; you can see that the number of points is very large... Could you tell me where I went wrong, please? Thank you so much. @kmyid @etrulls
Done loading ../data/TrainingData/Roman_Forum/roman1.nvm
Done loading ../data/TrainingData/Roman_Forum/roman2.nvm
Done loading ../data/TrainingData/Roman_Forum/roman3.nvm
Done loading ../data/TrainingData/Roman_Forum/roman4.nvm
Done loading ../data/TrainingData/Roman_Forum/roman5.nvm
Done loading ../data/TrainingData/Roman_Forum/roman6.nvm
Done loading ../data/TrainingData/Roman_Forum/roman7.nvm
Done loading ../data/TrainingData/Roman_Forum/roman8.nvm
Done loading ../data/TrainingData/Roman_Forum/roman9.nvm
Done loading ../data/TrainingData/Roman_Forum/roman10.nvm
Done loading ../data/TrainingData/Roman_Forum/roman11.nvm
Done loading ../data/TrainingData/Roman_Forum/roman12.nvm
Done loading ../data/TrainingData/Roman_Forum/roman13.nvm
Done loading ../data/TrainingData/Roman_Forum/roman14.nvm
Done loading ../data/TrainingData/Roman_Forum/roman15.nvm
Done loading ../data/TrainingData/Roman_Forum/roman16.nvm
Done loading ../data/TrainingData/Roman_Forum/roman17.nvm
Done loading ../data/TrainingData/Roman_Forum/roman18.nvm
Done loading ../data/TrainingData/Roman_Forum/roman19.nvm
Done loading ../data/TrainingData/Roman_Forum/roman20.nvm
Done loading ../data/TrainingData/Roman_Forum/roman21.nvm
Done loading ../data/TrainingData/Roman_Forum/roman22.nvm
This is the link where I downloaded the Roman dataset.
Hi Jack,
As ICCV is approaching, I think I won't have much time to answer you. I'll try to get back to you as soon as I can!
Cheers, Kwang
Wow, I am really looking forward to seeing your new work, and I wish you great success at ICCV~ :)
Hi Jack,
I believe I am a bit late now. We are working on releasing the training part as well, hopefully soon. This time, it will be in TensorFlow.