BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

Creating data for training Siamese network #2550

Closed swamiviv closed 7 years ago

swamiviv commented 9 years ago

I have a basic question regarding training a Siamese network. The MNIST example given in the docs loads data from binary files using the ./examples/siamese/create_mnist_siamese.sh script. In my case, I have the images stored on disk (assume they are grayscale for now). How do I create the corresponding leveldb database? That is, is there a way to do this by creating train.txt-style files? Has anyone tried this?

FlorisGaisser commented 9 years ago

Hi there! I'm new to Caffe, but I'm going to do something similar. I was actually planning to change the code in convert_mnist_siamese_data.cpp, using convert_imageset.cpp (in tools) as a starting point.

I'm not sure how to send you the code once I'm finished, though. Should I make a separate branch and do a PR?

bhack commented 9 years ago

@FlorisGaisser Yes, a PR could be interesting. @Wangyida is also working on triplet loading and could probably contribute a PR. /cc @mtamburrano

wangyida commented 9 years ago

I have already finished the triplet LMDB data generation code for MNIST data, based on the Siamese example, as convert_mnist_triplet_data.cpp, and I will make a PR. More work is also needed to modify the loss layer in caffe/src, because there is currently no loss function suitable for triplet input features. @bhack @mtamburrano @FlorisGaisser

swamiviv commented 9 years ago

@FlorisGaisser That's great. For my purposes, I think the loss can be implemented as in the Siamese network tutorial, so the changes I need are in the I/O. Could you send me a message with the changes or the code fragment that you modified to support .txt file input? Thanks a lot!

wangyida commented 9 years ago

All the work for triplet training is available as a rough PR, #2552; you can download it from there for reference. @swamiviv @FlorisGaisser @mtamburrano

FlorisGaisser commented 9 years ago

I've written a small executable in the style of convert_mnist_siamese_data.cpp and added it to the same folder. I still have to test it, but it's morning here, so I hope to get you a PR by the end of the day.

FlorisGaisser commented 9 years ago

I've finished my code and, after some figuring out, discovered how to make a PR: https://github.com/BVLC/caffe/pull/2570 There will be an update soon, in which I will also provide a tool to create the imageset text file.

FlorisGaisser commented 9 years ago

There is an update: you can now generate the imageset text file in a few ways. @swamiviv Is this of use to you?

swamiviv commented 9 years ago

@FlorisGaisser Yes, that helps. Thanks a lot. I am not in a position to test it right now, but I will do so in a few days.

FlorisGaisser commented 9 years ago

In the PR I created by mistake, https://github.com/FlorisGaisser/caffe/pull/1, @swamiviv commented that https://github.com/BVLC/caffe/pull/2570 is working for him.

I also want to point out that I've created another PR, https://github.com/BVLC/caffe/pull/2608, which extends the ImageData layer so that it can load Siamese pairs from a text file directly into the network without converting them to a leveldb or LMDB. @Wangyida It also supports triplets.
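The text-file format that these PRs expect is not spelled out in this thread. As a rough sketch only, assuming a hypothetical pair-list format of `path_a path_b same` per line (where `same` is 1 when both images share a class), a pair list could be built from a `path label` listing in the style used by convert_imageset like this. The function names `parse_listing`, `make_pairs`, and `format_pairs` are made up for illustration and are not part of Caffe or of either PR.

```python
import itertools
import random


def parse_listing(lines):
    """Parse 'path label' lines into (path, int_label) tuples, skipping blanks."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last run of whitespace so paths may contain spaces.
        path, label = line.rsplit(None, 1)
        entries.append((path, int(label)))
    return entries


def make_pairs(entries, max_pairs=None, seed=0):
    """Return shuffled (path_a, path_b, same) tuples for all unordered pairs.

    'same' is 1 when the two class labels match, otherwise 0.
    """
    pairs = [(pa, pb, int(la == lb))
             for (pa, la), (pb, lb) in itertools.combinations(entries, 2)]
    random.Random(seed).shuffle(pairs)  # deterministic shuffle for a given seed
    return pairs if max_pairs is None else pairs[:max_pairs]


def format_pairs(pairs):
    """Render pairs in the assumed 'path_a path_b same' text format."""
    return ["%s %s %d" % pair for pair in pairs]
```

Enumerating all pairs grows quadratically with the number of images and usually yields far more dissimilar than similar pairs, so in practice `max_pairs` or some form of balanced sampling would likely be needed.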

wangyida commented 9 years ago

@FlorisGaisser Great, I will have a look at the label definition in your code; my LMDB code isn't very efficient.

FlorisGaisser commented 9 years ago

@Wangyida I was also wondering how to do the labels, because at the moment I think only a single Dtype (float/double) is supported for each input (set). For the triplets I've used these labels in the test:

0: none the same
1: first two the same
2: first and third the same
3: second and third the same
4: all the same

This is of course assuming that it is going to be used with a contrastive loss layer.

But as the label is stored in a Blob it should be possible to make a label contain multiple values, by for example using the 'channels' dimension for each image.
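To make the label scheme above concrete, here is a minimal sketch of how the single triplet label could be derived from the three per-image class labels. The function name `triplet_label` is hypothetical and is not taken from the PR.

```python
def triplet_label(first, second, third):
    """Encode which members of a triplet share a class, using the 0-4 scheme above."""
    if first == second == third:
        return 4  # all the same
    if first == second:
        return 1  # first two the same
    if first == third:
        return 2  # first and third the same
    if second == third:
        return 3  # second and third the same
    return 0      # none the same
```

For example, `triplet_label(7, 7, 2)` gives 1 and `triplet_label(3, 5, 9)` gives 0.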

wangyida commented 9 years ago

In triplet training as described in [1], I think no labels are needed, so your labels 0, 2, and 3 are not used for triplet training, and for label 4, the first two are more similar than the third one. @pwohlhart My code in PR #2603 uses simple MNIST data for data construction; you can have a look at the code for creating the LMDB data. @FlorisGaisser

[1] http://lrs.icg.tugraz.at/pubs/wohlhart_cvpr15.pdf
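To illustrate why a triplet needs no explicit label, here is a small sketch of a common hinge-style triplet loss, where the ordering (anchor, similar, dissimilar) itself carries the supervision. This is a generic formulation chosen for illustration and is not necessarily the exact loss used in [1] or in the PRs; the `margin` default and the function names are assumptions.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def triplet_hinge_loss(anchor, positive, negative, margin=1.0):
    """Penalize triplets where the positive is not closer than the negative by 'margin'.

    The loss is zero once d(anchor, negative) exceeds d(anchor, positive) + margin.
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, margin + d_pos - d_neg)
```

With this loss, only the position of each sample within the triplet matters, which is consistent with the point that the first two samples are expected to be more similar than the third.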

FlorisGaisser commented 9 years ago

@Wangyida I understand how you use the triplets, and I have to say this paper is certainly interesting. Thank you for making me aware of it.

nassarofficial commented 8 years ago

Guys, can someone point me in the right direction? I have pairs of images ready (an aerial image and a street-view image), and I want to train on them using a Siamese network so that afterwards I can take other pairs and check whether they are similar, based on the distance. How can I do this? Is each pair considered a separate class?

shelhamer commented 7 years ago

From https://github.com/BVLC/caffe/blob/master/CONTRIBUTING.md:

Please do not post usage, installation, or modeling questions, or other requests for help to Issues. Use the caffe-users list instead. This helps developers maintain a clear, uncluttered, and efficient view of the state of Caffe.