BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

How can I load own dataset (not images) to train a caffe model? #6819

Open Entel opened 5 years ago

Entel commented 5 years ago

I am very new to Caffe. I just installed it and ran the MNIST example successfully. However, I don't quite understand how to use Caffe with my own dataset, which is a list of feature vectors that all have the same dimension. What should the data format be? Is it loaded the same way as an image dataset?

Many thanks!!!! Any advice would be appreciated!

Hong-333 commented 5 years ago

Are you trying to train on your own dataset?

If so, you have to create an LMDB and train with it.

Entel commented 5 years ago

Thanks for your reply. Excuse me, my dataset is a list like:

FEATURE_0_0, FEATURE_0_1, FEATURE_0_2, ... FEATURE_0_N
FEATURE_1_0, FEATURE_1_1, FEATURE_1_2, ... FEATURE_1_N
FEATURE_2_0, FEATURE_2_1, FEATURE_2_2, ... FEATURE_2_N
...

If I want to train an autoencoder on this data, how can I convert it into an LMDB? As I understand it, the input format for an LMDB looks like:

IMAGE_PATH  LABEL
IMAGE_PATH  LABEL
IMAGE_PATH  LABEL
...

Should I make the list into:

FEATURE_0_0, FEATURE_0_1, FEATURE_0_2, ... FEATURE_0_N   FEATURE_0_0, FEATURE_0_1, FEATURE_0_2, ... FEATURE_0_N 
FEATURE_1_0, FEATURE_1_1, FEATURE_1_2, ... FEATURE_1_N   FEATURE_1_0, FEATURE_1_1, FEATURE_1_2, ... FEATURE_1_N
FEATURE_2_0, FEATURE_2_1, FEATURE_2_2, ... FEATURE_2_N   FEATURE_2_0, FEATURE_2_1, FEATURE_2_2, ... FEATURE_2_N
...

Can I use a fully connected layer as the output? Thank you.

Hong-333 commented 5 years ago

I think you need to convert the feature data into image files,

and then make a file list (.txt) like this:

IMAGE_PATH LABEL

9/label9_77.jpg 9
9/label9_78.jpg 9
9/label9_80.jpg 9
9/label9_82.jpg 9
9/label9_83.jpg 9
9/label9_84.jpg 9
9/label9_85.jpg 9
9/label9_86.jpg 9
9/label9_91.jpg 9

Then you can create an LMDB using convert_imageset.
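
The file list above could be generated with a short script rather than written by hand. This is only a minimal sketch: the function names `make_list_lines` and `write_list_file` and the output name `train.txt` are hypothetical, and it assumes the images already exist on disk with paths relative to the image root folder and with integer labels.

```python
def make_list_lines(entries):
    """Build 'IMAGE_PATH LABEL' lines from (relative_path, label) pairs.

    Labels are assumed to be non-negative integers, as in the list above.
    """
    lines = []
    for path, label in entries:
        if int(label) != label or label < 0:
            raise ValueError(f"label for {path!r} must be a non-negative integer")
        lines.append(f"{path} {int(label)}")
    return lines


def write_list_file(entries, out_path):
    """Write the list file, one 'IMAGE_PATH LABEL' pair per line."""
    with open(out_path, "w") as f:
        f.write("\n".join(make_list_lines(entries)) + "\n")


if __name__ == "__main__":
    # Hypothetical entries mirroring the example list.
    entries = [("9/label9_77.jpg", 9), ("9/label9_78.jpg", 9)]
    write_list_file(entries, "train.txt")
```

The resulting `train.txt` would then be passed to convert_imageset together with the image root folder and the name of the database to create.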

I don't know everything, so there may be other ways. :)
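
One such other way, which may suit plain feature vectors better because image files usually store 8-bit pixel values, is Caffe's HDF5Data layer. The sketch below is an assumption about how the comma-separated rows from the question could be prepared, not a tested pipeline: the function names, the file names, and the dataset name `data` are hypothetical, and `write_hdf5` needs the third-party `numpy` and `h5py` packages.

```python
import csv
import io


def parse_feature_rows(text):
    """Parse comma-separated feature rows into lists of floats.

    Raises ValueError if the rows do not all have the same dimension.
    """
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not record:  # skip empty lines
            continue
        rows.append([float(value) for value in record])
    if not rows:
        raise ValueError("no feature rows found")
    dim = len(rows[0])
    for index, row in enumerate(rows):
        if len(row) != dim:
            raise ValueError(f"row {index} has {len(row)} values, expected {dim}")
    return rows


def write_hdf5(rows, h5_path, list_path, dataset_name="data"):
    """Store the rows as a float32 (N, D) dataset and write the list file
    that names the HDF5 file, one path per line."""
    # Imported here so the parser above can be used without these packages.
    import numpy as np
    import h5py

    data = np.asarray(rows, dtype=np.float32)
    with h5py.File(h5_path, "w") as f:
        f.create_dataset(dataset_name, data=data)
    with open(list_path, "w") as f:
        f.write(h5_path + "\n")


if __name__ == "__main__":
    # Hypothetical two-row dataset standing in for the real feature list.
    rows = parse_feature_rows("0.1, 0.2, 0.3\n1, 2, 3\n")
    write_hdf5(rows, "train.h5", "train_h5_list.txt")
```

In the network definition, an HDF5Data layer would point its `source` at the list file and use a top blob with the same name as the dataset (`data` here). For an autoencoder, the reconstruction loss can compare the output with that same blob, so no separate label dataset should be needed.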