dniku closed this issue 9 years ago
+1 very needed feature
I'm starting on this now. Do you guys see a need for arbitrary numbers of databases per phase (train/val/test)? I can hardcode it to two databases per phase, `images` and `labels`, but I'm thinking people might have more complicated use cases. Maybe they would want some complicated mix of LMDBs including `images`, `coarse_classifications`, `fine_classifications`, `segmentations`, etc.
Are there any use cases you know of that would make this a worthwhile feature?
Bounding box estimation, for example. 4 integers to encode one box.
Thanks for the response! After reading back through these comments, maybe I should explain how I'm implementing this:
- The `data` database contains unlabeled images (`datum.label` isn't used).
- The `labels` database contains an N-dimensional label for the image. It can be `Kx1` for a classification, or `4x1` for a single bounding box, or `HxWxK` for a per-pixel segmentation, etc.
- `data` database.

These are the types of problems which may require more than two databases:
My question is about whether anyone is using models which perform these more complex types of tasks.
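To make the two-database layout concrete, here is a minimal sketch. It assumes plain Python dicts stand in for the `data` and `labels` LMDBs and that both are keyed identically. The names `shape_of`, `add_entry`, `data_db` and `labels_db`, as well as the key format, are hypothetical and are not DIGITS code.

```python
from typing import Dict, List, Tuple, Union

# A label is a scalar or a rectangular nested list of scalars.
Nested = Union[float, List["Nested"]]


def shape_of(label: Nested) -> Tuple[int, ...]:
    """Return the shape of a rectangular nested list (scalars have shape ())."""
    if not isinstance(label, list):
        return ()
    if not label:
        return (0,)
    return (len(label),) + shape_of(label[0])


# Hypothetical stand-ins for the two databases, keyed identically.
data_db: Dict[str, bytes] = {}
labels_db: Dict[str, Nested] = {}


def add_entry(key: str, image_bytes: bytes, label: Nested) -> None:
    """Store an unlabeled image and its label under the same key."""
    data_db[key] = image_bytes
    labels_db[key] = label


# Classification over K = 3 classes, stored as Kx1.
add_entry("00000000", b"<image bytes>", [[0.0], [1.0], [0.0]])
# A single bounding box, stored as 4x1.
add_entry("00000001", b"<image bytes>", [[10.0], [20.0], [110.0], [220.0]])
# Per-pixel segmentation with H = 2, W = 2, K = 3, stored as HxWxK.
add_entry("00000002", b"<image bytes>", [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]],
])
```

Because the image and its label share a key, a reader that walks both stores in key order sees matching pairs, which is the property the two-database scheme relies on.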
Hello Luke, You guys are discussing about the features like "round color black tshirt" or "full sleeve black and white formal shirt"?
How can I train for this? Is it possible now with caffe-DIGITS?
Regards, Saneesh.
After reading through BVLC/caffe#523, BVLC/caffe#1414 and BVLC/caffe#1698, I think this quote best sums up what is possible with `Data` layers.
Caffe is perfectly happy with models that make matrix outputs and learn from matrix ground truths for problems where the output and truth have spatial dimensions e.g. reconstruction / de-noising, pixelwise semantic segmentation, sliding window detection, and so forth. The forward and backward passes for these models follow directly from the definitions and Caffe has always been capable of computing these. https://github.com/BVLC/caffe/issues/1698#issue-53768814
So I'll revise my previous statement and say that the `labels` database contains 1-, 2- or 3-dimensional labels, not "N-dimensional". That's because you have to use `Datum` in LMDBs, which is restricted to 3 dimensions. It may be easier to work with arbitrary blobs using `HDF5Data` layers.
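The 3-dimension limit can be sketched as follows. `DatumLike` is a hypothetical stand-in that only mirrors the `channels`, `height`, `width` and `float_data` fields of Caffe's `Datum` message. The helper `label_to_datum` and the choice to pad missing dimensions with 1 are assumptions for illustration, not how DIGITS or Caffe necessarily lay out labels.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class DatumLike:
    """Stand-in mirroring the shape fields of Caffe's Datum message."""
    channels: int
    height: int
    width: int
    float_data: List[float]


def label_to_datum(shape: Sequence[int], values: Sequence[float]) -> DatumLike:
    """Pack a flat label of the given 1-, 2- or 3-D shape into a DatumLike.

    Missing trailing dimensions are padded with 1 (an assumed convention),
    so a shape of (4,) becomes 4x1x1.
    """
    if not 1 <= len(shape) <= 3:
        raise ValueError("a Datum holds at most 3 dimensions, got %d" % len(shape))
    count = 1
    for dim in shape:
        count *= dim
    if count != len(values):
        raise ValueError(
            "shape %r does not match %d values" % (tuple(shape), len(values))
        )
    channels, height, width = (list(shape) + [1, 1])[:3]
    return DatumLike(channels, height, width, list(values))
```

A 4-dimensional label is rejected with a `ValueError`, which is the case where an `HDF5Data` layer would be the more natural fit.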
It has come to my attention that HDF5DataLayer can now produce multiple blob outputs https://github.com/BVLC/caffe/pull/1414#issuecomment-69285965
For the current iteration, I'm just going to assume that you've created your LMDBs manually and want to use DIGITS for running caffe. In the next iteration, I'll tackle picking a standard input data format so that DIGITS can create your LMDBs or HDF5 files or whatever.
@Saneesh
You guys are discussing about the features like "round color black tshirt" or "full sleeve black and white formal shirt"?
I'm discussing a much more general solution which solves many more problems. It will be a little overkill for the simple multi-labeled example you gave, but will definitely be sufficient.
How can I train for this? Is it possible now with caffe-DIGITS?
It's possible with Caffe, but see BVLC/caffe#1698 (it's hard). I'm currently working on adding it to DIGITS.
@lukeyeager Thank you very much! Can we expect this feature in DIGITS 3? How long will it take to finish, and how will DIGITS users be informed?
Regards, Saneesh
Hi, @lukeyeager. How is this issue evolving? Is the work in progress in a specific branch?
I should have something pushed to `master` within a week or two.
My in-progress branch is at `lukeyeager/generic-inference`. You can take a look at it if you want but beware - it could do bad things like corrupt your data from previous jobs.
Has the multiclass classification feature discussed here been implemented? What about the bounding box feature? These are very useful.
Currently DIGITS supports only a single integer label per image. For many applications, like regression or multiclass classification, this is not enough. I would like to propose adding support for both of these features.
There are several problems with this.
- `(.+)\s+(\d+)\s*$` (`path/to/image 123`). This could be replaced with `(.+)((?:\s+\d+(?:\.\d*))+)\s*$` to check for a list of ints or floats (`path/to/image 123 4. 5.67`).
- `Datum` structs. The problem is that a `Datum` has a field for a label, and that's a single `int` (proof). Currently DIGITS dumps the class label into that field. There are at least three solutions to this, but none seem particularly easy:
  - Add `float_label` or `int_labels` or `float_labels` to Datum. Increases memory usage (not much) and changes a widely-used structure (very bad).

I am currently working on patching `create_db.py` to support split databases, although I'm not sure that this is the best approach. Comments would be very much appreciated.
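As a rough illustration of parsing lines with several numeric labels, here is a minimal sketch. The pattern is a variant of the one proposed above, not the exact expression: the path group is made non-greedy and the fractional part optional, so that plain integers and all trailing numbers are captured. Both adjustments are assumptions on my part. `parse_line` is a hypothetical helper, not part of DIGITS, and negative numbers are not handled.

```python
import re
from typing import List, Tuple

# Variant of the proposed pattern: non-greedy path, optional fractional part.
LINE_RE = re.compile(r"(.+?)((?:\s+\d+(?:\.\d*)?)+)\s*$")


def parse_line(line: str) -> Tuple[str, List[float]]:
    """Split 'path label [label ...]' into the path and a list of floats."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("cannot parse line: %r" % line)
    path, labels = match.group(1), match.group(2)
    return path, [float(token) for token in labels.split()]


path, labels = parse_line("path/to/image 123 4. 5.67")
```

A line that ends in a single integer still parses, yielding a one-element list, so existing single-label files would remain valid input under this scheme.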