nshaud / DeepNetsForEO

Deep networks for Earth Observation

segnet_isprs_vaihingen_irrg.prototxt #14

Closed jorgenaya closed 7 years ago

jorgenaya commented 7 years ago

Dear, in segnet_isprs_vaihingen_irrg.prototxt the number of outputs is set to 6, but the dataset actually has 7 labels ('imp_surfaces', 'building', 'low_vegetation', 'tree', 'car', 'clutter', 'unclassified').

So I think I am missing something, because when I tried to adapt your example to my dataset with 3 labels (A, B and 'unclassified'), setting the number of outputs to 2, I got the error: 'error == cudaSuccess (77 vs. 0) an illegal memory access was encountered'.

Setting the outputs to 3 works... but this is not what I want, because in that case I am training the net on the unclassified values, which I want to avoid. I only want to train the net on the A and B labels.

I also tried to create the dataset with NaN values, but it crashes during LMDB creation.

Is there any way to ignore the unclassified values in the net? Can you help me? Thanks in advance.

nshaud commented 7 years ago

If you look closely, the examples are designed so that you never train on unclassified values. In the ISPRS dataset there are 6 classes; 'unclassified' is not one of them.

Now, as far as I understand, you want to perform binary classification with undefined labels. So, what you should have is ignore_label: 2 in your prototxt file (assuming that 0 is 'A', 1 is 'B' and 2 is 'unclassified').

However, the number of outputs in the last convolutional layer has to match the number of labels, even if you ignore the last one. So, if you want to train with A, B and undefined, your final layers should look like this:

layer {
  name: "conv1_1_D"
  type: "Convolution"
  bottom: "conv1_2_D"
  top: "conv1_1_D"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  convolution_param {
    num_output: 3 ### because you have 3 labels = A, B, undefined
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "msra"
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "conv1_1_D"
  bottom: "label"
  top: "loss"
  ignore_label: 2 ### Do not compute the loss (and do not train) on the 'undefined' labels
}

I hope this is clear enough.

jorgenaya commented 7 years ago

Yes, it's clear. I had tried something similar before, but it fails when building the net with: "caffe.LayerParameter has no field named 'ignore_label'".

So, what I tried now is the following:

layer {
  name: "conv1_1_D"
  type: "Convolution"
  bottom: "conv1_2_D"
  top: "conv1_1_D"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  convolution_param {
    num_output: 3 ### because you have 3 labels = A, B, undefined
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "msra"
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "conv1_1_D"
  bottom: "label"
  top: "loss"
  accuracy_param {
    ignore_label: 2 ### Do not compute the loss (and do not train) on the 'undefined' labels
  }
}

Is this correct?
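
For reference, in stock Caffe ignore_label is not a direct field of LayerParameter: a SoftmaxWithLoss layer reads it from loss_param, while an accuracy_param block is only read by an Accuracy layer. A sketch of the loss layer that should both parse and skip the 'undefined' pixels during training, keeping the same assumption that label 2 is 'unclassified' and the layer names used above:

layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "conv1_1_D"
  bottom: "label"
  top: "loss"
  loss_param {
    ignore_label: 2 ### pixels labelled 2 ('undefined') contribute nothing to the loss or the gradients
  }
}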

jorgenaya commented 7 years ago

After all, I suppose that I also have to modify the net for the inference step. Is this true?

nshaud commented 7 years ago

Yes, but it's not mandatory. What matters is that you do not train on the unclassified labels. At inference time, unclassified labels should not be predicted anyway, although you can add the ignore_label parameter to the inference network so that the accuracy and loss are computed without taking those pixels into account.
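
For the inference/test network, the same convention can be applied to an Accuracy layer, where ignore_label does live inside accuracy_param. A minimal sketch, again assuming label 2 is 'unclassified' and the same layer names as above:

layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "conv1_1_D"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
  accuracy_param {
    ignore_label: 2 ### unclassified pixels are excluded from the accuracy computation
  }
}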