torrvision / crfasrnn

This repository contains the source code for the semantic image segmentation method described in the ICCV 2015 paper: Conditional Random Fields as Recurrent Neural Networks. http://crfasrnn.torr.vision/

Train crfasrnn with 2 classes #21

Closed thuanvh closed 8 years ago

thuanvh commented 8 years ago

Hi all, I am trying to train crfasrnn on my own images. My classes are background and people, but my training does not converge: the loss values do not decrease, and the prediction output of my network is background (class 0) for all images. I think I have made a mistake in my training. Have you ever met a similar case? Could you give me any suggestions?

Thank you, Thuan

martinkersner commented 8 years ago

Hi!

I think that the problem could be anywhere.

Martin

thuanvh commented 8 years ago

Hi @martinkersner

I use the crfasrnn trained network to segment my images into 2 classes (background and foreground). My data is scaled between -1 and 1, and the input size is 250x250 instead of 500x500. I then train by customizing train_val.prototxt: I change num_output of some convolution layers from 21 to 2, and I add a weight filler and bias filler to each Convolution and Deconvolution parameter:

    weight_filler {
      type: "xavier"
      std: 0.1
    }
    bias_filler {
      type: "constant"
      value: 0.2
    }
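As a side note, a rough numpy sketch (not the repo's code) of what Caffe's "xavier" filler computes for a blob of shape (out, in, kh, kw): uniform samples in [-sqrt(3/fan_in), sqrt(3/fan_in)]. As far as I can tell, the xavier filler derives its scale from fan_in alone, so the `std: 0.1` field in the snippet above is probably ignored.

```python
import numpy as np

def xavier_fill(shape, rng=np.random.default_rng(0)):
    # fan_in = in_channels * kernel_h * kernel_w
    fan_in = int(np.prod(shape[1:]))
    bound = np.sqrt(3.0 / fan_in)
    return rng.uniform(-bound, bound, shape)

# e.g. a 2-class 1x1 scoring layer on top of 4096 channels
w = xavier_fill((2, 4096, 1, 1))
```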

Here is my output:

I0120 17:01:35.764663  7444 solver.cpp:473] Iteration 10000, lr = 1e-010
I0120 17:04:47.148594  7444 solver.cpp:213] Iteration 10100, loss = 51121.4
I0120 17:04:47.159598  7444 solver.cpp:228]     Train net output #0: loss = 59859.5 (* 1 = 59859.5 loss)
I0120 17:04:47.161098  7444 solver.cpp:473] Iteration 10100, lr = 1e-010
I0120 17:08:00.796912  7444 solver.cpp:213] Iteration 10200, loss = 52975.5
I0120 17:08:00.798411  7444 solver.cpp:228]     Train net output #0: loss = 55693 (* 1 = 55693 loss)
I0120 17:08:00.799911  7444 solver.cpp:473] Iteration 10200, lr = 1e-010
I0120 17:11:11.810928  7444 solver.cpp:213] Iteration 10300, loss = 52339.4
I0120 17:11:11.825934  7444 solver.cpp:228]     Train net output #0: loss = 50822.4 (* 1 = 50822.4 loss)
I0120 17:11:11.834436  7444 solver.cpp:473] Iteration 10300, lr = 1e-010
I0120 17:14:22.996568  7444 solver.cpp:213] Iteration 10400, loss = 51125.5
I0120 17:14:22.998069  7444 solver.cpp:228]     Train net output #0: loss = 63236 (* 1 = 63236 loss)
I0120 17:14:23.001070  7444 solver.cpp:473] Iteration 10400, lr = 1e-010
I0120 17:17:36.604817  7444 solver.cpp:213] Iteration 10500, loss = 51211.2
I0120 17:17:36.608319  7444 solver.cpp:228]     Train net output #0: loss = 38913 (* 1 = 38913 loss)
I0120 17:17:36.613320  7444 solver.cpp:473] Iteration 10500, lr = 1e-010
I0120 17:20:48.551285  7444 solver.cpp:213] Iteration 10600, loss = 51850.9
I0120 17:20:48.580294  7444 solver.cpp:228]     Train net output #0: loss = 46278.5 (* 1 = 46278.5 loss)
I0120 17:20:48.581795  7444 solver.cpp:473] Iteration 10600, lr = 1e-010
I0120 17:23:59.443377  7444 solver.cpp:213] Iteration 10700, loss = 50562.2
I0120 17:23:59.444877  7444 solver.cpp:228]     Train net output #0: loss = 50714.7 (* 1 = 50714.7 loss)
I0120 17:23:59.445876  7444 solver.cpp:473] Iteration 10700, lr = 1e-010
I0120 17:27:09.559470  7444 solver.cpp:213] Iteration 10800, loss = 50792.1
I0120 17:27:09.560971  7444 solver.cpp:228]     Train net output #0: loss = 53939.8 (* 1 = 53939.8 loss)
I0120 17:27:09.562472  7444 solver.cpp:473] Iteration 10800, lr = 1e-010
I0120 17:30:19.621381  7444 solver.cpp:213] Iteration 10900, loss = 50344.1
I0120 17:30:19.622881  7444 solver.cpp:228]     Train net output #0: loss = 67609.6 (* 1 = 67609.6 loss)
martinkersner commented 8 years ago

"Data is scaled between -1 and 1." What data do you mean, images or labels?

I don't understand why you scale your data between -1 and 1. Even though this may not affect your training (I guess), common practice is to keep images in the 0-255 range (and subtract their mean during training). Labels are usually encoded as integers in the range 0 to N-1, where N is the number of classes.
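A minimal numpy sketch of the convention described above (an assumed pipeline, not the repo's exact code): images stay in 0-255 with the mean subtracted, and labels are integer class indices.

```python
import numpy as np

# Commonly used BGR channel mean for VGG/ILSVRC-pretrained nets (assumption;
# substitute the mean of your own dataset).
MEAN_BGR = np.array([104.0, 117.0, 123.0], dtype=np.float32)

def preprocess(image_bgr):
    # image_bgr: HxWx3 uint8 array with values in 0-255
    return image_bgr.astype(np.float32) - MEAN_BGR

# Labels for a 2-class problem: 0 = background, 1 = person.
label = np.array([[0, 0, 1],
                  [0, 1, 1]], dtype=np.uint8)
```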

Because you only use weight fillers, rather than weights already obtained from fcn-8s or crfasrnn, training is certainly going to take a long time. The static (class 0) output of your network is likely caused by wrong weight initialization.
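A rough illustration of the point about starting from pretrained weights (plain numpy dicts standing in for Caffe blobs; the layer names here are hypothetical): reuse every blob whose shape still matches, and re-initialize only the resized classifier layers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are weights loaded from a pretrained 21-class model.
pretrained = {
    "conv1_1": rng.standard_normal((64, 3, 3, 3)),
    "score-fr": rng.standard_normal((21, 4096, 1, 1)),  # old 21-class scorer
}

# Shapes required by the modified 2-class network.
new_shapes = {
    "conv1_1": (64, 3, 3, 3),
    "score-fr": (2, 4096, 1, 1),  # num_output changed from 21 to 2
}

new_params = {}
for name, shape in new_shapes.items():
    w = pretrained.get(name)
    if w is not None and w.shape == shape:
        new_params[name] = w  # shape unchanged: keep the pretrained weights
    else:
        fan_in = int(np.prod(shape[1:]))
        bound = np.sqrt(3.0 / fan_in)  # Caffe-style "xavier" uniform bound
        new_params[name] = rng.uniform(-bound, bound, shape)
```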

You can check my repo https://github.com/martinkersner/train-CRF-RNN.

thuanvh commented 8 years ago

I scale only images. Labels are 0 and 1.

I tried to use crfasrnn weights as in this file train_val.prototxt.txt

I compared it with your file; it is the same except for the weight_filler. Don't you use a filler for the new convolution layers?

martinkersner commented 8 years ago

I tried to train without fillers, just with the weights from crfasrnn and what you can see in solve.py. Anyway, I should try it with them, because after 35 thousand iterations it seems to me that I get worse results, even though the loss is slightly decreasing.

If you still have a problem with unchanging predictions, don't train for more than 500 iterations; I get pretty reasonable results even at such an early stage of training. Personally, I would guess that there is some problem with your training data.

thuanvh commented 8 years ago

After reviewing solve.py, I think the problem is that I did not initialize the weights of the Deconvolution layers as solve.py does. I used only the caffe build for training and did not use solve.py. I will try it now. Thank you so much, Thuan
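For context, FCN-style solve.py scripts initialize the Deconvolution layers with bilinear upsampling kernels (via a helper usually called surgery.interp); if those weights stay at zero, the upsampled score map is constant and the net predicts a single class everywhere. A sketch of the standard bilinear kernel computation, assuming that initialization scheme:

```python
import numpy as np

def bilinear_kernel(size):
    # Build a size x size bilinear interpolation filter, the usual
    # initialization for an upsampling Deconvolution layer.
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return ((1 - abs(og[0] - center) / factor) *
            (1 - abs(og[1] - center) / factor))

k = bilinear_kernel(4)  # e.g. for a stride-2 deconvolution
```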

tybxiaobao commented 8 years ago

@thuanvh Hi, have you solved your problem? And what accuracy do you get for two-class (i.e. background and people) labeling?

thuanvh commented 8 years ago

The problem is solved. I am collecting data for training; I have no measurements yet.

Sam813 commented 6 years ago

@thuanvh I have the same problem. I know it has been a long time since this post, but may I ask how you solved it? I have medical images (CT) and want to use this network to segment tumors, so I have just two classes, background and tumor. After training, all the predictions from the network are black.