Open Kaido0 opened 7 years ago
There are 4 major files.
model.py contains the model definitions; you can use the existing models or define your own. train.py is the training script. Most parameters are set in the main function, and the data augmentation parameters are at line 68 where SegDataGenerator is initialized; you may change them according to your needs. inference.py is used for inferring segmentation results. It can be run directly, and it is also called by evaluate.py. evaluate.py is used for evaluation: it saves all segmentation results as images and calculates IOU. I'm sorry that the outputs are not well formatted, so you may need to look into the code to see what they mean. As with train.py, most parameters of inference.py and evaluate.py are set in the main function.
As for your question: you need a training list file and a testing list file, in which each line has an image file path and a ground truth file path. Set the corresponding parameters (train_file_path, val_file_path, data_dir, label_dir) and any others you need, then you can run train.py to start training.
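As a concrete sketch of generating such a list file (the directory layout, file names, and extensions below are assumptions, not the repo's defaults):

```python
import os

# Assumed layout: JPEG images in data_dir, PNG ground truths in label_dir,
# matched by base filename. Adjust these paths to your own dataset.
data_dir = "VOC2012/JPEGImages"
label_dir = "VOC2012/SegmentationClass"

def write_list_file(names, out_path):
    # Each line holds an image file path and a ground truth file path,
    # separated by a space, as described above.
    with open(out_path, "w") as f:
        for name in names:
            img_path = os.path.join(data_dir, name + ".jpg")
            gt_path = os.path.join(label_dir, name + ".png")
            f.write(img_path + " " + gt_path + "\n")

write_list_file(["2007_000032", "2007_000039"], "train.txt")
```

train_file_path would then point at train.txt, and a second call with held-out names would produce the val_file_path list.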
@aurora95 I think he wanted to know how to make his own label_datasets, especially the class_mode from the labels.
Thanks @SCP-173-cool , but what do you mean by class_mode?
@aurora95 There are 21 classes in VOC2012; what should I do if I make a new dataset that includes only 3 classes?
@SCP-173-cool I think your dataset already has some ground truth images, where pixel values represent the class of each pixel. You can change nb_classes from 21 to 3 and change the file paths, and then you should be able to train on the new dataset. Note that I set the default ignore label to 255; if it is different on your dataset you should set it explicitly.
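For example, if a 3-class ground truth uses pixel values other than 0, 1, 2 (the value-to-class mapping below is a made-up assumption about such a dataset), it could be remapped to sparse labels before training, leaving unmapped pixels at the default ignore label 255:

```python
import numpy as np

# Hypothetical mapping from raw pixel values to class indices; note that a
# raw value of 255 must be remapped if it is a real class, because 255 is
# the default ignore label in this repo.
value_to_class = {0: 0, 128: 1, 255: 2}

def remap_labels(gt):
    out = np.full(gt.shape, 255, dtype=np.uint8)  # start as "ignore"
    for value, cls in value_to_class.items():
        out[gt == value] = cls
    return out

gt = np.array([[0, 128], [255, 7]], dtype=np.uint8)
remapped = remap_labels(gt)  # the stray value 7 stays at ignore label 255
```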
@aurora95 thanks a lot !
@SCP-173-cool Ah sorry, I just realized that I didn't pass nb_classes to the model-defining functions... I'll fix it. Until then, you need to change the output dimension of the last layer manually.
Thank you so much! I will try it~
@aurora95 My label data are binary images, where pixel values are 0 and 1. How can I make reasonable labels to fit the model?
@SCP-173-cool Do you mean your labels are in one-hot encoding? If so, you can either change them into sparse style like 0, 1, 2, 3... or change the loss function and accuracy function.
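A minimal sketch of that conversion, assuming one-hot masks of shape (H, W, nb_classes):

```python
import numpy as np

# Collapse a one-hot mask into the sparse (H, W) label style by taking
# the argmax over the class axis.
one_hot = np.zeros((2, 2, 3), dtype=np.uint8)
one_hot[0, 0, 0] = 1  # pixel (0, 0) belongs to class 0
one_hot[0, 1, 2] = 1  # pixel (0, 1) belongs to class 2
one_hot[1, 0, 1] = 1  # pixel (1, 0) belongs to class 1
one_hot[1, 1, 0] = 1  # pixel (1, 1) belongs to class 0

sparse = np.argmax(one_hot, axis=-1)  # shape (2, 2), values in 0..2
```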
For two classes you should probably be using sigmoid, which just gives a score between 0 and 1. As for the specific error... I'm not sure I've trained on two classes before, try printing the shape at different steps with 21 and 2 classes and see when they first differ, that will probably identify where your problem is.
You have something that works & something that doesn't so use debugging skills to figure out when they split from both working to only one working. :-)
@ahundt Many thanks! I followed precisely your advice, deployed my debugging skills, and now it works! Sorry for taking your time!
@IvanEz great! I'm glad that worked. :-)
Hi @SCP-173-cool, did you succeed in your experiments? Hi @aurora95 and @ahundt, did you try this code for binary segmentation? I have the same problem as @SCP-173-cool: a small dataset with 30 gray-scale images and their corresponding ground truths, which are binary images (0 = background, 255 = foreground). I have set ignore_label to None, and the images are converted to RGB. I changed some parameters (e.g. the number of classes and some paths); the training process seems to be working well, but the inference doesn't work and I don't know why. The output image is always black. After data augmentation, I have 580 images for training.
Please see:
My train.py: https://gist.github.com/andrewssobral/6ec0f48916894f9bdbe6f82dbe998424
My inference.py: https://gist.github.com/andrewssobral/0448f87258125d10459cf5b6f4bdd996
This is the output of my tensorboard: https://ibb.co/grPsuk
This is a sample of the training image: https://ibb.co/k3ULfQ and its corresponding label: https://ibb.co/nk2vEk
Thanks, Andrews
Probably not enough data. Also, if one class such as background dominates, class weighting may help escape the trap, assuming there is enough data.
@andrewssobral Have you checked the output values in digits? It might be that the output image consists of 0s and 1s instead of 0s and 255s, and thus you don't visually see the difference.
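A quick way to check this (a sketch, assuming the predicted mask holds class indices):

```python
import numpy as np

# A two-class prediction with values 0 and 1 looks almost entirely black
# in an image viewer; scaling by 255 makes the foreground visible.
pred = np.array([[0, 1], [1, 0]], dtype=np.uint8)
visible = pred * 255  # 0 stays black, 1 becomes white
```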
@ahundt
"if one class such as background dominates class weighting may help escape the trap, assuming there is enough data" - that's very true, but is the issue with class weight support for 3+ dimensional targets resolved? If so, could you please say what should be modified in the code to assign class_weight a value other than None? Thanks
@IvanEz I haven't used class weight myself yet so I can't comment on that accurately.
First of all, thank you for all the info provided in this issue. I have one question regarding training with another dataset. The BilinearUpsampling2D function expects a fixed input size, right? I would like to know if it is possible to train with variable input sizes out of the box, but I think it will require some modification.
On the other side, I have another quick question, even though I don't know if this is the right place to ask. Is this project currently being developed, or has it been discontinued?
It works fairly well and I merge any good pull requests people submit, plus I've been adapting bits and pieces into keras-contrib as I can. That said, to my knowledge nobody is making big changes to the code base at this time.
@ahundt thanks for the prompt reply. I was asking mainly because I will most likely be contributing to the project :)
Regarding my question about BilinearUpsampling2D: it expects a specific upsampling factor or a target size, right? So out of the box it is not possible to train with variable sizes.
I'm currently learning about backwards convolution and planning to use this code for a project with variable input sizes, so if I can contribute in any manner, I will.
If I recall correctly, I believe it does train on variable sizes. The upsampling is just there because some of the networks downsample by a factor of 2 or 4, so the upsampling gets you back to the original size.
I am trying to create the network with input_shape=(None, None, 3) because of the variable input size, with the following code:
model = FCN_Vgg16_32s(input_shape=(None, None, 3), classes=2)
and when instantiating the BilinearUpsampling2D layer I get the following error:
X.set_shape((None, original_shape[1] * height_factor, original_shape[2] * width_factor, None))
TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'
No matter whether I use the size from the input or the target_size argument, there is a point at which tf.image.resize_bilinear requires a size parameter, which is a 1D tensor of int32. Correct me if I'm wrong, maybe I'm not using it properly, but I'm not sure it's possible to use a variable input size without modifying the code.
On the other hand, if I crop my dataset to a fixed size and specify that size, I can build the network.
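The failure is consistent with how the upsampling target is computed from static shapes. A minimal sketch of the same arithmetic (not the repo's actual code; the factor and example shape are assumptions):

```python
# Sketch of why input_shape=(None, None, 3) fails: the target size is the
# static spatial dimension times the upsampling factor, and None * int is
# exactly the TypeError shown in the traceback above.
def upsampled_size(input_shape, factor=32):
    height, width = input_shape[0], input_shape[1]
    if height is None or width is None:
        raise ValueError("static spatial dims required; "
                         "use a fixed input_shape such as (320, 320, 3)")
    return (height * factor, width * factor)
```

With a fixed shape such as (320, 320, 3) the multiplication succeeds, which matches the observation that cropping to a fixed size works.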
How to label the images?
@dokutagero Yes, you need to specify the input shape; input_shape=(None, None, 3) will not work with the BilinearUpsampling2D layer.
@ahundt @aurora95
Can the models in this repo be trained successfully without class weighting in the case of an unbalanced class distribution (as I understand it, for example, the person class is overrepresented in Pascal VOC)?
@dokutagero What's the purpose of target_size? What if I just want to use the original size of my data? I also saw there is image_size in evaluate.py; what does it do? It seems not to be the same as the target_size in train.py.
@simonsayshi you may take a look here https://github.com/aurora95/Keras-FCN/issues/71
@mrgloom Hi, I'm not sure what you want me to see, but I just set target_size to the size of my input image. @aurora95 @ahundt @andrewssobral BTW, what should I do if I want to do binary classification exactly? Is it enough to just change the loss function to binary_crossentropy_with_logits?
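Changing the loss alone is probably not enough: for the sigmoid setup suggested earlier in this thread, the last layer would also need a single output channel. Numerically, a binary_crossentropy_with_logits-style loss computes the following per pixel (a pure-numpy sketch, not the repo's implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def binary_ce_with_logits(logits, targets):
    # Per-pixel binary cross-entropy on raw logits: the sigmoid turns the
    # logit into a foreground probability, which the log-loss then scores.
    p = sigmoid(logits)
    return -(targets * np.log(p) + (1 - targets) * np.log(1 - p))

logits = np.array([2.0, -2.0])   # confident foreground, confident background
targets = np.array([1.0, 0.0])   # both predictions are correct
loss = binary_ce_with_logits(logits, targets).mean()  # small positive value
```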
I am new to Keras and GitHub, and I do not know how to run this code to segment images. I have a training set and labels consisting of many JPG images. How can I use them? Thank you very much.