ESIPFed / gsoc

Project ideas and mentor guidance for ESIP members to participate in Google Summer of Code.

Ag-Net: building a customized deep neural network for recognizing crop categories based on spectral characteristics #13

Closed: ZihengSun closed 5 years ago

ZihengSun commented 5 years ago

ESIP Member Organization

CSISS/LAITS, George Mason University; Alaska Ocean Observing System (AOOS) and Axiom Data Science

Mentors

Ziheng Sun, Jesse Lopez

Project Ideas

Ag-Net: building a customized deep neural network for recognizing crop categories based on spectral characteristics

Information for students

See ESIP general guidelines

Abstract

How many kinds of crops can you recognize? It is hard to name many: for most of the growing season they are all green plants. Dent corn and sweet corn, black beans and red beans, barley and wheat, grass and weeds; telling them apart takes a great deal of knowledge and experience. Agricultural scientists have struggled for years to find an automated way to recognize them. Deep learning is a powerful tool for non-linear classification problems, and its critical ingredient is the training dataset, which can be extracted from the reports and map products of the U.S. Department of Agriculture. However, existing deep neural networks do not perform as well as expected on crop classification, because the representation features they learn through back propagation are not discriminative enough to capture the small differences among crops with a similar external appearance. A customized network with special filters may help tell apart those minor differences in spectral characteristics and produce more accurate recognition results.

Technical Details

Python; Keras; Geoweaver; numpy; scikit-learn; matplotlib; GDAL.

Helpful Experience

Machine learning knowledge; satellite image manipulation; python programming.

First steps

Start by getting familiar with DeepLabV3, U-Net, or any other state-of-the-art deep neural network, and test it on a sample training dataset.
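For a concrete starting point, here is a minimal sketch (assuming TensorFlow 2.x Keras; the 128x128 patch size, 7 bands, and 255 classes follow the sample dataset discussed later in this thread) of a stripped-down U-Net-style model run on random stand-in data, just to verify the pipeline executes:

    # Minimal U-Net-style sketch; random data, just to verify the pipeline runs.
    # Assumes TensorFlow 2.x Keras. Shapes follow the sample dataset in this
    # thread: 128x128 patches, 7 spectral bands, 255 crop classes.
    import numpy as np
    from tensorflow.keras import layers, models

    SIZE, NUM_BANDS, NUM_CLASSES = 128, 7, 255

    inputs = layers.Input((SIZE, SIZE, NUM_BANDS))
    c1 = layers.Conv2D(32, 3, activation='relu', padding='same')(inputs)
    p1 = layers.MaxPooling2D()(c1)                         # down to 64x64
    c2 = layers.Conv2D(64, 3, activation='relu', padding='same')(p1)
    u1 = layers.UpSampling2D()(c2)                         # back to 128x128
    m1 = layers.concatenate([u1, c1])                      # skip connection
    outputs = layers.Conv2D(NUM_CLASSES, 1, activation='softmax')(m1)

    model = models.Model(inputs, outputs)
    model.compile('adam', 'sparse_categorical_crossentropy', metrics=['accuracy'])

    # Random stand-ins: 7-band patches and one integer class label per pixel.
    x = np.random.rand(8, SIZE, SIZE, NUM_BANDS).astype('float32')
    y = np.random.randint(0, NUM_CLASSES, (8, SIZE, SIZE, 1))
    model.fit(x, y, epochs=1, batch_size=4)

A real first experiment would swap the random arrays for dataset patches and the toy model for a full U-Net or DeepLabV3.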

Juhi-Purswani commented 5 years ago

Sir, I want to contribute to this project. Please help me get started. Thanks.

ZihengSun commented 5 years ago

@Juhi-Purswani Sure. Could you please send me an email (zsun@gmu.edu) so we can start the discussion? Thank you!

hdsingh commented 5 years ago

@ZihengSun Sir, I am also very interested in working on this project. I have sent you an email. Can you please check and guide me further? Thank you!

ZihengSun commented 5 years ago

@hdsingh Welcome! Thanks for your response! I have sent some tutorials for you to get started.

rrishabh145 commented 5 years ago

Hi @ZihengSun, I am very interested in working on the project and making a contribution to it. I have sent you an email stating my interest in the project. Please check the mail and guide me on the project. Thank you!

AnirudhDagar commented 5 years ago

Hi! I'm really interested in this project; it is very much in line with one of my previous projects. Correct me if I'm wrong: the underlying task is to build a robust semantic image segmentation model specific to agricultural land classification.

I'll get started with the literature below and write summary blogs to keep track of the major contributions and key points of each paper.

Rethinking Atrous Convolution for Semantic Image Segmentation
U-Net: Convolutional Networks for Biomedical Image Segmentation

I've also sent you an email with a few doubts and specific questions; perhaps we can have a detailed discussion on Slack or through email.

ZihengSun commented 5 years ago

Thank you very much for your responses!

We are still waiting for Google's official announcement on Feb 26. Before then, make sure you read the ESIP participation guidelines. If you have any additional questions, please let @abburgess or me know.

We have a general tutorial for exercising deep learning in agriculture on FigShare. Hope it can help you get started.

swaroop-nath commented 5 years ago

Hello Sir @ZihengSun, I am really interested in contributing to this project. I have previously worked on projects assessing the health status of plants, which required a fair bit of knowledge of image processing, classification, and various standard Python libraries. I have sent you an email to get some context and understand how to get started. In addition, I will also work on my side to understand the technologies, including reading the latest papers on deep learning. Thank you.

navyasingh002 commented 5 years ago

I am really interested in this project, and I have already done similar work before. Could you please guide me on how I can contribute?

esip-lab commented 5 years ago

Hi all - a quick reminder that we should hold off on discussing projects in depth until February 26th; that is the date mentor organizations are officially announced. It's okay to say 'hi' though!

AB

Annie Burgess, PhD

Lab Director | Earth Science Information Partners (ESIP)


aniruddhgoteti commented 5 years ago

Congrats ESIP!

As it is 26th February, I guess it is okay to discuss ideas now!

Hello Ziheng, I hope you are doing well. I have been following this organisation and thinking about ideas for implementing this NN. I have emailed them to you. Please let me know what you think of them.

Thanks and Regards, Aniruddh

Harshit4199 commented 5 years ago

Hello, I want to contribute to this project and help solve its issues. I have emailed @ZihengSun and await further comments. Thank you!

1998at commented 5 years ago

Hello @ZihengSun, I am interested in contributing to the project. I would like to ask you about the dataset we will be working on.

ZihengSun commented 5 years ago

I created a sample dataset for you to start with: https://github.com/ZihengSun/Ag-Net-Dataset
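As a hedged sketch, reading one patch pair might look like this with GDAL (which is in this project's tool list); the file names are hypothetical placeholders, so check the repository for the actual layout:

    # Hedged sketch of reading one image/label patch pair with GDAL. The file
    # names are hypothetical placeholders, not the dataset's actual layout.
    import numpy as np
    from osgeo import gdal

    img_ds = gdal.Open('patch_0001_bands.tif')  # multi-band surface reflectance
    lbl_ds = gdal.Open('patch_0001_cdl.tif')    # CDL crop-class labels

    # ReadAsArray returns (bands, rows, cols) for multi-band rasters;
    # move the band axis last for Keras-style (rows, cols, bands).
    x = np.moveaxis(img_ds.ReadAsArray(), 0, -1).astype('float32')
    y = lbl_ds.ReadAsArray()                    # (rows, cols) integer class IDs
    print(x.shape, y.shape, np.unique(y)[:10])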

AnirudhDagar commented 5 years ago

Thanks for this @ZihengSun, it will be really helpful :)

thePairedElectron commented 5 years ago

Hello @ZihengSun, I'm Aditya, a final-year engineering student, and I would like to contribute to this project. Please guide me further.

PariyaPm commented 5 years ago

Hi @ZihengSun, I implemented U-Net and another network called SegNet on a land-change prediction problem (pixel-wise semantic segmentation); my paper on that is in final revisions. I think I'd be able to help on this project. Please let me know if you have patches of labeled data. If I can help on this project, I would be glad to.

sinAshish commented 5 years ago

Hi @ZihengSun, I am interested in contributing to the project. I have previously worked on a similar problem on Kaggle, and I also have a fair share of experience with segmentation modeling. Please let me know what to do next!

ZihengSun commented 5 years ago

Hi @PariyaPm,

Thank you for the response; that is great progress! U-Net and SegNet are both promising networks for land cover classification. I have created a sample training dataset, and I believe it is trainable with both nets (I tried it on SegNet a little bit). Let me know if you run into any problems using them.

Best, Ziheng

ZihengSun commented 5 years ago

Hi @thePairedElectron Please take a look at the sample dataset and general tutorial above to help you get started.

sinAshish commented 5 years ago

UNet with attention gives better results than SegNet.

ZihengSun commented 5 years ago

@sinAshish It might be true in some cases. However, we need experimental results before drawing a general conclusion.

sinAshish commented 5 years ago

OK, I'll post the experimental results here after my model finishes training!

rishi-s8 commented 5 years ago

I would like to contribute to this project. I have some experience with deep neural networks, U-Net, autoencoders, Keras, PyTorch, and TensorFlow, and with training models. Can you help me get started?

sankalpmittal1911-BitSian commented 5 years ago

Respected Sir, I have sent you an email describing my previous experience in this field and asking for help getting started. I have also mentioned that I have read about the U-Net architecture and image segmentation, and I have seen the dataset you posted here.

My email is f20150242@hyderabad.bits-pilani.ac.in.

Can you help me figure out what to do next?

Thank you.

sarat-svl commented 5 years ago

Hello sir @ZihengSun, I am currently pursuing my PG at IIIT-Hyderabad and I am really interested in working on this project. Can you provide more information about the project and resources so that I can become more familiar with it?

Mail: sarat.sristi@students.iiit.ac.in

ZihengSun commented 5 years ago

Hi, thank you for the responses! Please start by using a network you are familiar with, such as U-Net, an autoencoder, SegNet, YOLO, or R-CNN, and test-run it on the sample dataset above. From the results you will be able to see the drawbacks and think about how to customize the network to improve the accuracy. Write your thoughts into an application. If you run into any difficulties, please let me know.

sankalpmittal1911-BitSian commented 5 years ago

Sir, can I submit my work to you in a few days (3-4)?

1998at commented 5 years ago

@ZihengSun You mentioned having experimented with SegNet. Did you also try experimenting with the number of channels you were passing? Instead of passing all 7 channels, maybe passing each channel individually, or grouping 2-3 channels together, and then taking a weighted average of the predictions for the final segmentation mask. I think that should make the model converge faster.

sinAshish commented 5 years ago

@ZihengSun Is it OK to use the dataset you provided on other platforms like Kaggle or Colab for their GPU services?

sinAshish commented 5 years ago

A very basic EDA of the sample dataset. Fellow participants may find it useful to see the various channels with the mask.

ZihengSun commented 5 years ago

> @ZihengSun You mentioned having experimented with SegNet. Did you also try experimenting with the number of channels you were passing? Instead of passing all 7 channels, maybe passing each channel individually, or grouping 2-3 channels together, and then taking a weighted average of the predictions for the final segmentation mask. I think that should make the model converge faster.

No, I didn't try that. It could be interesting. The visible bands are very limited for distinguishing crops in the growing season; taking full advantage of all 7 bands could make it easier for the neural network to recognize them.
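For anyone who wants to try the grouping idea, a rough Keras sketch (assuming TensorFlow 2.x; the visible/infrared band split and the branch depth are arbitrary illustration choices, and a plain average stands in for the proposed weighted average):

    # Sketch of the band-grouping idea: one branch per band group, with the
    # branch predictions averaged into a single segmentation output.
    from tensorflow.keras import layers, models

    SIZE, NUM_CLASSES = 128, 255
    inputs = layers.Input((SIZE, SIZE, 7))

    def branch(x):
        x = layers.Conv2D(32, 3, activation='relu', padding='same')(x)
        return layers.Conv2D(NUM_CLASSES, 1, activation='softmax')(x)

    vis = layers.Lambda(lambda t: t[..., :3])(inputs)   # e.g. visible bands
    ir = layers.Lambda(lambda t: t[..., 3:])(inputs)    # e.g. infrared bands
    preds = layers.Average()([branch(vis), branch(ir)])

    model = models.Model(inputs, preds)
    model.compile('adam', 'sparse_categorical_crossentropy')

A learned weighting of the two branches could replace `layers.Average()` if the weighted-average variant is wanted.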

ZihengSun commented 5 years ago

> A very basic EDA of the sample dataset. Fellow participants may find it useful to see the various channels with the mask.

Thank you for sharing this!

sankalpmittal1911-BitSian commented 5 years ago

Sir, do you mean we may consider all channels at once, average them, and pass the newly created dataset of images as training data? I ask because #input images = 7 * #output images. I think this may benefit the network. I will experiment with this and tell you the result in 3 days. Thank you.

1998at commented 5 years ago

@ZihengSun I think averaging all channels to combine them would result in the loss of a lot of information. I think creating a model that starts directly with 7 channels is the better solution. I am working on that and will let you know the results soon.

sankalpmittal1911-BitSian commented 5 years ago

@ZihengSun Sir, I implemented the U-Net on the dataset you gave, on Google Colab. Thankfully, the GPU was able to handle the whole dataset being read in at once. I also started with all 7 channels directly (taking full advantage of the pixel values).

The code: https://colab.research.google.com/drive/1WflXioKXcg1SCi_rd2KR60Yq1i6ZO-xc#scrollTo=yNre1WZTmm4L

If you or anyone else cannot access it, please let me know.

I have implemented the U-net from this paper: http://arxiv.org/abs/1505.04597

Problems I am facing:

1. Learning is way too slow (even setting aside accuracy, which I don't think is meaningful if this is treated as a regression problem where we generate an image).

2. The loss is way too high.

If anyone can look at the code and give suggestions on how to improve my model and minimize the loss, it would be very helpful. I am using a batch size of 256 and a minimum learning rate of 0.00001. Also, if anyone can train this dataset on another network like DeepLabv3 and report how it fares, that would be very beneficial to my learning.

Okay, so I am posting the losses (apologies if the post is too big):

    Train on 9478 samples, validate on 1673 samples
    Epoch 1/100
    9478/9478 [==============================] - 79s 8ms/step - loss: 0.6259 - acc: 0.0000e+00 - val_loss: 0.6914 - val_acc: 0.0000e+00
    Epoch 00001: val_loss improved from inf to 0.69144, saving model to model.h5
    Epoch 2/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.6042 - acc: 0.0000e+00 - val_loss: 0.6110 - val_acc: 0.0000e+00
    Epoch 00002: val_loss improved from 0.69144 to 0.61097, saving model to model.h5
    Epoch 3/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.6033 - acc: 0.0000e+00 - val_loss: 0.6176 - val_acc: 0.0000e+00
    Epoch 00003: val_loss did not improve from 0.61097
    Epoch 4/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5981 - acc: 0.0000e+00 - val_loss: 0.6384 - val_acc: 0.0000e+00
    Epoch 00004: val_loss did not improve from 0.61097
    Epoch 5/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5947 - acc: 0.0000e+00 - val_loss: 0.6272 - val_acc: 0.0000e+00
    Epoch 00005: ReduceLROnPlateau reducing learning rate to 0.00010000000474974513.
    Epoch 00005: val_loss did not improve from 0.61097
    Epoch 6/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5901 - acc: 0.0000e+00 - val_loss: 0.6142 - val_acc: 0.0000e+00
    Epoch 00006: val_loss did not improve from 0.61097
    Epoch 7/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5887 - acc: 0.0000e+00 - val_loss: 0.6051 - val_acc: 0.0000e+00
    Epoch 00007: val_loss improved from 0.61097 to 0.60511, saving model to model.h5
    Epoch 8/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5881 - acc: 0.0000e+00 - val_loss: 0.5992 - val_acc: 0.0000e+00
    Epoch 00008: val_loss improved from 0.60511 to 0.59922, saving model to model.h5
    Epoch 9/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5879 - acc: 0.0000e+00 - val_loss: 0.6010 - val_acc: 0.0000e+00
    Epoch 00009: val_loss did not improve from 0.59922
    Epoch 10/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5879 - acc: 0.0000e+00 - val_loss: 0.5975 - val_acc: 0.0000e+00
    Epoch 00010: val_loss improved from 0.59922 to 0.59746, saving model to model.h5
    Epoch 11/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5872 - acc: 0.0000e+00 - val_loss: 0.5953 - val_acc: 0.0000e+00
    Epoch 00011: val_loss improved from 0.59746 to 0.59533, saving model to model.h5
    Epoch 12/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5872 - acc: 0.0000e+00 - val_loss: 0.5948 - val_acc: 0.0000e+00
    Epoch 00012: val_loss improved from 0.59533 to 0.59482, saving model to model.h5
    Epoch 13/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5867 - acc: 0.0000e+00 - val_loss: 0.5957 - val_acc: 0.0000e+00
    Epoch 00013: val_loss did not improve from 0.59482
    Epoch 14/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5860 - acc: 0.0000e+00 - val_loss: 0.5964 - val_acc: 0.0000e+00
    Epoch 00014: val_loss did not improve from 0.59482
    Epoch 15/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5864 - acc: 0.0000e+00 - val_loss: 0.5919 - val_acc: 0.0000e+00
    Epoch 00015: val_loss improved from 0.59482 to 0.59195, saving model to model.h5
    Epoch 16/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5857 - acc: 0.0000e+00 - val_loss: 0.5920 - val_acc: 0.0000e+00
    Epoch 00016: val_loss did not improve from 0.59195
    Epoch 17/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5862 - acc: 0.0000e+00 - val_loss: 0.5984 - val_acc: 0.0000e+00
    Epoch 00017: val_loss did not improve from 0.59195
    Epoch 18/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5851 - acc: 0.0000e+00 - val_loss: 0.5921 - val_acc: 0.0000e+00
    Epoch 00018: ReduceLROnPlateau reducing learning rate to 1.0000000474974514e-05.
    Epoch 00018: val_loss did not improve from 0.59195
    Epoch 19/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5850 - acc: 0.0000e+00 - val_loss: 0.5915 - val_acc: 0.0000e+00
    Epoch 00019: val_loss improved from 0.59195 to 0.59152, saving model to model.h5
    Epoch 20/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5847 - acc: 0.0000e+00 - val_loss: 0.5904 - val_acc: 0.0000e+00
    Epoch 00020: val_loss improved from 0.59152 to 0.59040, saving model to model.h5
    Epoch 21/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5846 - acc: 0.0000e+00 - val_loss: 0.5897 - val_acc: 0.0000e+00
    Epoch 00021: val_loss improved from 0.59040 to 0.58970, saving model to model.h5
    Epoch 22/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5845 - acc: 0.0000e+00 - val_loss: 0.5895 - val_acc: 0.0000e+00
    Epoch 00022: val_loss improved from 0.58970 to 0.58949, saving model to model.h5
    Epoch 23/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5846 - acc: 0.0000e+00 - val_loss: 0.5889 - val_acc: 0.0000e+00
    Epoch 00023: val_loss improved from 0.58949 to 0.58886, saving model to model.h5
    Epoch 24/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5845 - acc: 0.0000e+00 - val_loss: 0.5879 - val_acc: 0.0000e+00
    Epoch 00024: val_loss improved from 0.58886 to 0.58790, saving model to model.h5
    Epoch 25/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5846 - acc: 0.0000e+00 - val_loss: 0.5885 - val_acc: 0.0000e+00
    Epoch 00025: val_loss did not improve from 0.58790
    Epoch 26/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5845 - acc: 0.0000e+00 - val_loss: 0.5892 - val_acc: 0.0000e+00
    Epoch 00026: val_loss did not improve from 0.58790
    Epoch 27/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5844 - acc: 0.0000e+00 - val_loss: 0.5894 - val_acc: 0.0000e+00
    Epoch 00027: ReduceLROnPlateau reducing learning rate to 1e-05.
    Epoch 00027: val_loss did not improve from 0.58790
    Epoch 28/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5843 - acc: 0.0000e+00 - val_loss: 0.5893 - val_acc: 0.0000e+00
    Epoch 00028: val_loss did not improve from 0.58790
    Epoch 29/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5843 - acc: 0.0000e+00 - val_loss: 0.5891 - val_acc: 0.0000e+00
    Epoch 00029: val_loss did not improve from 0.58790
    Epoch 30/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5844 - acc: 0.0000e+00 - val_loss: 0.5896 - val_acc: 0.0000e+00
    Epoch 00030: val_loss did not improve from 0.58790
    Epoch 31/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5842 - acc: 0.0000e+00 - val_loss: 0.5888 - val_acc: 0.0000e+00
    Epoch 00031: val_loss did not improve from 0.58790
    Epoch 32/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5841 - acc: 0.0000e+00 - val_loss: 0.5896 - val_acc: 0.0000e+00
    Epoch 00032: val_loss did not improve from 0.58790
    Epoch 33/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5840 - acc: 0.0000e+00 - val_loss: 0.5891 - val_acc: 0.0000e+00
    Epoch 00033: val_loss did not improve from 0.58790
    Epoch 34/100
    9478/9478 [==============================] - 64s 7ms/step - loss: 0.5842 - acc: 0.0000e+00 - val_loss: 0.5893 - val_acc: 0.0000e+00
    Epoch 00034: val_loss did not improve from 0.58790
    Epoch 00034: early stopping

shivamsaboo17 commented 5 years ago

Hi @sankalpmittal1911-BitSian, you might want to look into the loss function you are using for the task. From what I am guessing, you are treating this as a regression problem and using mean squared error (correct me if I am wrong), while in fact this is not a regression problem but a classification problem. You have to do classification (segmentation) at the pixel level: think of predicting probabilities for all classes at each pixel location and selecting the one with the maximum probability. For segmentation, cross-entropy loss is usually used, and you might want to try it out. Let me know if you have any doubts regarding the implementation.

Best, Shivam
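To make the per-pixel classification idea concrete, a tiny worked example for a single pixel (all the numbers here are made up):

    # One pixel's view of segmentation-as-classification: a probability per
    # class from the softmax, argmax for the prediction.
    import numpy as np

    probs = np.array([0.05, 0.10, 0.60, 0.20, 0.05])  # softmax output, 5 classes
    pred_class = int(np.argmax(probs))                # -> 2
    true_class = 2
    # Cross-entropy for this pixel: -log(predicted probability of true class).
    loss = -np.log(probs[true_class])                 # ~0.51
    print(pred_class, loss)

The segmentation loss is then this quantity averaged over every pixel in the batch.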

sankalpmittal1911-BitSian commented 5 years ago

Hi, @shivamsaboo17

In the end we are generating mask images, which are also images. The input is (128,128,1) and the output is also (128,128,1). Yes, I get that the problem in itself is classification, since we read the output CDL images as a table: https://github.com/ZihengSun/Ag-Net-Dataset/blob/master/cdlvalue.csv

But the output is still an image, right? I have not used MSE; I used binary cross-entropy loss, since my last activation function is a sigmoid on each pixel of the output image. You are able to access the code, right?

Thanks and Best! Sankalp Mittal

shivamsaboo17 commented 5 years ago

Hi @sankalpmittal1911-BitSian, thanks for replying. I understand your intuition of using a sigmoid on each pixel to create an image and then using BCELoss. However, the output is more accurately a segmentation map rather than an image. We use binary cross-entropy loss when we have only two classes to predict (say, background and foreground in an image). In this case there are more than two classes to choose from (hint: load all target images and build a dictionary of the unique values). Say we have to classify into n classes; then the output you predict should have shape (128, 128, n), and you select the most probable class (i.e., argmax) after applying softmax over these classes. So the actual values are not important (as they would be in an image): you can assign any color to any class and replace the predicted class with those values so you can distinguish them during visualization. See the example below for multi-class classification. You would realize that our aim is to classify different objects, and the intensity values used for visualization are not important.

Let me know if you have any doubts.

Best, Shivam
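A short numpy sketch of that argmax-plus-arbitrary-palette visualization (shapes follow this thread's dataset; the palette is random by design, since the colors carry no meaning):

    # Turning a (128, 128, n) probability map into a class map and then a
    # color image for display; the palette is arbitrary, as noted above.
    import numpy as np

    n = 255
    probs = np.random.rand(128, 128, n)          # stand-in for softmax output
    class_map = probs.argmax(axis=-1)            # (128, 128) integer classes

    palette = np.random.randint(0, 256, (n, 3), dtype=np.uint8)
    color_img = palette[class_map]               # (128, 128, 3) RGB for display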

sankalpmittal1911-BitSian commented 5 years ago

Thanks a lot for the help! I will replace the sigmoid with a softmax layer, use categorical cross-entropy as the loss function, and train the U-Net again. From the dataset, there are 255 classes.

Will post those results soon.

ZihengSun commented 5 years ago

> Sir, do you mean we may consider all channels at once, average them, and pass the newly created dataset of images as training data? I ask because #input images = 7 * #output images. I think this may benefit the network. I will experiment with this and tell you the result in 3 days. Thank you.

@sankalpmittal1911-BitSian Averaging is not a recommended practice. Since the band values are surface reflectance in different spectral regions, some bands, like band 5, have very large numbers while others are small. With averaging, the variation pattern in the small-number bands would be hard to learn.
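A common alternative to averaging, given those magnitude differences, is per-band standardization; a minimal numpy sketch (in practice the statistics should come from the training set only):

    # Per-band standardization so that bands with large reflectance values
    # (e.g. band 5) don't dominate the small-valued bands.
    import numpy as np

    def standardize_bands(x):
        """x: float array of shape (batch, height, width, bands)."""
        mean = x.mean(axis=(0, 1, 2), keepdims=True)  # one mean per band
        std = x.std(axis=(0, 1, 2), keepdims=True)    # one std per band
        return (x - mean) / (std + 1e-8)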

Judging from the training log, which keeps reporting zero validation accuracy and stops improving after only 34 epochs, there seems to be something wrong in the output layer and loss function. Multi-class cross-entropy is the common choice for image segmentation with more than two classes. Make sure the output layer has a shape like (batch, width, height, classnum).
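A quick way to check both conditions, assuming a compiled Keras model named `model` that takes 128x128, 7-band patches:

    # Sanity checks for the output head: shape (batch, width, height, classnum)
    # and per-pixel class probabilities summing to 1 after the softmax.
    # Assumes a compiled Keras model `model` with 128x128, 7-band inputs.
    import numpy as np

    NUM_CLASSES = 255
    assert model.output_shape == (None, 128, 128, NUM_CLASSES)

    probs = model.predict(np.random.rand(1, 128, 128, 7).astype('float32'))
    assert np.allclose(probs.sum(axis=-1), 1.0, atol=1e-4)  # softmax applied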

sankalpmittal1911-BitSian commented 5 years ago

@ZihengSun Sir,

So from what I gather, every pixel in the output layer will be an array of length 255 (i.e., the number of classes), like [0.5, 0.4, ...]. In Keras this becomes (batch, 128, 128, 255). We then apply softmax to every pixel over the number of classes, that is, along axis = 2. Then we take the maximum of each pixel along the class axis, so the final image is again (batch, 128, 128, 1), which is then compared with the output .tif file. Please correct me if I am wrong. Thank you. Also, every output label image is the segmentation of only one of the 255 crops, right?

1998at commented 5 years ago

@ZihengSun I implemented a toy U-Net model just to test whether it works. Here are the results. I didn't use a validation set (I understand it's bad practice, but I just wanted to get started), and no image augmentations were used. Here is the log after training for 10 epochs:

    Epoch 0 ===> trn_loss=1.4780788052082061 trn_acc=0.5332283782958984
    Epoch 1 ===> trn_loss=1.4878936338424682 trn_acc=0.5299695205688476
    Epoch 2 ===> trn_loss=1.4736434078216554 trn_acc=0.5325695419311524
    Epoch 3 ===> trn_loss=1.466643956899643 trn_acc=0.5359280395507813
    Epoch 4 ===> trn_loss=1.4540286827087403 trn_acc=0.5391661834716797
    Epoch 5 ===> trn_loss=1.4555853509902954 trn_acc=0.5378150558471679
    Epoch 6 ===> trn_loss=1.4375684547424317 trn_acc=0.5455305099487304
    Epoch 7 ===> trn_loss=1.4241751718521118 trn_acc=0.5499917984008789
    Epoch 8 ===> trn_loss=1.427647979259491 trn_acc=0.5488113784790039
    Epoch 9 ===> trn_loss=1.4385879373550414 trn_acc=0.5430210113525391

I used categorical cross-entropy as the loss function. For accuracy, I took the final predictions from the model (128x128 in this case), compared them with the base labels, and summed the number of times it predicted correctly: acc = some_num, fin_acc = acc / ((128*128) * batch_size * train_loader_length). I will upload the file soon, with a validation set and some data augmentations tried out, and after training for a little longer. Also, is there any baseline accuracy that we can compare our models to?
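The accuracy formula above, written out in numpy for reference (`preds` and `labels` are illustrative names for accumulated arrays of predicted and true class IDs):

    # Pixel accuracy as described above: correct pixels over total pixels.
    # `preds` and `labels` are (batch, 128, 128) integer class-ID arrays
    # accumulated over the training loader.
    import numpy as np

    def pixel_accuracy(preds, labels):
        return (preds == labels).sum() / labels.size  # 128*128*batch*n_batches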

rishi-s8 commented 5 years ago

@ZihengSun I modified the U-Net by adding batch normalization, used sparse cross-entropy as the loss after applying a softmax as the last layer, and normalized the pixel values of the band images. Considering the labels are between 1-247, I plotted the loss graph (attached). As you can see, the loss values aren't very low after 50 epochs. Any suggestions?

sinAshish commented 5 years ago

@ZihengSun I trained a U-Net model with a ResNet encoder for 10 epochs and got a Dice score of around 0.48. I used all 7 channels at once. But I need some help loading the masks into the appropriate number of classes.
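For reference, one common way to compute a mean Dice score over integer class maps in numpy (conventions vary, e.g. whether classes absent from the ground truth are counted):

    # Mean Dice over the classes present in the ground truth; `pred` and
    # `true` are (H, W) integer class maps. One convention among several.
    import numpy as np

    def mean_dice(pred, true, eps=1e-8):
        scores = []
        for c in np.unique(true):
            p, t = (pred == c), (true == c)
            scores.append(2.0 * (p & t).sum() / (p.sum() + t.sum() + eps))
        return float(np.mean(scores))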

sankalpmittal1911-BitSian commented 5 years ago

Hi @sinAshish,

By Dice score, do you mean IoU or something similar? Did you use that as a custom metric instead of plain accuracy? I will update my model based on the suggestions here. I have one problem:

I have an output image of the form (batch, 128, 128, 255). How do I apply softmax in the Keras API along the axis such that it is applied over the number of classes? And how do we select the most probable class per pixel, since the maximum can belong to different classes at different pixels?

    u9 = Conv2DTranspose(n_filters*1, (3, 3), strides=(2, 2), padding='same')(c8)
    u9 = concatenate([u9, c1], axis=3)   # skip connection from the encoder
    u9 = Dropout(dropout)(u9)
    c9 = conv2d_block(u9, n_filters=n_filters*1, kernel_size=3, batchnorm=batchnorm)
    # c9 = conv2d_block(u9, n_filters=255, kernel_size=3, batchnorm=batchnorm)

    outputs = Conv2D(255, (1, 1))(c9)    # one output channel per class
    # outputs = core.Reshape((128, 128, 255))(outputs)
    # conv6 = core.Permute((2, 1))(outputs)

    outputs = core.Activation('softmax')(outputs)
    # outputs = softmax(outputs, axis=3)(outputs)

If I need to change the labeled output images to the form (128, 128, 255) as well, how can we modify our masks?

Thank you all
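On the softmax-axis and mask questions above: Keras's `Activation('softmax')` normalizes over the last axis, which is already the class axis of a (batch, 128, 128, 255) tensor, so no Permute is needed; and integer masks only require one-hot encoding if `categorical_crossentropy` is used, while `sparse_categorical_crossentropy` accepts them directly. A small sketch:

    # Keras's softmax activation normalizes over the last axis, which is the
    # class axis of a (batch, 128, 128, 255) output.
    # One-hot masks are only needed for categorical_crossentropy; integer
    # masks work directly with sparse_categorical_crossentropy.
    import numpy as np
    from tensorflow.keras.utils import to_categorical

    mask = np.random.randint(0, 255, (128, 128))    # integer class IDs
    onehot = to_categorical(mask, num_classes=255)  # shape (128, 128, 255)
    assert onehot.shape == (128, 128, 255)
    assert int(onehot[64, 64].argmax()) == int(mask[64, 64])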

ZihengSun commented 5 years ago

> @ZihengSun Sir,
>
> So from what I gather, every pixel in the output layer will be an array of length 255 (i.e., the number of classes), like [0.5, 0.4, ...]. In Keras this becomes (batch, 128, 128, 255). We then apply softmax to every pixel over the number of classes, that is, along axis = 2. Then we take the maximum of each pixel along the class axis, so the final image is again (batch, 128, 128, 1), which is then compared with the output .tif file. Please correct me if I am wrong. Thank you. Also, every output label image is the segmentation of only one of the 255 crops, right?

@sankalpmittal1911-BitSian That is correct. The final predicted class is the one with the highest probability.

ZihengSun commented 5 years ago

> @ZihengSun I implemented a toy U-Net model just to test whether it works. Here are the results. I didn't use a validation set (I understand it's bad practice, but I just wanted to get started), and no image augmentations were used. Here is the log after training for 10 epochs:
>
> Epoch 0 ===> trn_loss=1.4780788052082061 trn_acc=0.5332283782958984
> Epoch 1 ===> trn_loss=1.4878936338424682 trn_acc=0.5299695205688476
> Epoch 2 ===> trn_loss=1.4736434078216554 trn_acc=0.5325695419311524
> Epoch 3 ===> trn_loss=1.466643956899643 trn_acc=0.5359280395507813
> Epoch 4 ===> trn_loss=1.4540286827087403 trn_acc=0.5391661834716797
> Epoch 5 ===> trn_loss=1.4555853509902954 trn_acc=0.5378150558471679
> Epoch 6 ===> trn_loss=1.4375684547424317 trn_acc=0.5455305099487304
> Epoch 7 ===> trn_loss=1.4241751718521118 trn_acc=0.5499917984008789
> Epoch 8 ===> trn_loss=1.427647979259491 trn_acc=0.5488113784790039
> Epoch 9 ===> trn_loss=1.4385879373550414 trn_acc=0.5430210113525391
>
> I used categorical cross-entropy as the loss function. For accuracy, I took the final predictions from the model (128x128 in this case), compared them with the base labels, and summed the number of times it predicted correctly: acc = some_num, fin_acc = acc / ((128*128) * batch_size * train_loader_length). I will upload the file soon, with a validation set and some data augmentations tried out, and after training for a little longer. Also, is there any baseline accuracy that we can compare our models to?

@1998at I would say give it another 100 epochs and see if the training accuracy rises significantly. If not, there might be something in the output layer not matching the CDL training labels.

ZihengSun commented 5 years ago

> @ZihengSun I modified the U-Net by adding batch normalization, used sparse cross-entropy as the loss after applying a softmax as the last layer, and normalized the pixel values of the band images. Considering the labels are between 1-247, I plotted the loss graph (attached). As you can see, the loss values aren't very low after 50 epochs. Any suggestions?

@rishi-s8 The curves look reasonable. Since the loss is still on a steep downward slope, it should be expected to decrease further if training continues.

ZihengSun commented 5 years ago

> @ZihengSun I trained a U-Net model with a ResNet encoder for 10 epochs and got a Dice score of around 0.48. I used all 7 channels at once. But I need some help loading the masks into the appropriate number of classes.

Sorry, I am not very clear about the question. Do you want to shrink the number of classes by removing the empty ones?