heaversm / deeplab-training

Training your own Deeplab Model in Tensorflow

Training Deeplab on Your Own Dataset

TLDR: This tutorial covers how to set up Deeplab within Tensorflow to train your own machine learning model, with a focus on separating humans from the background of a photograph in order to perform background replacement.

If you'd rather watch this on YouTube, see the deeplab training tutorial here, and the OpenCV visualization / background swapping tutorial here.

There are 3 parts to the tutorial. Feel free to skip to the section that is most relevant to you.


Installation Process

Create a Python3 Environment with Pip

/Users/[username]/.anaconda/navigator/a.tool ; exit;

Clone the Deeplab Models Github Repo

Clone the official tensorflow models repo

You will only need the models/research/deeplab and models/research/slim directories. You can delete everything else.

Merge the files from the tutorial repo into the tensorflow models repo

Clone or download this repo, and put everything into the directory you just created for the tensorflow models repo, but don't overwrite anything except the input_preprocess.py file in the /deeplab/ directory, which has a small change.

For example put models/research/eval-pqr.sh into the tensorflow models/research directory.

Install Tensorflow

pip3 install --upgrade pip #need version 19 or higher
pip3 install tensorflow==1.15 #I had issues with tensorflow 2 on a mac

If you have a CUDA-compatible GPU, you can use tensorflow-gpu instead of tensorflow.

pip3 install Pillow #use this for a mac. Other systems or versions of python might use PIL
pip3 install tqdm numpy

More help on installing tensorflow here.

Make sure to follow the steps in the link to ensure that you can run model_test.py:

python3 deeplab/model_test.py

Pay special attention to this step:

# From tensorflow/models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim

This command has to be run each time you activate your python environment or open a new terminal window.

And also, make sure, especially if you are running multiple python environments, that you always use python3 and pip3 for every command you run (instead of python and pip). This will save you lots of headaches.
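To confirm that the PYTHONPATH export above took effect in the current terminal session, a small check along these lines may help. This is only a sketch: the helper name missing_pythonpath_entries is hypothetical and not part of the repo, and it assumes you run it from the tensorflow/models/research directory.

```python
import os


def missing_pythonpath_entries(research_dir, pythonpath):
    """Return the required directories that are absent from a PYTHONPATH string.

    Hypothetical helper: deeplab needs both the research directory and its
    slim subdirectory on PYTHONPATH, as in the export command above.
    """
    required = [
        os.path.abspath(research_dir),
        os.path.abspath(os.path.join(research_dir, "slim")),
    ]
    present = {os.path.abspath(p) for p in pythonpath.split(os.pathsep) if p}
    return [d for d in required if d not in present]


# Assumes the current directory is tensorflow/models/research
missing = missing_pythonpath_entries(os.getcwd(), os.environ.get("PYTHONPATH", ""))
if missing:
    print("Missing from PYTHONPATH:", missing)
else:
    print("PYTHONPATH looks OK")
```

If anything is reported as missing, re-run the export command in that terminal before running model_test.py.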

Image Preparation Process

Notes

export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim

for each terminal session

Dependencies

Python V3.7.4

Tensorflow 1.15

Making a dataset

You will need a consistent background image, and a large set of transparent (or masked) foreground images with photos of people. You'll want to composite each foreground image on to the background.

Make sure the background image is representative of the background image you will be using for real time photo replacement.

Make sure the foreground images represent the diversity of photos you will likely expect in a live scenario. For best results, consider things like:

Scraping Images

The utilities/scrapeImages.py file is useful for downloading images from Google.

NOTE: this search does not limit search results to freely licensed files - it was only used for my internal testing, and you should be careful not to utilize any scraped images from any website without ensuring that you are adhering to their licensing and use guidelines.

You should first edit the scrapeImages.py file to use your desired query string. Look for:

url="https://www.google.co.in/search?q="+query+"&source=lnms&tbm=isch&tbs=isz:m,itp:photo,ic:trans,ift:png"

The tbs= param in this case does the following:

To use your own parameters, do an advanced google search for the type of images you want, and take a look at the query string in the URL bar of your browser for what tbs parameters it generates for you, and replace them here.
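As an illustration of how the query string above is put together, here is a minimal sketch that rebuilds the same URL from its parts. The function name build_image_search_url is hypothetical and not part of scrapeImages.py; unlike the original string concatenation, it also escapes the search term.

```python
from urllib.parse import urlencode


def build_image_search_url(query, tbs="isz:m,itp:photo,ic:trans,ift:png"):
    """Rebuild the search URL from the tutorial (hypothetical helper)."""
    params = {"q": query, "source": "lnms", "tbm": "isch", "tbs": tbs}
    # Keep ':' and ',' unescaped so the tbs value reads the same as in the browser.
    return "https://www.google.co.in/search?" + urlencode(params, safe=":,")


print(build_image_search_url("people running"))
```

To try other filters, pass the tbs value you copied from your browser's URL bar as the second argument.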

You then run the scraper as follows:

python3 utilities/scrapeImages.py --search "[your_search_term]" --num_images 100 --directory "/[Path]/[to]/[your]/[image]/[folder]"

Change [your_search_term] to your query, and the value of the --directory flag to the folder where you want to save the images.

Creating Segmentation Images

You will need to create a new set of images that merges each transparent foreground image onto the consistent background.

You will also need to create a new set of images where the background is black, and the transparent foreground image matches the color you are trying to segment, in this case "Person" which is color rgb(192,128,128).

Both sets of images, the "regular" and "segmentation" images should have the same size, and match each other exactly in terms of the position and scale of the foreground subjects in relation to the background. See this example:

Regular Image

athletes-001

Segmentation Image

athletes-001

The photoshop actions section below has a set of useful actions for accomplishing this properly in photoshop.
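If you would rather not use photoshop, the same pair of images could be produced with Pillow. The following is a sketch under assumptions, not the method used in this repo: the function name make_training_pair, the offset parameter and the alpha_threshold value of 128 are all illustrative choices.

```python
from PIL import Image

PERSON_COLOR = (192, 128, 128)  # "Person" segmentation color from the text


def make_training_pair(foreground, background, offset=(0, 0), alpha_threshold=128):
    """Return (regular, segmentation) images, both sized like the background.

    Hypothetical helper; foreground must have transparency (RGBA).
    """
    fg = foreground.convert("RGBA")

    # "Regular" image: paste the foreground using its own alpha channel as the mask.
    regular = background.convert("RGB").copy()
    regular.paste(fg, offset, mask=fg)

    # Binary mask: sufficiently opaque foreground pixels become 255, the rest 0,
    # so the segmentation color is exact with no blended edge pixels.
    alpha = fg.getchannel("A").point(lambda a: 255 if a >= alpha_threshold else 0)

    # "Segmentation" image: black background, person color where the mask is set.
    segmentation = Image.new("RGB", regular.size, (0, 0, 0))
    person = Image.new("RGB", fg.size, PERSON_COLOR)
    segmentation.paste(person, offset, mask=alpha)
    return regular, segmentation
```

Because both images are built from the same foreground and offset, they line up exactly. Save the segmentation image as a .png, since lossy JPEG compression would alter the exact color values.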

You should make a directory within the models/research/deeplab/datasets directory. Call it whatever you like (in this case, we used PQR).

Within that folder, make another folder called JPEGImages and place all the "regular" images in it.

Photoshop Actions

If you know how to use photoshop actions, this repo contains a set of actions that will help convert and merge your photos. Go to window > actions in photoshop and choose load actions and load the glowbox.atn file.

To run these actions in batch, go to File > Automate > Batch in photoshop, and select the desired action and the folder location of your foreground images. Destination should be None, as the action contains a save command itself.

To edit any of the actions, you'll want to select the step of the action from the actions panel and double click it, modifying the desired parameters.

This action:

Make sure to edit the action to specify the location of your background image, the canvas size matching your background image's size, and the desired export quality. Also, make sure the export location does not overwrite your transparent foreground images, you'll need those to create your segmentation masks.

This action:

Make sure to edit the action to specify the desired color segmentation for your images, if you are not trying to identify people in your photos. You can see the deeplab (resnet) color segmentation scheme here.

However, if you already have images, this action just ensures that the color for the segmentation mask is exact, forcing a pink-ish color to the exact pixel values. Photoshop, for example, does some adjustment of colors on a normal save to match your screen's color profile. You can prevent having to run this action at all if saving from photoshop by ensuring that the convert to sRGB option in the save for web dialog is unchecked.

You end up with 3 layers - one with the untouched photo, one with the segmentation mask, and one with your regular image masked off to show how well the background was removed and the subjects were isolated.

Convert your RGB segmentation images to indexed colors

In order to reduce the amount of processing deeplab has to do on each image, we will convert each RGB color found in the segmentation images you made (e.g. RGB(192,128,128)) to an indexed color value (e.g. 1). This will make processing a lot faster.

This repo includes a file in the deeplab/datasets/ directory called convert_rgb_to_index.py which will help you accomplish that.

Before running, make sure to edit the following:

# palette (color map) describes the (R, G, B): Label pair
palette = {
    (0, 0, 0): 0,        # background
    (192, 128, 128): 1,  # person
}

If you are not processing people, the palette should contain all of the segmentation colors you are trying to detect. In our case, since we are just looking for people, the palette contains black for the background as index 0, and pink for the foreground as index 1.

label_dir: this is the path (relative to the datasets directory where this file is contained) where your Segmentation Class images were saved. Make sure to change it if your file locations differ.

new_label_dir: this is the path where your newly generated images will be saved. You do not need to make this directory, it will be generated for you.

To run the script, from the datasets directory, run: python3 convert_rgb_to_index.py. You will need to make sure all of this file's dependencies are installed via pip.

Once it runs, you should have a new folder SegmentationClassRaw (or whatever you called the new_label_dir folder). It should contain a list of .png images. They will all look black. This is normal. We converted the RGB values into single index values, so a standard image viewer won't understand this format.
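To make the conversion less mysterious, here is a conceptual sketch of mapping RGB colors to index values with numpy. It is not the code in convert_rgb_to_index.py; the function name rgb_to_index is hypothetical, and treating any color missing from the palette as background (0) is an assumption.

```python
import numpy as np

palette = {
    (0, 0, 0): 0,        # background
    (192, 128, 128): 1,  # person
}


def rgb_to_index(rgb, palette):
    """Map an (H, W, 3) RGB array to an (H, W) uint8 array of label indices.

    Hypothetical helper; pixels whose color is not in the palette stay 0.
    """
    rgb = np.asarray(rgb)
    labels = np.zeros(rgb.shape[:2], dtype=np.uint8)
    for color, index in palette.items():
        # True wherever all three channels match this palette color exactly.
        matches = np.all(rgb == np.array(color, dtype=rgb.dtype), axis=-1)
        labels[matches] = index
    return labels


segmentation = np.zeros((2, 2, 3), dtype=np.uint8)
segmentation[0, 1] = (192, 128, 128)
print(rgb_to_index(segmentation, palette))
```

With only two classes, the largest pixel value in the result is 1 out of a possible 255, which is why the converted images look black in an image viewer.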

Make a list of all your training and test images

Make another folder at the same level as JPEGImages called SegmentationClass (see the folder structure section below for a better sense of the entire folder structure you will be adding to deeplab). This folder will contain all your segmentation images.

Deciding how to divide up your train and validation set is up to you. Ideally you have at least 500 training images, and at least 100 test images. A good starting split might be a 10:1 ratio of training to test images.
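One possible way to generate the list files is sketched below. It assumes the lists contain one image name per line without the file extension (the build script appends .jpg and .png itself), and that the files are named train.txt, val.txt and trainval.txt. The function name write_split_lists, the val_fraction default and the fixed seed are illustrative choices, not part of the repo.

```python
import os
import random


def write_split_lists(image_dir, list_dir, val_fraction=0.1, seed=0):
    """Write train.txt, val.txt and trainval.txt; return the size of each split.

    Hypothetical helper: names are the .jpg file names without extension.
    """
    names = sorted(
        os.path.splitext(f)[0]
        for f in os.listdir(image_dir)
        if f.lower().endswith(".jpg")
    )
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(names)
    n_val = max(1, int(round(len(names) * val_fraction))) if names else 0
    splits = {
        "train": names[n_val:],
        "val": names[:n_val],
        "trainval": names,
    }
    os.makedirs(list_dir, exist_ok=True)
    for split, items in splits.items():
        with open(os.path.join(list_dir, split + ".txt"), "w") as f:
            f.write("".join(name + "\n" for name in sorted(items)))
    return {split: len(items) for split, items in splits.items()}
```

For example, calling write_split_lists("PQR/JPEGImages", "PQR/ImageSets") from the datasets directory would hold out roughly one image in ten for validation.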

Generate the tfrecord folder

Tensorflow has a tfrecord format that makes storing training data much more efficient. We will need to generate this folder for our dataset. To do so, this repo has made a copy of the build_voc2012_data.py file which has been saved as a new file, (in our case build_pqr_data.py).

Edit the build_pqr_data.py file, and make sure there is a flag for each of our model's desired folders. In this case, look at around line 80:

tf.app.flags.DEFINE_string(
    'image_folder',
    './PQR/JPEGImages',
    'Folder containing images.')

tf.app.flags.DEFINE_string(
    'semantic_segmentation_folder',
    './PQR/SegmentationClassRaw',
    'Folder containing semantic segmentation annotations.')

tf.app.flags.DEFINE_string(
    'list_folder',
    './PQR/ImageSets',
    'Folder containing lists for training and validation')

tf.app.flags.DEFINE_string(
    'output_dir',
    './PQR/tfrecord',
    'Path to save converted SSTable of TensorFlow examples.')

Make sure to change any of those directories to match where your files are located. Note that the tfrecord folder must already exist; the script will not create it for you. Also note that at around line 119 I have hardcoded the input format to be .jpg:

image_filename = os.path.join(
#MH:
#FLAGS.image_folder, filenames[i] + '.' + FLAGS.image_format)
FLAGS.image_folder, filenames[i] + '.jpg')
#END MH

and the output images to be .png

#MH:
      #filenames[i] + '.' + FLAGS.label_format)
      filenames[i] + '.png')
      #END MH

due to an issue I had with the script utilizing the label_format flag. You should change those extensions to match the extensions of your own images if they differ.

Now you can run the file (from the datasets directory):

python3 build_pqr_data.py

Once this is done, you will have a tfrecord directory filled with .tfrecord files.

Add the information about your dataset segmentation (TODO: check to make sure we still need this step...)

You'll need to provide tensorflow the list of how your dataset was divided up into training and test images.

In deprecated/segmentation_dataset.py , look for the following (~Line 114):

# MH
_PQR_INFORMATION = DatasetDescriptor(
splits_to_sizes={
  'train': 487,
  'val': 101,
  'trainval': 588,
},
num_classes=2,
ignore_label=255,
)

_DATASETS_INFORMATION = {
'cityscapes': _CITYSCAPES_INFORMATION,
'pascal_voc_seg': _PASCAL_VOC_SEG_INFORMATION,
'ade20k': _ADE20K_INFORMATION,
'pqr': _PQR_INFORMATION,
}
# END MH

These split sizes should match the number of files in the training and test sets you made earlier. For example, if train.txt has 487 lines, train is 487. The same goes for val and trainval. If you are trying to segment more than just the background and foreground, num_classes should match the number of classes you are targeting, including the background. ignore_label=255 just means you are ignoring anything in the segmentation that is white (used in some segmentations to create a clear space division between multiple segmentations).

Note that _DATASETS_INFORMATION also contains a reference to this new dataset descriptor we've added:

'pqr': _PQR_INFORMATION
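Rather than counting lines by hand, the values for splits_to_sizes could be read from the list files. This is a small sketch; the function name count_split_sizes is hypothetical, and it assumes the list files are named train.txt, val.txt and trainval.txt with one image name per line.

```python
import os


def count_split_sizes(list_dir, splits=("train", "val", "trainval")):
    """Count the non-empty lines in each split's list file (hypothetical helper)."""
    sizes = {}
    for split in splits:
        with open(os.path.join(list_dir, split + ".txt")) as f:
            sizes[split] = sum(1 for line in f if line.strip())
    return sizes
```

Calling count_split_sizes("PQR/ImageSets") from the datasets directory returns a dictionary whose values can be copied into the DatasetDescriptor.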

You're finally ready to train!

Training Process

Folder Structure

Make sure your folder structure from /datasets looks similar to this, if you followed all of the naming conventions in the above steps:

+ PQR
  + exp //contains exported files
    + train_on_trainval_set
      + eval //contains results of training evaluation
      + init_models //contains the deeplab pascal training set, which you need to download
      + train //contains training ckpt files
      + vis
        + segmentation_results //contains the generated segmentation masks
  + ImageSets
    train.txt
    trainval.txt
    val.txt
  + logs
  + tfrecord //holds your converted dataset
build_pqr_data.py //creates your tfrecord files
convert_rgb_to_index.py //turns rgb images into their segmentation indices

../../train-pqr.sh //holds the training script
../../eval-pqr.sh //holds the eval script
../../vis-pqr.sh //holds the visualization script

Download the Pascal Training Set

In order to make our training much faster we'll want to use a pre-trained model, in this case one trained on PASCAL VOC 2012. You can download it here. Extract it into the PQR/exp/train_on_trainval_set/init_models directory (the extracted folder should be named deeplabv3_pascal_train_aug).

Edit your training script

First, edit your train-pqr.sh script (in the models/research directory):

# Set up the working environment.
CURRENT_DIR=$(pwd)
WORK_DIR="${CURRENT_DIR}/deeplab"
DATASET_DIR="datasets"

# Set up the working directories.
PQR_FOLDER="PQR"
EXP_FOLDER="exp/train_on_trainval_set"
INIT_FOLDER="${WORK_DIR}/${DATASET_DIR}/${PQR_FOLDER}/${EXP_FOLDER}/init_models"
TRAIN_LOGDIR="${WORK_DIR}/${DATASET_DIR}/${PQR_FOLDER}/${EXP_FOLDER}/train"
DATASET="${WORK_DIR}/${DATASET_DIR}/${PQR_FOLDER}/tfrecord"

mkdir -p "${WORK_DIR}/${DATASET_DIR}/${PQR_FOLDER}/exp"
mkdir -p "${TRAIN_LOGDIR}"

NUM_ITERATIONS=9000
python3 "${WORK_DIR}"/train.py \
--logtostderr \
--train_split="train" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size=1000,667 \
--train_batch_size=4 \
--training_number_of_steps="${NUM_ITERATIONS}" \
--fine_tune_batch_norm=true \
--tf_initial_checkpoint="${INIT_FOLDER}/deeplabv3_pascal_train_aug/model.ckpt" \
--train_logdir="${TRAIN_LOGDIR}" \
--dataset_dir="${DATASET}"

Things you may want to change:

Start training:

You are finally ready to start training!

From the models/research directory, run sh train-pqr.sh

If you've set everything up properly, your machine should start training! This will take.a.long.time. You should be seeing something like this in your terminal:

training

Evaluation

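Running eval-pqr.sh from the same directory will calculate the mean intersection over union (mIoU) score for your model. Essentially, this tells you how well the prediction of your model overlaps the actual mask: for each class, the number of pixels the two masks have in common is divided by the number of pixels found in either mask, and the results are averaged across classes: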

iou

In my case, I got a score of ~.87 - which means that, averaged across classes, the area shared by my prediction mask and my target mask was about 87% of their combined area. The higher the number here, the better the mask.
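For intuition, the score can be reproduced on a pair of index masks with numpy. This is a simplified sketch rather than the metric code eval-pqr.sh runs: the function name mean_iou is hypothetical, and it scores a single pair of masks, whereas the evaluation script reports one score for the whole validation set.

```python
import numpy as np


def mean_iou(target, prediction, num_classes=2, ignore_label=255):
    """Mean intersection over union between two (H, W) index masks.

    Hypothetical helper; pixels labeled ignore_label in the target are skipped.
    """
    target = np.asarray(target)
    prediction = np.asarray(prediction)
    valid = target != ignore_label
    ious = []
    for c in range(num_classes):
        t = (target == c) & valid
        p = (prediction == c) & valid
        union = np.logical_or(t, p).sum()
        if union == 0:
            continue  # class is absent from both masks, so it is not scored
        ious.append(np.logical_and(t, p).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")


target = [[0, 0], [1, 1]]
prediction = [[0, 1], [1, 1]]
# background: 1 shared pixel / 2 in either mask; person: 2 shared / 3 in either
print(mean_iou(target, prediction))
```

In this small example one background pixel is mislabeled as person, which lowers the score for both classes.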

Visualization

To visualize the actual output of your masks, run vis-pqr.sh from the models/research directory. These will output to your visualization directory you specified (in our case, models/research/deeplab/datasets/PQR/exp/train_on_trainval_set/vis/segmentation_results). You will see two separate images for each visualization: the "regular" image, and the "prediction" (or segmentation mask).

If you want to combine these two images, the merge_segmentation photoshop action can help.

I've also set this up as an automated process in openCV to take an image and its segmentation mask and automatically substitute in a background of your choosing.

Using OpenCV for background replacement

Install OpenCV

Follow these directions to install opencv on mac - but use version 4.1.2 instead of 4.0:

wget -O opencv.zip https://github.com/opencv/opencv/archive/4.1.2.zip
wget -O opencv_contrib.zip https://github.com/opencv/opencv_contrib/archive/4.1.2.zip

Give your virtual environment a name of cv, then workon cv.

Rename /usr/local/lib/python3.7/site-packages/cv2/python-3.7/cv2.cpython-37m-darwin.so to cv2.so

then cd /Users/[your_username]/.virtualenvs/cv/lib/python3.7/site-packages

then ln -s /usr/local/lib/python3.7/site-packages/cv2/python-3.7/cv2.so cv2.so

The cv Python virtual environment is entirely independent and sequestered from the default Python version on your system. Any Python packages in the global directories will not be available to the cv virtual environment. Similarly, any Python packages installed in site-packages of cv will not be available to the global install of Python.

Directory Structure

Navigate to the cv directory. You should have the following directory structure:

+input 
+output
+masks
+bg
replacebg_dd.py

Note: all files in the input and masks directories should have the same names to ensure they match up together when running the script

Using the replacebg.py script:

Before calling the script, check the following lines within the script:

input_dir = 'input/'
output_dir = 'output/'
mask_dir = 'masks/'
bg_dir = 'bg/'
bg_file = 'track.jpg'

These directories should match your directories relative to the replacebg.py script.

initial_threshold_val = 150 : Changing this value will change the black / white value above which the foreground is kept rather than the background.

Script Options

The python script is responsible for handling what pixels to keep from the source vs which to throw away, and can do some basic thresholding and blurring of the mask image to attempt to improve results.
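The core idea of keeping or discarding pixels based on a thresholded mask can be sketched with numpy alone. This is not the replacebg.py code, which uses OpenCV and can also blur the mask; the function name replace_background is hypothetical, and the default of 150 simply mirrors initial_threshold_val.

```python
import numpy as np


def replace_background(image, mask, background, threshold=150):
    """Keep image pixels where mask > threshold, else use the background.

    Hypothetical helper. image and background are (H, W, 3) arrays of the
    same size; mask is an (H, W) grayscale array.
    """
    image = np.asarray(image)
    background = np.asarray(background)
    mask = np.asarray(mask)
    keep = mask > threshold  # True where the foreground should be kept
    # Add a trailing axis so the (H, W) mask broadcasts over the color channels.
    return np.where(keep[..., None], image, background)
```

A hard threshold like this gives jagged edges around the subject, which is why blurring the mask first can improve results.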

There are a few parameters you can pass the replacebg.py script:

Keyboard commands

When you run the script and it is displaying an image, you can use the following keyboard commands:


NOTE:

This tutorial and repo were created through my difficulties installing and training deeplab, in the hopes that it would make things easier for others trying to do the same. Very little of the code is my own, and has been assembled from a variety of sources - all of which were extremely helpful, but none of which I was able to follow on their own in order to successfully train Deeplab. By combining various pieces of the following links, I was able to create a process that worked smoothly for me.

Links:

Analytics Vidhya - Semantic Segmentation: Introduction to the Deep Learning Technique Behind Google Pixel’s Camera!, Saurabh Pal

Installing Tensorflow - Official Documentation

Installing Deeplab - Official Documentation

Tensorflow-Deeplab-Resnet - Dr. Sleep

Free Code Camp - How to use DeepLab in TensorFlow for object segmentation using Deep Learning, Beeren Sahu

Dataset Utils - Gene Kogan - useful in scraping images for a dataset and creating randomly sized, scaled, and flipped images in order to increase the training set size.