TexasInstruments / edgeai-modelmaker

This repository has been moved. The new location is https://github.com/TexasInstruments/edgeai-tensorlab

EdgeAI-ModelMaker

Notice

If you have not visited the following landing pages, please do so before attempting to use this repository.


Release Notes


Introduction

EdgeAI-ModelMaker is an end-to-end model development tool that covers dataset handling, model training and compilation. It does not currently have an integrated data annotation feature, but it can accept an annotated dataset from a tool such as Label Studio.

We have published several repositories for model training, model compilation and modelzoo, as explained on our edgeai GitHub page. This repository is an attempt to stitch several of them together into a simple and consistent interface for model development. It does not support all the models that can be trained and compiled using our tools, only a subset. It is a command-line tool and requires a Linux PC.

The following are the key operations supported by this tool:

The supported functionalities are fully integrated, and the user can control them by setting parameters in the config file.

Task Types

Model Types

For Object Detection, we use YOLOX models. For Image Classification, we support MobileNetV2 and RegNetX. For Semantic Segmentation, we support DeepLabV3Plus, FPN and UNet models. For Keypoint Detection, we use the YOLO-pose method.

SoCs supported

These are devices with Analytics Accelerators (DSP and Matrix Multiplier Accelerator) along with ARM cores.

These are the supported non-accelerated devices (model inference runs on ARM cores).

The details of these SoCs are here

Setup Instructions

Step 1: OS & Environment

This repository can be used from native Ubuntu bash terminal directly or from within a docker environment.

Step 1, Option 1: With native Ubuntu environment and pyenv (recommended)

We have tested this tool on Ubuntu 22.04 with Python 3.10 (Note: from the 9.0 release onwards, edgeai-tidl-tools supports only Python 3.10).

In this option, we describe using this repository with the pyenv environment manager.

Step 1.1a: Make sure that you are using the bash shell. If you are not, switch to bash. Verify it by typing:

echo ${SHELL}

Step 1.2a: Install system dependencies

sudo apt update
sudo apt install build-essential curl libbz2-dev libffi-dev liblzma-dev libncursesw5-dev libreadline-dev libsqlite3-dev libssl-dev libxml2-dev libxmlsec1-dev llvm make tk-dev xz-utils wget
sudo apt install -y libffi-dev libjpeg-dev zlib1g-dev graphviz graphviz-dev protobuf-compiler

Step 1.3a: Install pyenv using the following commands

curl -L https://github.com/pyenv/pyenv-installer/raw/master/bin/pyenv-installer | bash

echo '# pyenv settings ' >> ${HOME}/.bashrc
echo 'command -v pyenv >/dev/null || export PATH="${HOME}/.pyenv/bin:$PATH"' >> ${HOME}/.bashrc
echo 'eval "$(pyenv init -)"' >> ${HOME}/.bashrc
echo 'eval "$(pyenv virtualenv-init -)"' >> ${HOME}/.bashrc
echo '' >> ${HOME}/.bashrc

exec ${SHELL}

Further details on pyenv installation are given here https://github.com/pyenv/pyenv and https://github.com/pyenv/pyenv-installer

Step 1.4a: Install Python 3.10 in pyenv and create an environment

pyenv install 3.10
pyenv virtualenv 3.10 py310
pyenv rehash
pyenv activate py310
pip install --upgrade pip setuptools

Step 1.5a: Activate the Python environment. This activation step needs to be done every time you start a new terminal or shell. (Alternatively, the command can be added to .bashrc, so that this becomes the default pyenv environment.)

pyenv activate py310
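Because the tool was tested with Python 3.10, it can help to confirm the interpreter version from inside the activated environment. The following is a minimal sketch only; the helper name is_tested_python is hypothetical and not part of this repository.

```python
import sys

# Version this README states the tool was tested with.
TESTED_PYTHON = (3, 10)


def is_tested_python(version_info=sys.version_info):
    """Return True if the major.minor version matches the tested version."""
    return tuple(version_info[:2]) == TESTED_PYTHON


print("Python", sys.version.split()[0], "| tested version:", is_tested_python())
```

If this prints False, the py310 environment is probably not the active one in the current shell.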

Step 1, Option 2: With docker environment

Step 1.1b: Install docker if you don't have it already. The following steps are for installation on Ubuntu 18.04

./docker/docker_setup.sh

Step 1.2b: Build docker image:

./docker/docker_build.sh

Step 1.3b: Run docker container to bring up the container terminal on docker:

./docker/docker_run.sh

Source .bashrc to update the PATH

source /opt/.bashrc

Step 1.4b: During docker run, we map the parent directory of this folder to /opt/code. This makes it easy to share code and data between the host and the docker container. Inside the docker terminal, change directory to where this folder is mapped:

cd /opt/code/edgeai-modelmaker

Step 2: Setup the model training and compilation repositories

This tool depends on several repositories that we have published at https://github.com/TexasInstruments

The following setup script can take care of cloning the required repositories and running their setup scripts.

./setup_all.sh

If the script runs successfully, you should have this directory structure:

parent_directory
    |
    |--edgeai-modelzoo
    |--edgeai-torchvision
    |--edgeai-mmdetection
    |--edgeai-benchmark
    |--edgeai-modelmaker
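One way to confirm that the setup produced this layout is to check for the sibling folders programmatically. This is a sketch under the assumption that the folder names match the listing above exactly; the function name missing_repos is hypothetical.

```python
from pathlib import Path

# Repository names taken from the directory structure listed above.
EXPECTED_REPOS = [
    "edgeai-modelzoo",
    "edgeai-torchvision",
    "edgeai-mmdetection",
    "edgeai-benchmark",
    "edgeai-modelmaker",
]


def missing_repos(parent_dir):
    """Return the expected repository folders that are absent under parent_dir."""
    parent = Path(parent_dir)
    return [name for name in EXPECTED_REPOS if not (parent / name).is_dir()]
```

Calling missing_repos("..") from inside the edgeai-modelmaker folder should return an empty list if every repository was cloned.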

Your python environment will have several model compilation python packages from edgeai-tidl-tools installed. See it by running:

pip list | grep 'onnxruntime\|tflite\|tvm\|dlr\|osrt'

Also, PyTorch and its related packages will be installed (This torchvision package is installed from our fork called edgeai-torchvision). See it by running:

pip list | grep 'torch\|torchvision'
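The same check can be done from Python, which is convenient inside a script. The sketch below mirrors the two grep patterns above using the standard importlib.metadata module; the function names are hypothetical.

```python
import re
from importlib import metadata

# Same name patterns as the two grep commands above.
PATTERN = re.compile(r"onnxruntime|tflite|tvm|dlr|osrt|torch")


def filter_packages(names, pattern=PATTERN):
    """Return the sorted package names that match the pattern."""
    return sorted(n for n in names if n and pattern.search(n))


def installed_matches():
    """List installed distributions whose names match the pattern."""
    names = (dist.metadata["Name"] for dist in metadata.distributions())
    return filter_packages(names)
```

An empty result from installed_matches() suggests that the setup script did not complete in the active environment.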

Enabling optional components

In setup_all.sh, there are flags to enable additional models:

PLUGINS_ENABLE_EXTRA: Setting this to 1 during setup enables additional models.

Step 3: Run the ready-made examples

./run_modelmaker.sh <target_device> <config_file>

Examples:

Object detection example

./run_modelmaker.sh TDA4VM config_detection.yaml

Image classification example

./run_modelmaker.sh TDA4VM config_classification.yaml

Here, TDA4VM is an example of a supported target_device.

Target devices supported

The list of target devices supported depends on the tidl-tools installed by edgeai-benchmark. Currently TDA4VM, AM68A, AM62A and AM69A are supported.
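When launching runs from a script, it may be useful to reject an unsupported device name before invoking the tool. The following sketch hard-codes the device list from this section, which is an assumption that will go stale if the supported list changes; build_command and run are hypothetical helper names.

```python
import subprocess

# Devices listed in this README as currently supported.
SUPPORTED_DEVICES = ("TDA4VM", "AM68A", "AM62A", "AM69A")


def build_command(target_device, config_file):
    """Return the run_modelmaker.sh argument list, rejecting unknown devices."""
    if target_device not in SUPPORTED_DEVICES:
        raise ValueError(f"unsupported target device: {target_device}")
    return ["./run_modelmaker.sh", target_device, config_file]


def run(target_device, config_file):
    """Run the script from the repository root; raises if it exits non-zero."""
    subprocess.run(build_command(target_device, config_file), check=True)
```

For example, run("TDA4VM", "config_detection.yaml") corresponds to the object detection example above.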

Step 4: Prepare your own dataset with your own images and object types (Data annotation)

Step 4.1: Install LabelStudio

Step 4.2: Run LabelStudio

Step 4.3: How to use Label Studio for data annotation

Step 5: Dataset format

Object Detection dataset format

An object detection dataset should have the following structure.

data/datasets/dataset_name
                             |
                             |--images
                             |     |-- the image files should be here
                             |
                             |--annotations
                                   |--instances.json
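A quick sanity check of this layout can catch problems before training starts. The sketch below assumes that instances.json follows the standard COCO layout, with top-level images, annotations and categories lists and a file_name field per image; the function name check_detection_dataset is hypothetical.

```python
import json
from pathlib import Path


def check_detection_dataset(dataset_dir):
    """Return a list of problems found in the dataset folder (empty if none)."""
    root = Path(dataset_dir)
    images_dir = root / "images"
    ann_file = root / "annotations" / "instances.json"
    problems = []
    if not images_dir.is_dir():
        problems.append("missing images folder")
    if not ann_file.is_file():
        problems.append("missing annotations/instances.json")
        return problems
    coco = json.loads(ann_file.read_text())
    # Top-level keys of the standard COCO layout (an assumption about this file).
    for key in ("images", "annotations", "categories"):
        if key not in coco:
            problems.append(f"instances.json has no '{key}' list")
    # Every image entry should point at a file in the images folder.
    for image in coco.get("images", []):
        if not (images_dir / image["file_name"]).is_file():
            problems.append(f"image file not found: {image['file_name']}")
    return problems
```

For example, check_detection_dataset("data/datasets/dataset_name") returns an empty list when the folder matches the structure above.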

Image Classification dataset format

An image classification dataset should have the following structure. (Use a suitable dataset name instead of dataset_name).

data/datasets/dataset_name
                             |
                             |--images
                             |     |-- the image files should be here
                             |
                             |--annotations
                                   |--instances.json

Semantic Segmentation dataset format

A semantic segmentation dataset should have the following structure.

data/datasets/dataset_name
                             |
                             |--images
                             |     |-- the image files should be here
                             |
                             |--annotations
                                   |--instances.json

Notes

If the dataset has already been split into train and validation sets, it is possible to provide those paths separately as a tuple in input_data_path.

After model compilation, the compiled models will be available in a folder inside ./data/projects.

If you have a dataset in another format, use the script provided to convert it into the COCO JSON format. See run_convert_dataset.sh for example conversions.

The config file can be in either .yaml or .json format.
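To illustrate the last note, a script that inspects a config file before a run could pick the parser by file extension. This is a sketch only and not how the tool itself loads configs; load_config is a hypothetical name, and the YAML branch assumes that the third-party PyYAML package is installed.

```python
import json
from pathlib import Path


def load_config(path):
    """Load a config file as a dict, choosing the parser by file extension."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".json":
        return json.loads(path.read_text())
    if suffix == ".yaml":
        import yaml  # PyYAML, a third-party package; imported only when needed
        return yaml.safe_load(path.read_text())
    raise ValueError(f"unsupported config extension: {suffix}")
```

Any other extension raises ValueError, so a mistyped file name fails early.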

Step 6: Accelerated Training using GPUs (Optional)

Note: This section is for advanced users only. Familiarity with NVIDIA GPU and CUDA driver installation is assumed.

This tool can train models either on CPU or on GPUs. By default, CPU based training is used.

It is possible to speed up model training significantly using GPUs (with CUDA support), if your PC has them. The PyTorch version that we install by default does not support CUDA GPUs, so additional steps are required to enable GPU support in training.

Option 1: When using Native Ubuntu Environment

The user has to install an appropriate NVIDIA GPU driver that supports the GPU being used.

The user also has to install the CUDA Toolkit. See the CUDA download instructions. The installed CUDA version must match the CUDA version used in the PyTorch installer; see our edgeai-torchvision setup script to find out which CUDA version is used.

Option 2: When using docker environment

Enabling CUDA GPU support inside a docker environment requires several additional steps. Please follow the instructions given in: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html

Once CUDA is installed, you will be able to train models much faster.
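To confirm that PyTorch can actually see the GPU, a short check such as the following may help. It is a sketch that relies only on torch.cuda.is_available() and torch.version.cuda; the function name cuda_status is hypothetical.

```python
def cuda_status():
    """Return (cuda_available, cuda_version_of_pytorch_build) or (False, None)."""
    try:
        import torch  # installed by the setup script; may be absent elsewhere
    except ImportError:
        return False, None
    # torch.version.cuda is None for CPU-only builds of PyTorch.
    return torch.cuda.is_available(), torch.version.cuda


available, built_for = cuda_status()
print("CUDA available:", available, "| PyTorch built for CUDA:", built_for)
```

If the second value is None, the installed PyTorch is a CPU-only build, and installing the driver and toolkit alone will not enable GPU training.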

Step 7: Model deployment

The compiled model has all the side information required to run the model on our Edge AI StarterKit EVM and SDK.