AWS ML@Edge with NVIDIA Jetson Nano

In this project I will walk through how to create an ML@Edge video analytics application. This is an end-to-end process: data annotation, model building, training, and optimization, and then deployment of the model on an edge device, the NVIDIA Jetson Nano.

Step 1: Data Annotation using Amazon SageMaker GroundTruth

Step 2: Model building, training and optimization using SageMaker notebooks, containers and Neo

Step 3: Deploy model on Jetson Nano using AWS IoT Greengrass

Step 4: Visualize and analyze video analytics from the model inference on Jetson Nano

Let's start with Step 1:

Step 1: Data Annotation using Amazon SageMaker GroundTruth

In this lab we will use Amazon SageMaker Ground Truth to label images in a training dataset consisting of Lego dinosaur images. You will start with an unlabeled image training dataset, acquire labels for all the images using a SageMaker Ground Truth private workforce, and finally analyze the results of the labeling job.

High Level Steps:

  1. Upload training data into an S3 bucket.
  2. Create a private Ground Truth Labeling workforce.
  3. Create a Ground Truth Labeling job.
  4. Label images using the Ground Truth Labeling portal.
  5. Analyze results.

1. Upload training data into an S3 bucket.

In this step you will first create an Amazon S3 bucket where you will store the training data. You will then download the training data consisting of Lego dinosaur images and upload this dataset to the S3 bucket you created.

1.1 Create an S3 bucket.

In this step you will create an Amazon S3 bucket where you will store the training data.
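
The console is the easiest way to do this. If you prefer to script it, a minimal boto3 sketch is shown below; the bucket name and region are only examples (bucket names are global, so pick your own unique name), not values taken from this repo.

import boto3

s3 = boto3.client("s3")

# Bucket names must be globally unique; "dino-dataset" is just an example name.
# In regions other than us-east-1 a LocationConstraint is required
# (omit CreateBucketConfiguration entirely when creating the bucket in us-east-1).
s3.create_bucket(
    Bucket="dino-dataset",
    CreateBucketConfiguration={"LocationConstraint": "us-west-2"},
)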

1.2 Download the training data.

In this step you will download the training data to your local machine. I have created a Lego dinosaurs dataset; it has about 388 files across 6 classes: Brachiosaurus, Dilophosaurus, Spinosaurus, Stegosaurus, Triceratops, and Unknown.

1.3 Upload training data to the S3 bucket.

In this step you will upload the training data to the Amazon S3 bucket created in Step 1.1.
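
You can do the upload from the console or the AWS CLI. As a sketch, the same upload can be scripted with boto3; the local folder name is an assumption, while the bucket name and the lego_dinosaurs_dataset prefix follow the example values used later in this guide.

import os
import boto3

s3 = boto3.client("s3")
bucket_name = "dino-dataset"            # bucket created in Step 1.1 (example name)
local_dir = "lego_dinosaurs_dataset"    # local folder containing the downloaded images
prefix = "lego_dinosaurs_dataset"       # S3 prefix referenced by the labeling job

# Walk the local dataset folder and upload every file, preserving its relative path.
for root, _, files in os.walk(local_dir):
    for name in files:
        local_path = os.path.join(root, name)
        rel_path = os.path.relpath(local_path, local_dir)
        s3.upload_file(local_path, bucket_name, f"{prefix}/{rel_path}")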

1.4 Create a private Ground Truth Labeling Workforce.

In this step, you will create a “private workteam” and add only one user (you) to it.

To create a private team:

That's it! This is your private worker's interface. Once the Ground Truth labeling job is submitted in the next step, you will see the annotation job in this portal.

1.5 Create a private Ground Truth Labeling Job.

In this step, you will create a Ground Truth Labeling job and assign it to the private workforce.
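
The console wizard is the intended path here (choose the image classification task type and select your private workteam). For reference, an equivalent job can also be created with boto3. In the sketch below every ARN and S3 URI is a placeholder; in particular, the pre-annotation and consolidation Lambda ARNs are AWS-provided, region-specific values documented for the Ground Truth image classification task type.

import boto3

sm = boto3.client("sagemaker")

sm.create_labeling_job(
    LabelingJobName="dino-image-classification",
    # Attribute under which labels appear in the output manifest (see Step 1.7).
    LabelAttributeName="dino-image-classification",
    InputConfig={"DataSource": {"S3DataSource": {
        "ManifestS3Uri": "s3://dino-dataset/lego_dinosaurs_dataset/dataset-xxxxxx.manifest"}}},
    OutputConfig={"S3OutputPath": "s3://dino-dataset/"},
    RoleArn="arn:aws:iam::<account-id>:role/<ground-truth-execution-role>",          # placeholder
    LabelCategoryConfigS3Uri="s3://dino-dataset/class_labels.json",                  # placeholder
    HumanTaskConfig={
        "WorkteamArn": "arn:aws:sagemaker:<region>:<account-id>:workteam/private-crowd/<team>",  # placeholder
        "UiConfig": {"UiTemplateS3Uri": "s3://dino-dataset/instructions.template"},              # placeholder
        # AWS-provided, region-specific Lambdas for the image classification task type.
        "PreHumanTaskLambdaArn": "arn:aws:lambda:<region>:<aws-account>:function:PRE-ImageMultiClass",
        "AnnotationConsolidationConfig": {
            "AnnotationConsolidationLambdaArn": "arn:aws:lambda:<region>:<aws-account>:function:ACS-ImageMultiClass"
        },
        "TaskTitle": "Classify Lego dinosaur images",
        "TaskDescription": "Select the dinosaur class that best matches the image",
        "NumberOfHumanWorkersPerDataObject": 1,
        "TaskTimeLimitInSeconds": 300,
    },
)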

1.6 Label the images using the Ground Truth Labeling portal

In this step, you will complete a labeling/annotation job assigned to you from the Ground Truth Labeling portal.

Once the annotation job is assigned, you can view the job (similar to the picture below)

Note: After labeling a subset of images, the annotation job will be complete. If the first annotation job did not include all images, you will see a new job in the portal after a few minutes. Repeat the process of labeling images in the jobs as they appear in the portal until all images are labeled. You can check the status of the labeling job from the Ground Truth Labeling Jobs console, which shows the number of images labeled out of the total.

1.7. Analyze Results

In this step, you will review the manifest files created during the Ground Truth Labeling process. The manifest files are in the S3 bucket you created in Step 1.

Input Manifest File

Located in S3 bucket in the prefix : lego_dinosaurs_dataset/dataset-xxxxxx.manifest.

The manifest is a JSON Lines file (one JSON object per line) that captures information about the training data.

Sample :

{"source-ref":"s3://dino-dataset/lego_dinosaurs_dataset/3_Triceratops_084.jpg"} {"source-ref":"s3:/dino-dataset/lego_dinosaurs_dataset/5_NoDino_245.jpg"} {"source-ref":"s3://dino-dataset/lego_dinosaurs_dataset/0_Spinosaurus_111.jpg"} …

Output Manifest File

Located in S3 bucket in the prefix : /manifests/output.manifest

The manifest is a JSON Lines file that captures metadata about each labeled image.

Sample:

{"source-ref": "s3://dino-dataset/3_Triceratops_084.jpg", "dino-image-classification": 3, "dino-image-classification-metadata": {"confidence": 0.94, "job-name": "labeling-job/dino-image-classification", "class-name": "3_Triceratops", "human-annotated": "yes", "creation-date": "2019-05-25T08:54:54.133410", "type": "groundtruth/image-classification"}} {"source-ref": "s3://dino-dataset/5_NoDino_245.jpg", "dino-image-classification": 5, "dino-image-classification-metadata": {"confidence": 0.95, "job-name": "labeling-job/dino-image-classification", "class-name": "5_Unknown", "human-annotated": "yes", "creation-date": "2019-05-25T08:37:55.495129", "type": "groundtruth/image-classification"}} {"source-ref": "s3://dino-dataset/0_Spinosaurus_111.jpg", "dino-image-classification": 0, "dino-image-classification-metadata": {"confidence": 0.68, "job-name": "labeling-job/dino-image-classification", "class-name": "0_Spinosaurus", "human-annotated": "yes", "creation-date": "2019-05-25T08:58:35.374405", "type": "groundtruth/image-classification"}} {"sourc ….

Along with the other metadata, the output manifest shows the identified class of each image and the labeling confidence.
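
To sanity-check the labeling results you can parse the output manifest locally. Below is a minimal sketch, assuming the file has been downloaded from S3 and uses the dino-image-classification attribute name shown in the sample above.

import json
from collections import Counter

class_counts = Counter()
confidences = []

# Each line of output.manifest is an independent JSON object (JSON Lines format).
with open("output.manifest") as f:
    for line in f:
        record = json.loads(line)
        meta = record["dino-image-classification-metadata"]
        class_counts[meta["class-name"]] += 1
        confidences.append(meta["confidence"])

print("Images per class:", dict(class_counts))
print("Average confidence:", sum(confidences) / len(confidences))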

Now we need to build, train, and optimize the model.

Step 2: Model building, training and optimization using SageMaker notebooks, containers and Neo

Model building, training, and optimization is simplified by SageMaker notebooks, training containers, and Neo. All these steps can be done using a single notebook. Please follow the attached SageMaker notebook: download it and upload it to your SageMaker environment. To create a SageMaker notebook environment, please follow this guide.

One of the nice features of Jupyter notebooks is that they can contain code as well as comments. I will use the notebook to explain model building, training, and optimization.
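
The notebook holds the actual cells; for orientation, the final Neo compilation step looks roughly like the sketch below. It assumes a trained SageMaker estimator named ic (for example the built-in image-classification algorithm, which is MXNet-based); the names, input shape, and framework version are illustrative rather than copied from the notebook.

# Compile the trained model for the Jetson Nano with SageMaker Neo.
# `ic` is assumed to be a trained sagemaker.estimator.Estimator.
compiled_model = ic.compile_model(
    target_instance_family="jetson_nano",
    input_shape={"data": [1, 3, 224, 224]},   # NCHW input expected by the model
    output_path="s3://dino-dataset/neo-compiled/",
    framework="mxnet",
    framework_version="1.2.1",
)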

Now that the model is built and optimized, we can deploy it on the NVIDIA Jetson Nano using AWS IoT Greengrass.

Step 3: Deploy model on Jetson Nano using AWS IoT Greengrass

This step consists of the following sub-steps:

3.1 Installing SageMaker Neo runtime

The SageMaker Neo runtime, also known as DLR (Deep Learning Runtime), is a runtime library for running models compiled with SageMaker Neo on the target device. In our model training step, the last step was to compile the model using SageMaker Neo. In the following steps we will install the SageMaker Neo runtime on the Jetson Nano.
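
Once DLR is installed on the Nano, loading and running the Neo-compiled model looks roughly like the sketch below; the model path, input tensor name, and preprocessing are illustrative assumptions, not values from this repo.

import numpy as np
from dlr import DLRModel

# Directory containing the Neo-compiled artifacts extracted on the device.
model = DLRModel("/path/to/compiled/model", dev_type="gpu")

# Dummy 224x224 RGB input in NCHW layout; in practice this comes from a camera
# frame after resizing and normalization.
frame = np.random.rand(1, 3, 224, 224).astype("float32")

# "data" is the input tensor name commonly used by MXNet image-classification models.
scores = model.run({"data": frame})[0]
print("Predicted class index:", int(np.argmax(scores)))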

3.2 Installing AWS IoT Greengrass

First, set up your Jetson Nano Developer Kit with the SD card image.

Run the following commands on your Nano to create greengrass user and group:

$ sudo adduser --system ggc_user
$ sudo addgroup --system ggc_group

Set up your AWS account and Greengrass group using this page: https://docs.aws.amazon.com/greengrass/latest/developerguide/gg-config.html. After downloading to your Jetson the unique security resource keys created in this step, proceed to the step below. If you created and downloaded these keys on a machine other than the Jetson Nano, you will need to copy them to the Jetson Nano; you can use SCP to transfer the files from your desktop.

Download the AWS IoT Greengrass Core Software (v1.9.1) for ARMv8 (aarch64):

$ wget https://d1onfpft10uf5o.cloudfront.net/greengrass-core/downloads/1.9.1/greengrass-linux-aarch64-1.9.1.tar.gz

Following this page (starting with step #4), extract the Greengrass core software and your unique security keys on your Nano:

$ sudo tar -xzvf greengrass-linux-aarch64-1.9.1.tar.gz -C /
$ sudo tar -xzvf <hash>-setup.tar.gz -C /greengrass   # these are the security keys downloaded above

Download AWS ATS endpoint root certificate (CA):

$ cd /greengrass/certs/
$ sudo wget -O root.ca.pem https://www.amazontrust.com/repository/AmazonRootCA1.pem

Start greengrass core on your Nano:

$ cd /greengrass/ggc/core/
$ sudo ./greengrassd start

You should get a message in your terminal "Greengrass successfully started with PID: xxx"

3.3 Set up and configure inference code using AWS Lambda

Go to AWS Management console and search for Lambda

Click 'Create function'

Choose 'Blueprints'

In the search bar, type “greengrass-hello-world” and hit Enter

Choose the python blueprint and click Configure

Name the function, e.g. inference-lambda. For Role, choose an existing role. (Note: you may need to create a new role with basic execution permissions and choose it here.)

Click 'Create function', then replace the default script with the inference script.
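
The inference script in this repo does the full camera-capture and classification loop. A stripped-down sketch of its shape is below; the model path, IoT topic name, dummy input, and class ordering for indices 1, 2, and 4 are placeholders/assumptions (indices 0, 3, and 5 follow the output manifest samples in Step 1.7).

import json
import numpy as np
import greengrasssdk
from dlr import DLRModel

# Greengrass SDK client used to publish inference results back to AWS IoT.
iot_client = greengrasssdk.client("iot-data")

# Neo-compiled model deployed to the device as a Greengrass ML resource.
model = DLRModel("/ml_model", dev_type="gpu")             # path is a placeholder
CLASSES = ["Spinosaurus", "Brachiosaurus", "Dilophosaurus",
           "Triceratops", "Stegosaurus", "Unknown"]       # middle entries are illustrative

def classify(frame):
    """Run one frame (already resized to 1x3x224x224 float32) through the model."""
    scores = model.run({"data": frame})[0]
    idx = int(np.argmax(scores))
    return CLASSES[idx], float(np.max(scores))

def function_handler(event, context):
    # In the real script frames come from the camera; here we use a dummy frame.
    frame = np.random.rand(1, 3, 224, 224).astype("float32")
    label, confidence = classify(frame)
    iot_client.publish(
        topic="dino/inference",                           # topic name is a placeholder
        payload=json.dumps({"class": label, "confidence": confidence}),
    )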

3.4 Set up machine learning at edge deployment

3.5 Deploy machine learning at edge on NVIDIA Jetson Nano

3.6 Check inference

3.7 Troubleshooting

Step 4: Visualize and analyze video analytics from the model inference on Jetson Nano

The Lambda code running on the NVIDIA Jetson Nano device sends IoT messages back to the cloud. These messages are sent to AWS CloudWatch as custom metrics. CloudWatch has built-in dashboards; we will use one to visualize the data coming from the device.
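
How the messages reach CloudWatch depends on your setup (for example an IoT rule with a CloudWatch metric action, or the edge Lambda calling the CloudWatch API directly). As an illustration of the latter, here is a minimal sketch assuming the Greengrass group role allows cloudwatch:PutMetricData; the namespace matches the "string" custom namespace referenced in the dashboard steps below.

import boto3

cloudwatch = boto3.client("cloudwatch")

def publish_detection_metric(class_name):
    """Record one detection of `class_name` as a custom CloudWatch metric."""
    cloudwatch.put_metric_data(
        Namespace="string",             # custom namespace selected in the dashboard
        MetricData=[{
            "MetricName": class_name,   # e.g. "Triceratops"
            "Value": 1,
            "Unit": "Count",
        }],
    )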

Go to AWS Management console and search for Cloudwatch

Create a dashboard called “aws-nvidia-jetson-nano-dashboard-your-name”

Choose Line in the widget

Under Custom Namespaces, select “string”, “Metrics with no dimensions”, and then select all metrics.

Next, set “Auto-refresh” to the smallest interval possible (1h), and change the “Period” to whatever works best for you (1 second or 5 seconds)

You will see an analysis of the number of times different dinosaurs were detected by the NVIDIA Jetson Nano.

NOTE: These metrics will only appear once they have been sent to CloudWatch by the Lambda code running at the edge. It may take some time for them to appear after your model is deployed and running locally. If they do not appear, there is a problem somewhere in the pipeline.

With this we have come to the end of the session. As part of building this project, you learnt the following:

  1. How to create and annotate a dataset for a computer vision model using Amazon SageMaker Ground Truth
  2. How to build, train, and optimize a model in Amazon SageMaker
  3. How to set up and configure AWS IoT Greengrass
  4. How to deploy the inference Lambda function and model on the NVIDIA Jetson Nano
  5. How to analyze model inference data using AWS CloudWatch