Region | AMI ID
---|---
US East (N. Virginia) | ami-a195cfb6
Here we have a Deep Learning system image (AWS AMI) custom-built for the p2.xlarge instance, which at ~$1 per hour is a GREAT price for a machine as powerful as this one. I've had my share of difficulties building various modules/packages/drivers personally, so I thought I would share this for any fellow ML/DL/AI researchers/professionals who might benefit from not having to go through the lengthy debugging/hacking process that I did.
I wanted to build a flexible system that would let me and others seamlessly explore whatever machine learning / deep learning system we wanted, while side-stepping the plethora of dependency conflicts and custom-build issues that inevitably seem to arise each time I try a new package. To that end, I started from a clean system and did a simul-build of all of the latest, bleeding-edge packages around. I'm sure I inevitably missed a few things here and there, but feel free to shoot me an email at mark.woods89 (at) gmail (dot) com if there's something you'd like to see added to this image.
This build is somewhat weighted towards python-facing systems because of my background, but I tried to also include many of the other deep learning environments I've been hearing about around town.
On that note, please let me know if there's anything you'd like to see added to this image. I'll be keeping track of requests for the next iteration of this image at the bottom of this page under TODO.
Also, it comes with a virtual desktop set up via vncserver, in case you're into that kind of thing.
All installed packages/modules/languages/etc have been tested and confirmed to work properly with GPU (at least for simple tests).
Also, the GPU has been max-clocked, per: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using_cluster_computing.html#optimize_gpu
Step 1. Get an AWS account and initialize an instance with this machine image from 'Community AMIs', e.g. ami-a195cfb6 for US East (N. Virginia). Make sure you have your security groups set up appropriately. I personally like to allow all inbound/outbound traffic to/from my personal IP, but you can customize this however you want: port 22 for SSH, ports 5900/5901 for VNC.
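For reference, here's one way to open those ports from your local machine with aws-cli (the security group ID and IP below are placeholders for your own; the EC2 console works just as well):
$ aws ec2 authorize-security-group-ingress --group-id sg-12345678 --protocol tcp --port 22 --cidr 203.0.113.7/32
$ aws ec2 authorize-security-group-ingress --group-id sg-12345678 --protocol tcp --port 5900-5901 --cidr 203.0.113.7/32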
Step 2. User credentials:
username: icarus
password: changetheworld
Note: The software packages are installed per-user, so you need to log in as icarus in order to access them properly. I plan to update this for the next release.
Step 3. Reboot the instance (this starts the init scripts properly). For SSH-only access, you will need to connect with your cert file from AWS, as described here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EC2_GetStarted.html#ec2-connect-to-instance-linux . You can log in as the default username ubuntu, no password, then switch to user icarus via $ su icarus. Alternatively, I've also set up a virtual desktop on this image that can be accessed by VNC. See Step 5.
Step 4. Run $ aws configure so that you can use the super handy-dandy aws-cli.
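It will prompt you for four values, something like this (the keys shown are AWS's standard documentation examples, not real credentials):
$ aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-east-1
Default output format [None]: json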
Step 5. To access the vncserver-hosted desktop, I personally like VNC Viewer, since it's pretty lightweight and simple to use. You can find it here: https://www.realvnc.com/download/viewer/ Point it at your instance's public DNS on display :1 (e.g. ec2-55-155-222-144.compute-1.amazonaws.com:1) and use the VNC password: godeeper
If you are a master sysadmin and you think the word 'desktop' is dirty, you can disable the auto-load script at /etc/init.d/vncserver by deleting the file, or moving it to a backup or some such.
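For instance (the update-rc.d route assumes the script is registered as a normal init service, which its location suggests, but I haven't spelled that out above):
$ sudo mv /etc/init.d/vncserver ~/vncserver.init.bak
or, to leave the file in place but skip it at boot:
$ sudo update-rc.d vncserver disable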
A few things still don't work quite properly, due to the system being a headless server with the bleeding-edge drivers required for the installed packages.
Log into the instance via something like PuTTY + Xming, run $ vncserver -kill :1 to shut down the virtual desktop, then run $ gnome-control-center from your terminal login. That will load a GUI settings interface if you need it.
Step 6. Do all the machine learning. In general, a decent way to use this system is to assume that whatever you need is already installed and configured to work.
Just try some of these commands in the terminal:
$ th
$ pyspark
$ digits-devserver
$ cntk
$ python
>>> import caffe
>>> import tensorflow
>>> import theano
>>> import keras
For more information on build paths and such, please reference ~/.bashrc. I tried to include most (if not all) of the relevant directories there.
Additionally, I set up a little script in the home directory that runs commands sequentially and then shuts down the instance. It's not pretty code, but for something that took less than two minutes to make, it comes in handy when you want to get more work done in a night but don't want to waste $$ by leaving your instance running. Don't forget to alter it so that it uses your own instance ID!
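The actual script on the image may differ, but a minimal sketch of the idea looks like this (the script name and instance ID here are placeholders):
#!/bin/bash
# run_then_stop.sh: run each argument as a shell command in sequence,
# then stop this instance so it stops billing.
# Replace i-0123456789 with your own instance ID.
for cmd in "$@"; do
    echo ">>> $cmd"
    eval "$cmd"    # run the command; keep going even if one fails
done
aws ec2 stop-instances --instance-ids i-0123456789
Usage would be something like:
$ ./run_then_stop.sh "th train.lua" "python eval.py"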
Using AWS spot instances can save you 20-80% per hour. The downside is that AWS reserves the right to shut down your instance if the demand for that instance type goes up. Zohar Jackson has provided a script that attempts to optimize spot instance choice and pricing; if you want to go that route, you can check it out here: https://github.com/Jakobovski/aws-spot-bot . It aims to find and launch the spot instances that are cheapest and least likely to get shut down. AWS also provides spot instances that can be reserved for a set length of time.
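If you'd rather request a spot instance by hand with aws-cli, a minimal sketch (the bid price is just an example, and spec.json is a launch-specification file you'd write yourself):
$ aws ec2 request-spot-instances --spot-price "0.30" --instance-count 1 --launch-specification file://spec.json
where spec.json contains something like:
{
    "ImageId": "ami-a195cfb6",
    "InstanceType": "p2.xlarge",
    "KeyName": "my-key"
}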
If you find this useful, please shoot me an email over at mark.woods89 (at) gmail (dot) com . Hopefully it will save some of y'all a lot of the headaches I went through trying to get it set up myself.
Oh yeah, and if you're just getting into machine learning, remember: change the world.
This will get fleshed out a little in the future, but for now, if you're new to aws/ml/etc, here are some links that will help you get set up in your exploration of this system and its machinations.
Getting the image set up on AWS: first, log in to your Amazon account on the AWS site. You may need to send a request to AWS to increase your p2 instance limit to 1; it should only take a few days to get approved. When you're clear to start a p2.xlarge instance, you can load the AMI of your choice by going to the EC2 section of AWS, clicking 'create instance', and searching for the AMI of your choice.
Once you've created your new instance, it will take a couple of minutes to boot up. Once it does, reboot the instance once for good measure. From there, you can access it either by SSH or by VNC (see Step 5 above).
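As an aside, if you prefer the command line to the console for launching, it might look something like this (the key name and security group ID are placeholders for your own):
$ aws ec2 run-instances --image-id ami-a195cfb6 --instance-type p2.xlarge --key-name my-key --security-group-ids sg-12345678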
I personally use python for a lot of my work, and I especially like the jupyter notebook interface. To use it, just type $ jupyter notebook into your terminal and hit enter. From there, you can create new notebooks and run sections of code as you like.
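Since the notebook server runs on the remote instance, one common approach (my suggestion, not something baked into the image) is to tunnel port 8888 over SSH and browse to it locally. On your local machine:
$ ssh -i my-key.pem -L 8888:localhost:8888 ubuntu@ec2-55-155-222-144.compute-1.amazonaws.com
Then on the instance:
$ jupyter notebook --no-browser --port=8888
Now open http://localhost:8888 in your local browser.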
Cool, so now you have the jupyter notebook interface up and running. Next, you want to figure out what this whole 'machine learning' thing is. There are a few tutorials/classes around the web; here's one by the notorious Andrew Ng:
The python package 'scikit-learn' is wonderful and incredible and delightful and things. Most things you want to do with non-deep learning models are already implemented quite nicely there. Check it out. Here's a little tutorial page to get you started:
Machine learning tasks explored with sklearn also interface quite nicely with python's pandas package, and since pandas is built on a numpy foundation, you can do a lot of vaguely complex things to large datasets pretty quickly.
A lot of very interesting problems in the field of machine learning/ai involve building systems that will be able to process and "understand" visual data in order to perform a task/analysis. A good starting point for this is the python package opencv. It's very cool, and very powerful when used properly. Check it out.
So now you're tired of that, and want to do some real deep learning and start making cool pictures like Google's inceptionism deep dreams. For a little introduction into the current methods/models being used in the field, check out this site:
Then, to start playing around with some more code and model-building:
Explore. There are a lot of installations listed here, and they each perform certain tasks in different ways. Figure out what you like, what sort of problems you want to work on, and what tools are best suited to solving them. Google is your friend. Machine learning/deep learning/ai is like a cave: the entrance is well-lit and well-known, but there are still plenty of pathways that have yet to be discovered. So grab yourself a torch(7) and find yourself a dark tunnel that nobody's been down yet.
aws p2.xlarge
Ubuntu 14.04 LTS (Trusty Tahr)
NVIDIA Tesla K80
50 GB storage
aws-cli 1.11.10
NVIDIA CUDA 8.0
NVIDIA cuDNN 5.1
NVIDIA CUB 1.5.5
NVIDIA DIGITS 5.1-dev
NVIDIA CNMeM 1.0 (?)
Anaconda 2
python 2.7
opencv 3.1.0
spark 2.0.1
caffe 1.0.0-rc3
keras 1.1.0
tensorflow 0.11.0rc0
tf-learn 0.2.2
theano 0.8.2
torch 7
R 3.0.2
shiny 0.14.1
Scala 2.11.8
scala sbt 0.13.12
Julia 0.5
IntelliJ IDEA 2016.2.5 # For DeepLearning4J
maven 3.0.5
Microsoft CNTK
kaldi 1.0(?)
ATLAS 3.10.1
OpenBLAS 0.2.20
MKL
Microsoft MKLCustom
openMPI 1.10.3
zlib 1.2.8
libzip 3.0
Boost 1.54.0
protobuf 3.1.0
build compatibility for python3
build docker image
add mxnet
add nolearn
add lasagne
add octave
flesh out tutorial section
look into turning this into an ansible playbook?
Include dl4j
shogun-tk ?
include perl
include https://github.com/usnistgov/F4DE because it is glorious.
Suggestion: Install CUDA and cuDNN using the --no-opengl-files option. https://davidsanwald.github.io/2016/11/13/building-tensorflow-with-gpu-support.html