dask / dask-ec2

Start a cluster in EC2 for dask.distributed
106 stars 37 forks source link

Pipeline for using scikit and tensorflow #51

Closed raymondchua closed 7 years ago

raymondchua commented 7 years ago

Hi, I would like to know what is the the "correct" way of using distributed dask on amazon instances with tensorflow or scikit-learn. Do I have to install tensorflow in all the amazon instances? Can I submit my task from my local machine or do I have to ssh into the head node and run my task from there? In my python file, how can I run dask distributed functions? Is the following the right way?

import tensorflow as tf from distributed import Client

c = Client()

define some tensorflow variables and functions etc...

c.submit(some tensorflow functions)

mrocklin commented 7 years ago

You would have to install tensorflow on all of the worker nodes.

You could also use Dask itself to distributed a fully packaged conda environment (or any zipped self-contained directory with a python executable) (see https://github.com/dask/distributed/pull/494)

If you are using dask-ec2 then you'll probably want to specify the head machine as your scheduler address

#  c = Client()  # This makes  a small "cluster" just on this one machine
c = Client('127.0.0.1:8786')  # connect to already running scheduler on this machine
raymondchua commented 7 years ago

@mrocklin thanks for the info.

This is what I am trying to do. I doubt it is the correct way. Any advice? I am getting a EOFError: Ran out of input.

c = Client('127.0.0.1:8786')

import tensorflow as tf

#import some datasets as numpy array, X for train data, Y for train labels
X = da.from_array(np.asarray(X), chunks=(1000, 1000, 1000, 1000),lock=lock)
Y = da.from_array(np.asarray(Y), chunks=(1000, 1000),lock=lock)
X_test = da.from_array(np.asarray(X_test), chunks=(1000, 1000, 1000, 1000),lock=lock)
Y_test = da.from_array(np.asarray(Y_test), chunks=(1000, 1000),lock=lock)

#create a model with tensorflow, lets call it M
define some tensorflow functions....

c.run(M.fit, X, Y)
mrocklin commented 7 years ago

Run does not do what you expect. You might want to read through http://distributed.readthedocs.io/en/latest/manage-computation.html and http://dask.pydata.org/en/latest/array-api.html#dask.array.core.map_blocks or use http://dask.pydata.org/en/latest/delayed.html

I'm going to close this as off-topic for dask/dask-ec2. Feel free to reraise under dask/dask or better yet create a http://stackoverflow.com/help/mcve and ask on stackoverflow.