adatao / tensorspark

TensorFlow on Spark

Distributed TensorFlow on Spark

First presented at the 2016 Spark Summit East: [Slide deck], [Presentation video], [Blog post]

TensorSpark productionalized in yarn-cluster mode

This latest version contains modifications and improvements that are mostly relevant to anyone taking TensorSpark to production in yarn-cluster mode (tested on a Hortonworks distribution, HDP 2.4, with CPU machines). For other deployment and machine types, the earlier version as of [Commit #62] may still be a better option.

Summary of changes since [Commit #62]

There are a few minor improvements (see commits for details) and the following 2 major changes:

To run

  1. zip ./ ./ ./ ./ ./ ./
  2. spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --queue default \
    --num-executors 3 \
    --driver-memory 20g \
    --executor-memory 60g \
    --executor-cores 8 \
    --py-files ./ \
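
The two steps above (bundle the Python sources into a zip, then pass that zip to `spark-submit` via `--py-files`) could also be scripted. The following is a minimal sketch, not part of TensorSpark itself: the helper names `zip_python_sources` and `build_spark_submit`, and the placeholder file names `tensorspark.zip` and `driver.py`, are assumptions made for illustration. Only the `spark-submit` flags and values are taken from the command above.

```python
import os
import shlex
import zipfile


def zip_python_sources(src_dir, zip_path):
    """Bundle every .py file under src_dir into zip_path, keeping relative paths.

    Hypothetical helper: the real file list for step 1 is not shown in this README.
    """
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                if name.endswith(".py"):
                    full = os.path.join(root, name)
                    arcname = os.path.relpath(full, src_dir)
                    zf.write(full, arcname)
                    names.append(arcname)
    return sorted(names)


def build_spark_submit(py_files_zip, entry_script, app_args=()):
    """Return the spark-submit argument list with the settings from step 2."""
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        "--queue", "default",
        "--num-executors", "3",
        "--driver-memory", "20g",
        "--executor-memory", "60g",
        "--executor-cores", "8",
        "--py-files", py_files_zip,
        entry_script,  # placeholder: the actual entry point is not named here
        *app_args,
    ]


if __name__ == "__main__":
    # Placeholder names, assumed for illustration only.
    cmd = build_spark_submit("tensorspark.zip", "driver.py")
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each flag and value as a separate argument, so it can be passed directly to `subprocess.run(cmd, check=True)` without shell quoting issues. The memory and executor settings are the ones tested on the cluster described above and will likely need adjusting for other clusters.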

Partial project layout:
tensorspark/ - script to build TensorFlow from source with GPU support for AWS
tensorspark/ - simple Tornado websocket example
tensorspark/ - "abstract" model class that implements all methods TensorSpark requires
tensorspark/ - fully connected models for specific datasets
tensorspark/ - convolutional model for MNIST
tensorspark/ - Spark worker code
tensorspark/ - entry point and Spark driver code