databricks / spark-deep-learning

Deep Learning Pipelines for Apache Spark
https://databricks.github.io/spark-deep-learning
Apache License 2.0
2k stars, 494 forks

Add multiple outputs to the TFImageTransformer (second attempt) #171

Closed. thunterdb closed this 5 years ago

thunterdb commented 6 years ago

This PR adds support for outputting multiple elements from the low-level TFImageTransformer. With this PR, one can take TensorFlow's object detectors and run them at scale with Spark.

This adds a new output mode ("sql"). Under this mode, the outputs of running TensorFlow are added as extra columns to the resulting DataFrame and are not interpreted as images or vectors.
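To make the idea behind the "sql" mode concrete, here is a minimal plain-Python sketch of the concept only. It does not use the sparkdl or TensorFlow APIs: `fake_detector`, `add_outputs_as_columns`, and the column names are hypothetical stand-ins chosen for illustration. Each named output of the model becomes its own extra column on the row, with no attempt to reinterpret it as an image or a vector.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, object]


def fake_detector(image: Sequence[float]) -> Dict[str, object]:
    """Hypothetical stand-in for a TensorFlow graph with several named outputs."""
    return {
        "num_detections": len(image),
        "detection_scores": [float(x) for x in image],
    }


def add_outputs_as_columns(
    rows: List[Row],
    input_col: str,
    model: Callable[[Sequence[float]], Dict[str, object]],
) -> List[Row]:
    """Append every named model output to each row as a new column."""
    result = []
    for row in rows:
        outputs = model(row[input_col])
        # Refuse to silently overwrite an existing column.
        clash = set(outputs) & set(row)
        if clash:
            raise ValueError(f"output names clash with existing columns: {sorted(clash)}")
        # Original columns are kept; outputs are added next to them untouched.
        result.append({**row, **outputs})
    return result


rows = [
    {"path": "a.png", "image": [0.9, 0.1]},
    {"path": "b.png", "image": [0.5]},
]
result = add_outputs_as_columns(rows, "image", fake_detector)
```

In this sketch the input columns survive unchanged and each output keeps its own name and type, which is the behavior the description attributes to the new mode. The actual parameters of TFImageTransformer are defined by the PR's diff, not by this example.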

Includes test.