databricks / spark-deep-learning

Deep Learning Pipelines for Apache Spark
https://databricks.github.io/spark-deep-learning
Apache License 2.0
1.99k stars, 494 forks

Add support for FloatType, IntegerType and LongType inputs to TFTransformer #108

Closed smurching closed 6 years ago

smurching commented 6 years ago

What

Currently, TFTransformer accepts only DoubleType input columns and casts the Double input to the input type expected by the TF graph. This cast implicitly restricts TFTransformer to those TensorFlow input types that can be cast from Double.

This PR removes the cast-from-double operation in TFTransformer, thereby adding support for FloatType, IntegerType and LongType input columns.
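To make the motivation concrete, here is a minimal, dependency-free sketch. It is not taken from the sparkdl source: the mapping table `SPARK_TO_TF_DTYPE` and the helpers `check_column_dtype` and `roundtrip_through_double` are hypothetical names introduced for illustration. It assumes that, once the cast is removed, the column type has to line up with the graph's input dtype directly, and it shows why routing a LongType value through a double can silently change it.

```python
# Hypothetical mapping from Spark SQL scalar type names to TensorFlow dtype
# names. Illustrative only; the real sparkdl code may be organized differently.
SPARK_TO_TF_DTYPE = {
    "DoubleType": "float64",
    "FloatType": "float32",
    "IntegerType": "int32",
    "LongType": "int64",
}


def check_column_dtype(spark_type, graph_input_dtype):
    """Return the TF dtype name for a Spark column type.

    Raises TypeError if the type is unknown or does not match the dtype the
    graph input expects (an assumed post-change behavior: no implicit cast).
    """
    if spark_type not in SPARK_TO_TF_DTYPE:
        raise TypeError("unsupported Spark column type: %s" % spark_type)
    tf_dtype = SPARK_TO_TF_DTYPE[spark_type]
    if tf_dtype != graph_input_dtype:
        raise TypeError(
            "column type %s maps to %s, but the graph input expects %s"
            % (spark_type, tf_dtype, graph_input_dtype)
        )
    return tf_dtype


def roundtrip_through_double(value):
    """Simulate storing an integer in a double column and casting it back."""
    return int(float(value))


# A 64-bit integer just above 2**53 cannot be represented exactly as a double,
# so the round trip does not return the original value.
big = 2 ** 53 + 1
lossy = roundtrip_through_double(big)
```

In this sketch `lossy` differs from `big`, which is the kind of precision loss a cast-from-double path can introduce for LongType data; passing the column through with its native type avoids the round trip.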

Why

The changes in this PR will facilitate adding support for BinaryType input columns in a follow-up.

Summary of Changes

codecov-io commented 6 years ago

Codecov Report

Merging #108 into master will decrease coverage by 0.17%. The diff coverage is 92.85%.

@@            Coverage Diff             @@
##           master     #108      +/-   ##
==========================================
- Coverage   82.84%   82.66%   -0.18%     
==========================================
  Files          34       34              
  Lines        1999     1990       -9     
  Branches       44       44              
==========================================
- Hits         1656     1645      -11     
- Misses        343      345       +2

| Impacted Files | Coverage Δ |
|---|---|
| python/sparkdl/transformers/tf_tensor.py | 98.24% <92.85%> (-1.76%) :arrow_down: |
| python/sparkdl/graph/utils.py | 98.94% <0%> (-1.06%) :arrow_down: |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update e3c5876...5bac9b6.

smurching commented 6 years ago

Chatted offline with @sueann - added a warning about the behavior change. Will merge now, thanks @sueann @tomasatdatabricks!