databricks / spark-deep-learning

Deep Learning Pipelines for Apache Spark
https://databricks.github.io/spark-deep-learning
Apache License 2.0
1.99k stars, 494 forks

[New Feature] What would it take to generalize to non-image data? #54

Open jvmncs opened 6 years ago

jvmncs commented 6 years ago

Loving this library. However, restricting it to image data seriously constrains how the package can be used in production DL systems. I think extending the API and docs to cover non-image use cases would help the community dramatically. It seems to me that this could be the most logical next step for the project, although I'm not familiar with the core contributors' plans.

I haven't dug into the internals completely, so I just wanted to ask if anyone knows what would be required to extend the transformers to handle numeric/text data. Concretely, it seems like we'd only need to accommodate tensors of different shapes. If that's the case, it might only require adding new transformers and changing a few variable/class names, although I don't know the underlying implementation well enough to say for certain.

Anyone closer to the project have any comments? Ideally, I'd like to use this issue as a place to organize our thoughts on what exactly would need to change in order to extend the API in this manner.

sueann commented 6 years ago

Hi @jvmancuso, yeah, that certainly makes sense. There is a series of PRs out that add support for 1-D numeric tensor (i.e., vector) inputs (https://github.com/databricks/spark-deep-learning/pull/49, https://github.com/phi-dbq/spark-deep-learning/pull/9, etc.). I think that should cover basic text transformation needs as well, since one would usually convert text into a vector before feeding it to a neural net. What do you think? Is there room to make working with text more convenient there?
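To make the "convert text into a vector" step concrete, here is a minimal sketch of one common approach, the hashing trick, in plain Python. It is not part of this library's API: the function name `text_to_vector` and the `num_features` parameter are hypothetical and chosen only for illustration.

```python
import hashlib


def text_to_vector(text, num_features=16):
    """Map a string to a fixed-length vector of token counts (hashing trick).

    Hypothetical helper for illustration only; not a spark-deep-learning API.
    """
    vec = [0.0] * num_features
    for token in text.lower().split():
        # Use a stable hash so the same token always lands in the same slot,
        # regardless of process or machine.
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        index = int(digest, 16) % num_features
        vec[index] += 1.0
    return vec


vector = text_to_vector("deep learning on spark")
print(len(vector))  # always num_features, whatever the input length
```

Every input string ends up with the same 1-D shape, which is what a vector-input transformer would need. Distinct tokens can collide in the same slot, so a real pipeline would likely use a much larger `num_features`.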

On a related note, it may also make sense to provide higher-level, text-specific tools that parallel the image ones currently in the library. For example, we have DeepImageFeaturizer, which featurizes (embeds) images into a more meaningful space. There are quite a few comparable featurizers for text that we could potentially provide out of the box here.
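As a rough illustration of what a text featurizer could compute, here is a sketch that embeds a sentence by averaging per-word vectors. Everything in it is an assumption made for the example: `embed_text` is a hypothetical function, and `TOY_EMBEDDINGS` is a made-up two-dimensional table standing in for a real pretrained embedding model. It does not reflect any existing or planned API in this library.

```python
def embed_text(text, embeddings, dim):
    """Average the vectors of known tokens; return zeros if none are known.

    Hypothetical helper for illustration only.
    """
    tokens = [t for t in text.lower().split() if t in embeddings]
    if not tokens:
        return [0.0] * dim
    total = [0.0] * dim
    for token in tokens:
        for i, value in enumerate(embeddings[token]):
            total[i] += value
    return [x / len(tokens) for x in total]


# Toy lookup table (made-up values); a real featurizer would load
# pretrained vectors instead.
TOY_EMBEDDINGS = {
    "spark": [1.0, 0.0],
    "deep": [0.0, 1.0],
    "learning": [0.0, 1.0],
}

print(embed_text("Spark deep", TOY_EMBEDDINGS, dim=2))
```

The output is a fixed-length dense vector per document, analogous to what DeepImageFeaturizer produces per image, so downstream Spark ML stages could consume it in the same way.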