salesforce / TransmogrifAI

TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
https://transmogrif.ai
BSD 3-Clause "New" or "Revised" License

PySpark Support #301

Closed by oelesinsc24 5 years ago

oelesinsc24 commented 5 years ago

Does TransmogrifAI support PySpark? If not, will it be supported in the future, or is there a workaround for using it from PySpark?

Thanks in advance!

py-ranoid commented 5 years ago

Hey @oelesinsc24, TransmogrifAI is written in Scala and doesn't support PySpark yet.

While I can't comment on it officially, I may implement a PySpark wrapper in the future to use TransmogrifAI with Python. If it helps, this is how I thought I'd go about it:

If I may ask, which features of TransmogrifAI were you hoping to run with PySpark?

oelesinsc24 commented 5 years ago

Thanks for the prompt feedback.

OElesin commented 5 years ago

I'm hoping to make TransmogrifAI available on AWS SageMaker, mainly for its automatic feature extraction.

tovbinm commented 5 years ago

@OElesin that would be great! I think it was already done once. Perhaps @wsuchy @gerashegalov @Jauntbox can share the recipe on how to do it?

OElesin commented 2 years ago

Any comments here!?