tgsmith61591 / skutil

NOTE: skutil is now deprecated. See its sister project: https://github.com/tgsmith61591/skoot. Original description: A set of scikit-learn and h2o extension classes (as well as caret classes for python). See more here: https://tgsmith61591.github.io/skutil
BSD 3-Clause "New" or "Revised" License
30 stars 9 forks

Add Functions Geared Towards Apache Spark #16

Open charlesdrotar opened 8 years ago

charlesdrotar commented 8 years ago

Not sure how valid this is (is this truly within the scope of this library?) or in what form this will rear its ugly head, but it would be neat to add some complementary functions for Spark. This is as open-ended as it can be in order to Spark :bowtie: discussion. Ideally we would limit this functionality to Spark versions that include DataFrames so we can just leverage the existing DataFrame API.

tgsmith61591 commented 8 years ago

I like this idea, but we need to consider the implications for CI testing if we go that route. We certainly won't bundle a Spark dist with our releases (just as we don't bundle H2O); we could feasibly access all we need via the SPARK_HOME environment variable. It would also be cool to integrate with H2O's Sparkling Water. If we can figure out the testing component, I say it's worth exploring in a future release.
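As a rough sketch of the SPARK_HOME idea, assuming nothing about how skutil would actually wire this in: the helper name `find_spark_home`, the `requires_spark` decorator, and the `SparkSmokeTest` class are all hypothetical and exist only for illustration. The point is that a CI run without Spark installed would skip the Spark tests instead of failing them.

```python
import os
import unittest


def find_spark_home(environ=None):
    """Return the directory named by SPARK_HOME, or None if unset or missing.

    Hypothetical helper: `environ` defaults to os.environ and can be
    replaced with a plain dict in tests.
    """
    environ = os.environ if environ is None else environ
    spark_home = environ.get("SPARK_HOME")
    if spark_home and os.path.isdir(spark_home):
        return spark_home
    return None


# Hypothetical decorator: skip a test when no Spark install is discoverable.
# The condition is evaluated once, when this module is imported.
requires_spark = unittest.skipUnless(
    find_spark_home() is not None,
    "SPARK_HOME is not set or does not point to a directory",
)


class SparkSmokeTest(unittest.TestCase):
    @requires_spark
    def test_python_bindings_are_present(self):
        # Spark binary distributions ship their Python bindings
        # under $SPARK_HOME/python.
        python_dir = os.path.join(find_spark_home(), "python")
        self.assertTrue(os.path.isdir(python_dir))
```

On a CI worker with no Spark dist, the test above is reported as skipped, so the rest of the suite can still run and pass.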

On Oct 16, 2016 2:55 PM, "charlesdrotar" notifications@github.com wrote:

Not sure how valid this is (is this truly within the scope of this library?) or in what form this will rear its ugly head, but it would be neat to add some complementary functions for Spark. This is as open-ended as it can be in order to Spark :bowtie: discussion. Ideally we would limit this functionality to Spark versions that include DataFrames so we can just leverage the existing DataFrame API.


charlesdrotar commented 8 years ago

Completely agree! I think we could do something inside Sparkling Water as well. It will be fun/interesting to see how this shapes up, especially with regard to Travis CI.