ddf-project / DDF

Distributed DataFrame: Productivity = Power x Simplicity For Scientists & Engineers, on any Data Engine
http://ddf.io
Apache License 2.0

Add support to scale just a subset of columns plus an inPlace flag #340

Closed: ducleminh closed this 8 years ago

ducleminh commented 8 years ago

Description and related tickets, documents

Add support for scaling a subset of columns (in both scaleMinMax and scaleStandard).
Add an inPlace flag (to both scaling methods).
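
For readers unfamiliar with the two operations, here is a minimal sketch of the intended semantics in plain Python. It is not the DDF implementation: the function names, the dict-of-lists table, the `columns` and `in_place` parameters, and the choice of population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def _target(table, in_place):
    # With in_place=True the caller's table is modified; otherwise a copy is returned.
    return table if in_place else {name: list(values) for name, values in table.items()}


def scale_min_max(table, columns=None, in_place=False):
    """Rescale the chosen columns to [0, 1]; all other columns pass through unchanged."""
    target = _target(table, in_place)
    for name in (columns if columns is not None else list(table)):
        values = target[name]
        lo, hi = min(values), max(values)
        span = hi - lo
        # A constant column has span 0; map it to 0.0 rather than dividing by zero.
        target[name] = [(v - lo) / span if span else 0.0 for v in values]
    return target


def scale_standard(table, columns=None, in_place=False):
    """Centre the chosen columns on 0 with unit variance (population std, an assumption)."""
    target = _target(table, in_place)
    for name in (columns if columns is not None else list(table)):
        values = target[name]
        mu, sigma = mean(values), pstdev(values)
        target[name] = [(v - mu) / sigma if sigma else 0.0 for v in values]
    return target
```

Calling `scale_min_max(data, columns=["a"])` returns a new table in which only `a` is rescaled, while `in_place=True` returns the same object that was passed in.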

Reviewers: @hai-adatao @Huandao0812 @nhanitvn @phvu @lebinh @ubolonton @zkidkid

Breaking changes & backward compatible issues

None

How to test

Unit tests

PR Progress

Make sure all checkboxes below are checked before merging

hai-adatao commented 8 years ago

Will merge in 1.4.17; we need all downstream PRs to be ready.

Huandao0812 commented 8 years ago

@ducleminh: can you look into whether we could do this with the Spark DataFrame API instead? Generating a SQL string and passing it to SparkSQL is not good practice.
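
To illustrate the concern, here is a rough sketch of what string-based query generation for min-max scaling might look like. The helper `min_max_select` and its parameters are hypothetical and are not taken from this PR.

```python
def min_max_select(table_name, all_columns, scaled, stats):
    """Build a SELECT that rescales the `scaled` columns and passes the rest through.

    `stats` maps a column name to its precomputed (min, max) pair.
    """
    parts = []
    for col in all_columns:
        if col in scaled:
            lo, hi = stats[col]
            # The column name and the statistics are spliced directly into the SQL text.
            parts.append(f"(({col} - {lo}) / ({hi} - {lo})) AS {col}")
        else:
            parts.append(col)
    return f"SELECT {', '.join(parts)} FROM {table_name}"


query = min_max_select("t", ["a", "b"], {"a"}, {"a": (1.0, 3.0)})
# query == "SELECT ((a - 1.0) / (3.0 - 1.0)) AS a, b FROM t"
```

Because identifiers are pasted in as raw text, a column name containing a space or a reserved word would produce an invalid statement unless it is quoted. Building the same projection from DataFrame column expressions avoids that class of problem.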

ubolonton commented 8 years ago

Seconded @Huandao0812's suggestion.

ducleminh commented 8 years ago

Yes, ticket raised for this: https://adatao.atlassian.net/browse/PE-2082

nhanitvn commented 8 years ago

lgtm. I am going to merge this.