ddf-project / DDF

Distributed DataFrame: Productivity = Power x Simplicity For Scientists & Engineers, on any Data Engine
http://ddf.io
Apache License 2.0

Add multiple-column sort functionality to DDF #325

Closed: ducleminh closed this 8 years ago

ducleminh commented 8 years ago

Description and related tickets, documents

Support sorting a DDF by multiple columns.

Reviewers: @hai-adatao @nhanitvn @Huandao0812

Breaking changes & backward compatibility issues

Is this PR a breaking change, or does it have a backward compatibility issue (e.g. changes to API names, interfaces, or signatures, or removing something)? If yes, please state what changes and tag it; if it could affect BA, please tag aht, tuananh, khangpham and baonguyen.

How to test

Describe how this PR is tested. In case manual testing is required, describe how to do so.

PR Progress

Make sure all checkboxes below are checked before merging

hai-adatao commented 8 years ago

Very good tests @ducleminh. I have a question though: what if I have 4 columns to sort and specify an order for only one or two of them? Will the rest default to true, or will it throw an exception?

PangZhi commented 8 years ago

@ducleminh Do we have a PR for PyClient?

PangZhi commented 8 years ago

@hai-adatao @ducleminh This feature is needed by SQ. Can we push it and add the API in PyClient? Or I can help add it.

hai-adatao commented 8 years ago

@PangZhi RClient and PyClient already have PRs for this. I'm working with BA to verify whether we need backward compatibility.

ducleminh commented 8 years ago

PR for PyClient: https://github.com/adatao/PyClient/pull/307

@hai-adatao to answer your question: in this case no exception is thrown. Any sort order that is not specified defaults to true (i.e. ascending).
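
To make the defaulting rule concrete, here is a minimal plain-Python sketch of the behavior described above. It is not the DDF or PyClient API: `pad_orders` and `sort_rows` are hypothetical helper names, rows are modeled as dicts, and raising an error when more orders than columns are given is an assumption of this sketch that the thread does not cover.

```python
from typing import Any, Dict, List, Optional, Sequence

Row = Dict[str, Any]


def pad_orders(columns: Sequence[str],
               ascending: Optional[Sequence[bool]] = None) -> List[bool]:
    """Return one ascending flag per column; missing flags default to True."""
    flags = list(ascending or [])
    if len(flags) > len(columns):
        # Assumption of this sketch: extra flags are treated as a caller error.
        raise ValueError("more sort orders than sort columns")
    return flags + [True] * (len(columns) - len(flags))


def sort_rows(rows: Sequence[Row],
              columns: Sequence[str],
              ascending: Optional[Sequence[bool]] = None) -> List[Row]:
    """Sort rows by several columns, each with its own direction."""
    flags = pad_orders(columns, ascending)
    result = list(rows)
    # Python's sort is stable, so sorting by the least significant key first
    # and the most significant key last yields a multi-column ordering.
    for col, asc in reversed(list(zip(columns, flags))):
        result.sort(key=lambda row: row[col], reverse=not asc)
    return result


rows = [
    {"year": 2008, "month": 3, "delay": 10},
    {"year": 2007, "month": 5, "delay": 4},
    {"year": 2008, "month": 1, "delay": 7},
    {"year": 2007, "month": 5, "delay": 2},
]

# Only the first order is given: year descending, month and delay ascending.
sorted_rows = sort_rows(rows, ["year", "month", "delay"], [False])
```

With three sort columns and a single order flag, the remaining two columns fall back to ascending and no exception is raised, which matches the answer given to @hai-adatao.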