Drop duplicates on a subset of columns

Gmousse / dataframe-js

No Maintenance Intended

https://gmousse.gitbooks.io/dataframe-js/

MIT License


Drop duplicates on a subset of columns #39

Closed martinv13 closed 6 years ago

martinv13 commented 6 years ago

Is it possible to drop duplicates based on only a subset of columns, keeping, for instance, the first encountered value for the other columns? As .dropDuplicates() does not take any argument, I guess it is not.

Alternatively, would it be easier to add a mult="first" argument to left joins, as in R's data.table, so that each row is joined only with the first matching row and the other matching rows are discarded?

Thanks, Martin
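To make the two requested behaviours concrete, here is a minimal sketch in plain JavaScript, working on arrays of row objects rather than on a DataFrame. The helper names dropDuplicatesOn and leftJoinFirst are hypothetical and are not part of dataframe-js; they only illustrate the "keep the first row per key" semantics described above.

```javascript
// Hypothetical helpers for illustration only; NOT part of dataframe-js.

// Build a composite key from the chosen columns of a row.
// Caveat: JSON.stringify maps undefined to null inside arrays,
// so undefined and null values are treated as the same key.
function keyOf(row, keys) {
  return JSON.stringify(keys.map((k) => row[k]));
}

// Keep only the first row encountered for each distinct key combination.
function dropDuplicatesOn(rows, keys) {
  const seen = new Set();
  return rows.filter((row) => {
    const id = keyOf(row, keys);
    if (seen.has(id)) return false;
    seen.add(id);
    return true;
  });
}

// Left join where each left row is matched with the first matching
// right row only (similar in spirit to mult="first" in R's data.table).
// Unmatched left rows are kept unchanged; left values win on name clashes.
function leftJoinFirst(left, right, keys) {
  const firstMatch = new Map();
  for (const row of right) {
    const id = keyOf(row, keys);
    if (!firstMatch.has(id)) firstMatch.set(id, row);
  }
  return left.map((row) => ({
    ...(firstMatch.get(keyOf(row, keys)) || {}),
    ...row,
  }));
}

// Made-up sample data.
const rows = [
  { id: 1, name: 'a', value: 10 },
  { id: 1, name: 'a', value: 20 },
  { id: 2, name: 'b', value: 30 },
  { id: 1, name: 'c', value: 40 },
];

const uniqueByIdAndName = dropDuplicatesOn(rows, ['id', 'name']);
const uniqueById = dropDuplicatesOn(rows, ['id']);
const joined = leftJoinFirst([{ id: 1 }, { id: 3 }], rows, ['id']);
```

With this data, uniqueByIdAndName keeps three rows (the second row is dropped as a duplicate of the first), uniqueById keeps two, and joined attaches name and value from the first id 1 row while leaving the unmatched id 3 row as it is.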

Gmousse commented 6 years ago

Hi @martinv13. Indeed, there is no easy way to achieve that. I am working on the next release today, and I will look for a fast implementation.

martinv13 commented 6 years ago

Hi @Gmousse, I created PR #44 for this issue, if you want to check it out.

Gmousse commented 6 years ago

Closed with #44. It will be released in 1.3.0. Thanks!