huggingface / datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
https://huggingface.co/docs/datasets
Apache License 2.0

`drop_duplicates` method #7016

Open MohamedAliRashad opened 3 days ago

MohamedAliRashad commented 3 days ago

Feature request

A `drop_duplicates` method for Hugging Face datasets (similar in simplicity to the pandas one)

Motivation

Ease of use

Your contribution

I don't think I am good enough to help.

Dref360 commented 2 days ago

There is an open issue #2514 about this which also proposes solutions.
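
Until a built-in method exists, one possible workaround is to compute the indices of the first occurrence of each row and keep only those. The sketch below is not part of the `datasets` API: `first_occurrence_indices` and its `subset` parameter are hypothetical names chosen to mirror the pandas method, and rows are compared through `repr` so that unhashable values such as lists can be handled (which also means `1` and `1.0` count as different values). It reads every row in Python, so it may be slow on large datasets.

```python
from typing import Any, Iterable, List, Mapping, Optional, Sequence, Set, Tuple


def first_occurrence_indices(
    rows: Iterable[Mapping[str, Any]],
    subset: Optional[Sequence[str]] = None,
) -> List[int]:
    """Return the indices of rows that are not duplicates of an earlier row.

    Hypothetical helper: `subset` restricts the comparison to the given
    columns; by default all columns of each row are compared.
    """
    seen: Set[Tuple[Tuple[str, str], ...]] = set()
    keep: List[int] = []
    for i, row in enumerate(rows):
        cols = subset if subset is not None else sorted(row)
        # repr() makes unhashable values (lists, dicts) usable as set keys.
        key = tuple((c, repr(row[c])) for c in cols)
        if key not in seen:
            seen.add(key)
            keep.append(i)
    return keep


rows = [
    {"text": "a", "label": 0},
    {"text": "b", "label": 1},
    {"text": "a", "label": 0},  # exact duplicate of row 0
    {"text": "a", "label": 1},  # same text as row 0, different label
]
all_columns = first_occurrence_indices(rows)
text_only = first_occurrence_indices(rows, subset=["text"])

# Assuming `ds` is a datasets.Dataset (iterating it yields dict rows):
#   deduped = ds.select(first_occurrence_indices(ds))
```

For data that fits in memory, a round trip through pandas (`ds.to_pandas().drop_duplicates()` followed by `Dataset.from_pandas`) is another option, at the cost of materializing the whole table.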