Closed legaultpierre closed 5 months ago
When there are many variables, selecting them one by one gets really tiring. It would be great to have a "select all" option for viewing and deleting duplicate rows.
@legaultpierre when clicking "View Duplicates", did you select a column from the dropdown above the button? If you don't, you will hit this error. I can update the UI to disable the button until a column is selected, but for now selecting a column will solve your problem.
@T-a-c-h-y-o-n I'll look into adding a "select all" option
@legaultpierre just released v3.10.0 to pypi (should be on conda-forge soon) with this update to hide the "View Duplicates" button included.
Also, if you haven't already, please put your ⭐ on the repo when you get a sec. Thanks! 🙏
Hello @aschonfeld, sorry for the (very) late response, work life has been a little bit overwhelming! Thanks a lot for the changes, that's what I needed!
Hello!
First, thanks for your work. I just discovered your library and I already love it!
Context
I am using your tool on the MSLR-WEB10K > Fold1 > train dataset. I want to know what the duplicates are.
How the behaviour / error happened
Once the GUI is launched with the code in the following section, I navigate to Visualize > Duplicates > Show Duplicates > View Duplicates. After clicking, I get the following error:
Code to reproduce
Lib versions in env:
Question
Is this the intended behaviour (i.e. you want to force people to select some columns), or is it a bug?
If this is the intended behaviour, please consider adding the ability to show duplicate rows without having to select columns: in datasets with a lot of features, it is quite annoying to select them all!
Thanks in advance :-)
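For anyone who needs duplicates across every column before a "select all" option exists in the UI, a possible workaround is to compute them in plain pandas first. This is only a minimal sketch: it is not how D-Tale implements its duplicates view, and the helper name `duplicate_rows` and the tiny sample frame are made up for illustration.

```python
import pandas as pd


def duplicate_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Return every row that has at least one exact copy in df.

    Hypothetical helper for illustration, not part of D-Tale.
    """
    # subset=None compares all columns; keep=False flags every member
    # of a duplicate group instead of leaving the first one unmarked.
    mask = df.duplicated(subset=None, keep=False)
    return df[mask]


# Made-up sample data standing in for a wide feature table.
df = pd.DataFrame(
    {
        "qid": [1, 1, 2, 1],
        "f1": [0.5, 0.5, 0.1, 0.5],
        "f2": [3, 3, 7, 4],
    }
)

dupes = duplicate_rows(df)      # rows identical across all columns
deduped = df.drop_duplicates()  # keeps the first row of each duplicate group
```

Either resulting frame can then be inspected as usual, so no per-column selection is needed.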