MADE-graduation-projects / hateful_memes


Check dataset for duplicates #9

Open tinctura13 opened 2 years ago

tinctura13 commented 2 years ago

I randomly selected 10 pictures from the dataset, shifted each of them by 5 pixels horizontally and vertically, and saved the shifted copies to the same directory. Then I ran my script to check for duplicates, and it works: it removed all the duplicated/offset images. I then ran it on the whole dataset and removed about 2k duplicate pictures from it. Code here
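For anyone reading without the linked script: below is a minimal sketch of one common way to catch duplicates that differ only by a small offset, namely an average hash compared by Hamming distance. This is not necessarily what the script above does. The names `average_hash`, `hamming`, `find_duplicates` and `shift`, the 8x8 grid, and the `max_distance=5` threshold are all assumptions made for illustration. Images are plain lists of grayscale rows, so decoding files is left out.

```python
from typing import Dict, List

Gray = List[List[int]]  # rows of 0..255 grayscale values


def average_hash(pixels: Gray, size: int = 8) -> int:
    """Return a size*size-bit average hash of a grayscale image.

    Assumes the image is at least size x size pixels.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for r in range(size):
        y0, y1 = r * height // size, (r + 1) * height // size
        for c in range(size):
            x0, x1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))  # mean brightness of the cell
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (value > mean)  # 1 if the cell is brighter than average
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def find_duplicates(images: Dict[str, Gray], max_distance: int = 5) -> List[str]:
    """Return names whose hash is within max_distance of an earlier kept image."""
    kept: List[int] = []
    duplicates: List[str] = []
    for name in sorted(images):
        h = average_hash(images[name])
        if any(hamming(h, k) <= max_distance for k in kept):
            duplicates.append(name)
        else:
            kept.append(h)
    return duplicates


def shift(pixels: Gray, dx: int, dy: int) -> Gray:
    """Move the content right/down by (dx, dy), repeating edge pixels."""
    height, width = len(pixels), len(pixels[0])
    return [
        [pixels[min(max(y - dy, 0), height - 1)][min(max(x - dx, 0), width - 1)]
         for x in range(width)]
        for y in range(height)
    ]
```

The same kind of check as in the comment above can be reproduced by passing an image and `shift(image, 5, 5)` to `find_duplicates`. For real files, each picture would first be decoded to grayscale, for example with Pillow's `Image.open(path).convert("L")`. The threshold is a trade-off: a larger `max_distance` tolerates bigger offsets but risks flagging pictures that are merely similar, so it is worth eyeballing a sample of the removed images.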