Closed netskink closed 2 years ago
OK, this sounds like more of a utility.
Create a table or array of the files, keyed by MD5 hash, to diff against.
Run md5sum against the camera uploads folder and print the hashes to stdout, then reject the files whose hashes are duplicates.
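A minimal Python sketch of the hash-and-reject idea above, assuming only exact byte-for-byte duplicates need to be caught (resized or re-encoded copies would hash differently). The names `md5_of_file` and `find_duplicates` are hypothetical and not taken from the PR or the notebook.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(folder):
    """Return paths whose content hash was already seen earlier in sorted order."""
    seen = {}        # hash -> first path found with that content
    duplicates = []  # later paths with an already-seen hash
    for path in sorted(Path(folder).rglob("*")):
        if not path.is_file():
            continue
        file_hash = md5_of_file(path)
        if file_hash in seen:
            duplicates.append(path)
        else:
            seen[file_hash] = path
    return duplicates
```

This only reports the duplicates; the first file found for each hash is kept, and nothing is deleted.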
Yes, I know how to do it. Add the code to do it, so we can run it periodically with cron.
Talked with @ArjunPanwar2005 about this.
Install Cygwin if you are on Windows. It will give you bash.
Check out this PR: https://github.com/rtp-aws/devpost_aws_disaster_response/pull/11. There are 1,400 duplicates. The notebook will delete them, or I can run it on my end and open a PR that deletes the duplicates.
@adrianxdev
I approved the pull request. I was looking to see if it was something pulled in under an existing license. It looks like it's your original code. /rockandroll
fwiw, here is another
https://towardsdatascience.com/removing-duplicate-or-similar-images-in-python-93d447c1c3eb
I want to have it as a bash script or a straight .py file so I can add it to the cron file. I suppose I can run a notebook, but can you make a .py, please, before we close the issue?
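For illustration, a standalone script along these lines could be called from cron. This is a sketch under assumptions, not the notebook's code: the file name `dedupe_images.py`, the function names, and the `--delete` flag are all hypothetical, and it lists duplicates without deleting unless `--delete` is passed.

```python
#!/usr/bin/env python3
"""Hypothetical dedupe_images.py: list or delete exact-duplicate files."""
import argparse
import hashlib
from pathlib import Path


def file_digest(path):
    """Hex MD5 of the file contents (fine for dedup, not for security)."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def remove_duplicates(folder, delete=False):
    """Keep the first file per hash; return, and optionally delete, the rest."""
    seen = set()
    removed = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        digest = file_digest(path)
        if digest in seen:
            removed.append(path)
            if delete:
                path.unlink()
        else:
            seen.add(digest)
    return removed


def main(argv=None):
    parser = argparse.ArgumentParser(description="Remove exact-duplicate files.")
    parser.add_argument("folder", nargs="?", default=".",
                        help="directory to scan, not recursive (default: current)")
    parser.add_argument("--delete", action="store_true",
                        help="actually delete; without this flag only list")
    args = parser.parse_args(argv)
    for path in remove_duplicates(args.folder, delete=args.delete):
        print(path)


if __name__ == "__main__":
    main()
```

A crontab entry such as `0 3 * * * /usr/bin/python3 /path/to/dedupe_images.py /path/to/camera_uploads --delete` would run it daily at 03:00; both paths are placeholders.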
I got it from here: https://medium.com/@urvisoni/removing-duplicate-images-through-python-23c5fdc7479e with minor tweaks to work in this env.
He has the py here: https://github.com/UrviSoni/remove_duplicate_image/blob/master/duplicate_image_remove.py
Perhaps not needed, but to be pedantic it could be done.