cloud-py-api / mediadc

Nextcloud Media Duplicate Collector application
https://apps.nextcloud.com/apps/mediadc
GNU Affero General Public License v3.0

Any file comparison #40

Closed ykutovoy closed 2 years ago

ykutovoy commented 2 years ago

Dear friends, is it possible to use your hash comparison mechanism to find duplicates of other file types?

For instance, add an "other files" MimeType option; when it is selected, hide the "Similarity threshold" field and set the threshold to 100%.

bigcat88 commented 2 years ago

No, that is not possible. These are not cryptographic hash functions; they are perceptual hashes. There are two links in our wiki about how they work.
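To illustrate the difference, here is a minimal sketch, not taken from MediaDC's code. It assumes a toy "average hash" as the perceptual hash (chosen only for brevity) and a tiny 4x4 grayscale grid standing in for an image; the helper names `average_hash`, `hamming_distance` and `sha256_hex` are made up for this example.

```python
import hashlib


def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming_distance(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def sha256_hex(pixels):
    """Cryptographic hash of the raw pixel bytes (values must be in 0..255)."""
    data = bytes(p for row in pixels for p in row)
    return hashlib.sha256(data).hexdigest()


# Hypothetical 4x4 grayscale "image": dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]
# The same picture, uniformly a little brighter.
brighter = [[p + 3 for p in row] for row in original]

# The cryptographic digests differ because the bytes differ.
crypto_equal = sha256_hex(original) == sha256_hex(brighter)

# The perceptual hashes are compared by distance, not by equality.
perceptual_distance = hamming_distance(average_hash(original), average_hash(brighter))

print("SHA-256 equal:", crypto_equal)
print("Perceptual Hamming distance:", perceptual_distance)
```

A cryptographic hash can only answer "identical bytes or not", while a perceptual hash gives a distance that can be compared against a similarity threshold. That is why the threshold only makes sense for media files.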

It is possible to find similarity between texts or music files using other algorithms, but that would be another app :) We are currently working on a very interesting project that will make it very easy to develop Nextcloud apps with a Python part, unifying the installation and the access to the db/files. Once we release it, it will not be so hard to try developing duplicate collectors for text, music, or any other AI-related task.

ykutovoy commented 2 years ago

It is almost magic to me :) (the links with the math of the hashes)

As I said in another issue, I'm not a programmer, just a user, and cannot do anything without your help ) And even if you create such a cool app that simplifies creating apps, it would still be extremely hard for me :(

But it seems natural to me to give the user the ability to choose between simple crypto hashes and perceptual hashes. Maybe it will add some weight to your app, but from my side it is better to have one app for finding duplicates, with additional tabs/buttons/fields for different file types, than to end up with a bunch of apps: one for images, a second for videos, another for text, and yet another for MD5/SHA-1/SHA-2/SHA-3/BLAKE2. And of course each app would have its own GUI and logic...
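For reference, the exact-duplicate search described here (a "100%" match on any file type) could be sketched roughly as follows. This is only an illustration under assumptions: it is not a MediaDC feature, the function names `file_digest` and `find_exact_duplicates` are hypothetical, and SHA-256 is picked arbitrarily from the hash families listed above.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(root):
    """Group files under `root` whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_exact_duplicates("/some/folder")` returns a list of groups, each group being a list of paths with identical content. Files that differ by even one byte end up in different groups, so there is no similarity threshold to tune.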

andrey18106 commented 2 years ago

Closing as invalid.