eDeveloperOZ / news_il

0 stars 0 forks source link

Suggestion |filtering images , reduce the duplication Imagehash #1

Open vadim566 opened 11 months ago

vadim566 commented 11 months ago

The method explanation

https://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html

Test that has to be done

install pip install imagehash

from PIL import Image
import imagehash

# Load images
image1 = Image.open("path/to/image1.jpg")
image2 = Image.open("path/to/image2.jpg")

# Convert images to grayscale
image1 = image1.convert("L")
image2 = image2.convert("L")

# Generate perceptual hashes
hash1 = imagehash.phash(image1)
hash2 = imagehash.phash(image2)

# Print the hashes
print("Hash 1:", hash1)
print("Hash 2:", hash2)

# Compare the hashes
hamming_distance = hash1 - hash2
print("Hamming Distance:", hamming_distance)
vadim566 commented 11 months ago

in the future , when there is a new image is coming to the stream it will automatically hashed and it will be added to a set with all images hashes, if the array was gained by one so its unique image if not its a indication for duplication