Open Deathproof76 opened 6 months ago
With some AI guidance: maybe something like this in imageDuplicate.py?
import os
import streamlit as st
import time
import torch
import numpy as np
import faiss
from torchvision.models import resnet152, ResNet152_Weights
from torchvision.transforms import Compose, Resize, ToTensor, Normalize
from PIL import Image
from api import getImage, getAssetInfo
from utility import display_asset_column
from db import load_duplicate_pairs, is_db_populated, save_duplicate_pair
from streamlit_image_comparison import image_comparison
# Set the environment variable to allow multiple OpenMP libraries
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
# Load ResNet152 with pretrained weights
model = resnet152(weights=ResNet152_Weights.DEFAULT)
model.eval() # Set model to evaluation mode
def convert_image_to_rgb(image):
    """Convert image to RGB if it's RGBA."""
    if image.mode == 'RGBA':
        return image.convert('RGB')
    return image

transform = Compose([
    convert_image_to_rgb,
    Resize((224, 224)),  # Standard size for ImageNet-trained models
    ToTensor(),
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
...
Update: I simply mounted the modified imageDuplicate.py, and processing now gets further than before. I stopped the process to check for duplicates, and it seems to work so far 👍
Hi, thank you for this project, really cool idea! I built the Docker image and everything went fine up until 4000 of 40000 assets.
(btw: maybe move the databases to a separate folder for simpler mounting like https://github.com/vale46n1/immich_duplicate_finder/compare/main...ttlequals0:immich_duplicate_finder:main )
logs up until erroring out:
I'm not a programmer, but maybe it's related to an image with an alpha channel (RGBA instead of RGB)? https://github.com/StoryMY/take-off-eyeglasses/issues/1#issuecomment-1122144903 https://github.com/christiansafka/img2vec/issues/31 https://github.com/christiansafka/img2vec/pull/37/commits/1cc7e273cc23822765be34aac90775b0fbb31252
Edit: I read a bit on the web, and it seems that ResNet is only trained on 3-channel (RGB) images, not 4-channel RGBA, so images would most likely have to be converted to RGB before comparison.
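To make the channel mismatch concrete, here is a minimal sketch that uses only Pillow. The helper name `ensure_rgb` is hypothetical and not part of the project. It assumes that any non-RGB mode (RGBA, grayscale, palette, CMYK) should be converted, not just RGBA, since those modes also do not have exactly three channels.

```python
from PIL import Image


def ensure_rgb(image: Image.Image) -> Image.Image:
    """Return a 3-channel RGB image (hypothetical helper, not from the project).

    Assumption: every non-RGB mode should be converted, because a model
    expecting 3 input channels cannot take 1-channel or 4-channel data.
    """
    if image.mode != "RGB":
        # convert("RGB") drops the alpha channel; it does not blend
        # transparent pixels onto a background colour.
        return image.convert("RGB")
    return image


if __name__ == "__main__":
    # Show how many channels each mode has before and after conversion.
    for mode in ("RGBA", "L", "P", "CMYK", "RGB"):
        sample = Image.new(mode, (8, 8))
        converted = ensure_rgb(sample)
        print(mode, len(sample.getbands()), "->",
              converted.mode, len(converted.getbands()))
```

A function like this could probably take the place of `convert_image_to_rgb` as the first step of the `Compose` pipeline, but I have not tested that against the full asset library.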