serengil / deepface

A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
https://www.youtube.com/watch?v=WnUVYQP4h44&list=PLsS_1RYmYQQFdWqxQggXHynP1rqaYXv_E&index=1
MIT License

Manipulating representations.pkl file #528

Closed falkaabi closed 2 years ago

falkaabi commented 2 years ago

I thought I'd share this as a contribution.

@serengil You could probably add some of these functions in your code and use them elsewhere as I've noticed you build representations in multiple different ways in your code.

Here's something you could probably use:

  1. Have a function to load representations.
import os
import pickle

# DB_PATH and file_name are module-level settings defined elsewhere in the script
def load_representations(db_path=DB_PATH, file_name=file_name):
    ''' loads face representations from a file'''
    with open(os.path.join(db_path, file_name), 'rb') as f:
        representations = pickle.load(f)
    return representations
  1. Another function to build representations if they don't exist

from tqdm import tqdm
from deepface import DeepFace

def build_representations(model=None, db_path=DB_PATH, file_name=file_name, model_name=MODEL_NAME, distance_metric=METRIC, enforce_detection=True, detector_backend=BACKEND, align=True, normalization=NORMALIZATION, prog_bar=True):
    ''' builds representations from images in db_path and stores them in file_name'''
    # build Ensemble model instead of just using 1 model
    #--------------------------------
    if model_name == 'Ensemble':
        model_names = ["VGG-Face", "Facenet", "OpenFace", "DeepFace"]
        metric_names = ["cosine", "euclidean", "euclidean_l2"]
    else:
        model_names = [model_name]
        metric_names = [distance_metric]
    #--------------------------------
    # Builds model if it was not passed and loads it once to be used for building all representations
    if model is None:
        if model_name == 'Ensemble':
            LOGGER.info("Ensemble learning enabled")
            models = DeepFace.Boosting.loadModel()
        else: # model is not ensemble
            model = DeepFace.build_model(model_name)
            models = {model_name: model}
    else: # model is not None
        LOGGER.info("Already built model is passed")
        if model_name == 'Ensemble':
            DeepFace.Boosting.validate_model(model)
            models = model.copy()
        else:
            models = {model_name: model}
    #------------------------------

    # get image paths from db_path and store them into employees list
    employees = get_images_from_path(path=db_path)

    # find representations for db images
    representations = []
    # tqdm hides the bar when disable is True, so negate prog_bar
    pbar = tqdm(range(len(employees)), desc='Finding representations', disable=not prog_bar)

    for index in pbar:
        employee = employees[index]
        pbar.set_description(f'Finding representations for {os.path.basename(employee)}')
        # each instance is [image_path, representation_1, representation_2, ...]
        instance = [employee]

        for j in model_names:
            custom_model = models[j]

            # pass the current model's name (j), not 'Ensemble'
            representation = DeepFace.represent(img_path=employee
                                       , model_name=j, model=custom_model
                                       , enforce_detection=enforce_detection, detector_backend=detector_backend
                                       , align=align
                                       , normalization=normalization
                                       )

            instance.append(representation)

        # -------------------------------

        representations.append(instance)

    with open(os.path.join(db_path, file_name), "wb") as f:
        pickle.dump(representations, f)
    LOGGER.info(f"Representations stored in {os.path.join(db_path, file_name)} file. Please delete this file when you add new identities in your database.")
    return representations
  2. Then use a function to decide whether to build the representations or load them from the file

def get_representations(model=None, db_path=DB_PATH, file_name=file_name, model_name=MODEL_NAME, distance_metric=METRIC, enforce_detection=True, detector_backend=BACKEND, align=True, normalization=NORMALIZATION, prog_bar=True, silent=False):
    ''' loads representations from file_name if it exists, otherwise builds and stores them'''
    if os.path.exists(os.path.join(db_path, file_name)):
        if not silent:
            LOGGER.warning(f"Representations for images in [{db_path}] folder were previously stored in [{file_name}]. If you added new instances after this file creation, then please delete this file and call find function again. It will create it again.")
        representations = load_representations(db_path=db_path, file_name=file_name)

    else:  # build representations from scratch and store them in file_name
        LOGGER.info(f'Building representations for images in {db_path}')
        representations = build_representations(model=model, db_path=db_path, file_name=file_name, model_name=model_name, distance_metric=distance_metric, enforce_detection=enforce_detection, detector_backend=detector_backend, align=align, normalization=normalization, prog_bar=prog_bar)

    LOGGER.info(f"There are ({len(representations)}) representations found in [{file_name}].")
    return representations
  3. Also, you could probably change how you get images from the database folder using this (It's easier to add more image file extensions this way):

def get_images_from_path(path=DB_PATH):
    ''' returns the paths of all images found under path'''
    image_paths = []
    valid_image_extensions = ['.jpg', '.jpeg', '.png'] # you could add other valid image extensions here!
    # walk the path argument rather than the DB_PATH global
    for root, _dirs, files in os.walk(path):
        for file in files:
            ext = os.path.splitext(file)[1]
            if ext.lower() in valid_image_extensions:
                exact_path = os.path.join(root, file)
                image_paths.append(exact_path)

    if not image_paths:
        raise ValueError(f"There is no image in {path} folder! Validate .jpg or .jpeg or .png files exist in this path.")
    return image_paths
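
The log messages above ask you to delete the pickle file whenever identities are added. As a rough sketch of an alternative, the cached file could be synced with the folder instead: reuse the rows whose image still exists, embed only the new images, and drop rows for deleted images. This assumes each cached row starts with the image path, as in build_representations. The names `sync_representations` and `embed` are hypothetical and not part of deepface; `embed` is a stand-in for a call such as DeepFace.represent.

```python
import os
import pickle

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def sync_representations(db_path, file_name, embed):
    """Bring the pickled representations in line with the images on disk.

    `embed` is a hypothetical callable (image path -> embedding) standing in
    for the real model call. Returns (representations, n_added, n_removed).
    """
    pkl_path = os.path.join(db_path, file_name)

    # Load the cached rows if the file exists, otherwise start empty.
    cached = []
    if os.path.exists(pkl_path):
        with open(pkl_path, "rb") as f:
            cached = pickle.load(f)

    # Collect the image paths currently on disk.
    on_disk = set()
    for root, _dirs, files in os.walk(db_path):
        for name in files:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                on_disk.add(os.path.join(root, name))

    # Each row is [image_path, embedding, ...]; keep rows whose image still exists.
    kept = [row for row in cached if row[0] in on_disk]
    known = {row[0] for row in kept}

    # Embed only the images that have no cached row yet.
    added = [[path, embed(path)] for path in sorted(on_disk - known)]

    representations = kept + added
    with open(pkl_path, "wb") as f:
        pickle.dump(representations, f)
    return representations, len(added), len(cached) - len(kept)
```

Calling it twice on an unchanged folder embeds nothing the second time, so only new images pay the cost of running the model. Note that it does not detect an image that was overwritten in place under the same name.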

I have used os.path.join() instead of concatenating strings, since this is more portable when the code runs on other operating systems.
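
A small illustration of the difference, using made-up folder and file names:

```python
import os

root = os.path.join("db", "alice")

# Hard-coded separator: produces mixed "\" and "/" separators on Windows.
concatenated = root + "/" + "img1.jpg"

# os.path.join inserts the separator of the platform the code runs on.
joined = os.path.join(root, "img1.jpg")

print(concatenated)
print(joined)
```

On Linux and macOS both lines print the same path; the two only differ on platforms whose separator is not "/".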

All these could be related to #516 enhancements

LOANPIA commented 2 years ago

In addition, I put pickle to good use in my lab in the past, but after several attempts and some reading I have come to realize that a pickle file is only suited to storing a small amount of data. I don't think it is the best approach. Let me share what I said about this elsewhere:

OK, a lovely notification brought me back to comment there. My previous approach didn't seem right, and Adam, the author of face_recognition, changed my mind: I should keep the face embeddings in a database instead. I'm learning how to add database support for deepface without JSON on Windows, with PostgreSQL in particular. May I ask if Sefik's DeepFace works the same way as other frameworks? Please check the map in the image below.
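
For readers following along, here is a minimal sketch of keeping embeddings in a database rather than in a pickle file. It uses Python's built-in sqlite3 as a stand-in for PostgreSQL so that it runs without a server. The table layout and the function names `save_embeddings` and `load_embeddings` are assumptions made for illustration, not something deepface provides, and each row is assumed to be a single [image_path, embedding] pair.

```python
import sqlite3
from array import array


def save_embeddings(conn, representations):
    """Store [image_path, embedding] rows in a table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS embeddings ("
        "img_path TEXT PRIMARY KEY, embedding BLOB NOT NULL)"
    )
    # Pack each embedding as raw 64-bit floats so it round-trips exactly.
    rows = [(path, array("d", emb).tobytes()) for path, emb in representations]
    # INSERT OR REPLACE overwrites the row when the same image is stored again.
    conn.executemany("INSERT OR REPLACE INTO embeddings VALUES (?, ?)", rows)
    conn.commit()


def load_embeddings(conn):
    """Read every row back as [image_path, embedding], ordered by path."""
    cursor = conn.execute(
        "SELECT img_path, embedding FROM embeddings ORDER BY img_path"
    )
    result = []
    for path, blob in cursor:
        values = array("d")
        values.frombytes(blob)
        result.append([path, list(values)])
    return result
```

Unlike a single pickle file, rows can be added or replaced one image at a time, so nothing has to be deleted and rebuilt when a new identity is added.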

serengil commented 2 years ago

This issue will be followed under #516.