TencentARC / GFPGAN

GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.

Image blending problem while caching the gfpgan model #519

Open dummyuser-123 opened 9 months ago

dummyuser-123 commented 9 months ago

I have built an API around Real-ESRGAN using FastAPI, and it handles requests from multiple users correctly. However, when I load the models (Real-ESRGAN and GFPGAN) once and cache them with lru_cache (functools) to reduce inference time, I run into the following two errors during execution.

1. Sometimes the faces from one user's request get mixed into the output of another user's request.

image

2. Some requests fail with the following error:

Traceback (most recent call last):
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/starlette/routing.py", line 758, in __call__
    await self.middleware_stack(scope, receive, send)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/starlette/routing.py", line 778, in app
    await route.handle(scope, receive, send)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/starlette/routing.py", line 299, in handle
    await self.app(scope, receive, send)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/starlette/routing.py", line 79, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/starlette/routing.py", line 74, in app
    response = await func(request)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/fastapi/routing.py", line 299, in app
    raise e
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/fastapi/routing.py", line 294, in app
    raw_response = await run_endpoint_function(
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/fastapi/routing.py", line 193, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/starlette/concurrency.py", line 42, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/anyio/_backends/_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/anyio/_backends/_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/api.py", line 102, in process_image
    intermediate_image = hd_process(img_array)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/api.py", line 58, in hd_process
    _, _, output = face_enhancer.enhance(img_array, has_aligned=False, only_center_face=False, paste_back=True)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/gfpgan/utils.py", line 144, in enhance
    restored_img = self.face_helper.paste_faces_to_input_image(upsample_img=bg_img)
  File "D:/Image Super Resolution/Models/Real-ESRGAN/env/lib/site-packages/facexlib/utils/face_restoration_helper.py", line 291, in paste_faces_to_input_image
    assert len(self.restored_faces) == len(self.inverse_affine_matrices), ('length of restored_faces and affine_matrices are different.')
AssertionError: length of restored_faces and affine_matrices are different.

This is the small code snippet from my api:

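import os
from functools import lru_cache

import cv2
import torch
from basicsr.archs.rrdbnet_arch import RRDBNet
from gfpgan import GFPGANer
from realesrgan import RealESRGANer

@lru_cache()
def loading_model():
    # Built on the first call only; every later call, from any worker thread, gets this same GFPGANer instance.
    real_esrgan_model_path = "D:/Image Super Resolution/Models/Real-ESRGAN/weights/RealESRGAN_x4plus.pth"
    gfpgan_model_path = "D:/Image Super Resolution/Models/Real-ESRGAN/env/Lib/site-packages/gfpgan/weights/GFPGANv1.3.pth"

    # Real-ESRGAN x4 network, passed to GFPGAN as the background upsampler.
    model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
    netscale = 4

    upsampler = RealESRGANer(scale=netscale, model_path=real_esrgan_model_path, dni_weight=0.5, model=model, tile=0, tile_pad=10, pre_pad=0, half=False)
    face_enhancer = GFPGANer(model_path=gfpgan_model_path, upscale=4, arch='clean', channel_multiplier=2, bg_upsampler=upsampler)

    return face_enhancer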

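def hd_process(file):

    # Save the upload to disk under its client-supplied name (two uploads with the same name overwrite each other).
    filename = file.filename.split('.')[0]
    save_path = os.path.join("temp_images", f"{filename}.jpg")

    content = file.file.read()
    with open(save_path, 'wb') as image_file:
        image_file.write(content)

    # Read the saved file back as an OpenCV (BGR) array.
    img_array = cv2.imread(save_path, cv2.IMREAD_UNCHANGED)

    # Returns the shared, cached enhancer.
    face_enhancer = loading_model()

    with torch.no_grad():
        _, _, output = face_enhancer.enhance(img_array, has_aligned=False, only_center_face=False, paste_back=True)

    output_rgb = cv2.cvtColor(output, cv2.COLOR_BGR2RGB)

    # Drops only the local reference; the cached model itself stays loaded.
    del face_enhancer
    torch.cuda.empty_cache()

    return output_rgb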

So, when I went through the GFPGAN code, I found that GFPGANer has an "enhance" function that calls the "facexlib" library for face enhancement and other face-related operations. The "enhance" function clears all the list variables of "facexlib" after every execution by reinitializing them. The errors above only occur when I load the model into the cache; without caching, everything works properly. Is there any way to cache the model and still avoid these errors?
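One way to keep the cached model while avoiding the mix-up is to serialize access to it with a lock. The sketch below is a minimal illustration under stated assumptions, not the GFPGAN API: `ToyEnhancer` is a hypothetical stand-in for any cached object that keeps per-call scratch lists on the instance, and `load_model`, `enhance_serialized`, and `_model_lock` are illustrative names. The trade-off is that requests reach the model one at a time.

```python
import threading
from functools import lru_cache


class ToyEnhancer:
    """Hypothetical stand-in for a cached model object that keeps
    per-call scratch lists on the instance."""

    def __init__(self):
        self.restored_faces = []

    def enhance(self, faces):
        # Reset the shared scratch list, then fill it for this call.
        self.restored_faces = []
        for face in faces:
            self.restored_faces.append(face.upper())
        return list(self.restored_faces)


@lru_cache(maxsize=1)
def load_model():
    # Built once; every caller receives the same instance.
    return ToyEnhancer()


# One lock guarding the single shared instance.
_model_lock = threading.Lock()


def enhance_serialized(faces):
    # Only one thread at a time may touch the shared instance,
    # so its scratch lists never hold data from two requests.
    with _model_lock:
        return load_model().enhance(faces)
```

In a FastAPI endpoint, the call to the real enhancer would take the place of `load_model().enhance(faces)` inside the `with` block; whether the resulting loss of parallelism is acceptable depends on the workload.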

lianarosee commented 9 months ago

20240221_225128_058-01-03

ssskhan commented 8 months ago

Well, you can move the following lists from facexlib/utils/face_restoration_helper.py into the enhance function in gfpgan/utils.py. That solves the problem because every request then has its own lists, so they won't get mixed up again.

self.all_landmarks_5 = []            
self.det_faces = []                  
self.affine_matrices = []            
self.inverse_affine_matrices = []    
self.cropped_faces = []              
self.restored_faces = []             
self.pad_input_imgs = []      

But the problem I get after doing this is that the output quality is low when processing concurrent requests. Any ideas?
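The idea of giving every request its own lists can be illustrated with a toy example. This is a sketch only: `RequestState` and `enhance_with_local_state` are hypothetical names, the string operations stand in for the real detection and restoration steps, and only three of the lists above are mirrored.

```python
from dataclasses import dataclass, field


@dataclass
class RequestState:
    """Hypothetical per-request container mirroring a few of the lists above."""
    det_faces: list = field(default_factory=list)
    cropped_faces: list = field(default_factory=list)
    restored_faces: list = field(default_factory=list)


def enhance_with_local_state(image_faces):
    # A fresh state object is created on every call. Nothing is stored on a
    # shared instance, so concurrent calls cannot append to each other's lists.
    state = RequestState()
    for face in image_faces:
        state.det_faces.append(face)
        state.cropped_faces.append(f"crop({face})")
        state.restored_faces.append(f"restored({face})")
    return state.restored_faces
```

Because the state lives in a local variable, each thread's call works on its own object even when the function itself is shared.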

dummyuser-123 commented 8 months ago

Thanks for the answer. I have already implemented this logic in my code, and I am not seeing any quality issues during concurrent requests. If possible, can you briefly tell me which API framework you are using and how you handle concurrent requests in it, so that I can get a better idea of the problem?

ssskhan commented 8 months ago

I am using waitress. The issue is that when I send 2 requests at the same time, the first works just fine but the output for the second image is not good.

ssskhan commented 8 months ago

Check the difference between the two images: the first is the result when the image was sent simultaneously with another image, and the second is the result when I sent it alone.

image

image

dummyuser-123 commented 8 months ago

First of all, I have never used waitress for API development, but here are some general points you can check:

  1. Compare the input image and the output image (the first case above) in terms of size and dimensions, so you can cross-check whether the enhancement is actually being applied to the image.
  2. From your description, I think waitress is not handling multiple requests properly, meaning something goes wrong when the requests are processed in parallel.
  3. Test with a variety of images (for example a single-person image or a small image); the outputs might give you a clue.
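The first check can be automated with a small helper. This is a sketch under assumptions: `is_expected_upscale` is a hypothetical name, the default `scale=4` is taken from the `upscale=4` setting earlier in this thread and may differ in other setups, and the output is assumed to be exactly `scale` times the input in height and width.

```python
def is_expected_upscale(input_shape, output_shape, scale=4):
    """Return True if the output height and width equal the input
    height and width multiplied by scale.

    Shapes are (height, width, ...) tuples, such as the .shape of an
    image array loaded with OpenCV.
    """
    in_h, in_w = input_shape[:2]
    out_h, out_w = output_shape[:2]
    return (out_h, out_w) == (in_h * scale, in_w * scale)
```

For example, `is_expected_upscale(img_array.shape, output.shape)` would return False if a request came back at the original resolution.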

dummyuser-123 commented 8 months ago

Also, have you ever used FastAPI to build an API? I used FastAPI for this model, but I am not able to achieve parallelism for multiple users and I don't know why. Do you have any idea about this problem?

ssskhan commented 8 months ago

Nope, sorry, I have never tried FastAPI. When I send requests from a single device it works perfectly, no matter how many faces are in the image. The problem only occurs when it is working on multiple images at the same time. If I send 2 concurrent images with 1 face each, it works far better. Kindly try 2 concurrent requests with multiple faces in the images, and make sure they are different images. How did you find out it is not running concurrently? From the time it took, or something else?
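One rough way to answer the timing question is to compare the wall-clock time of several simultaneous calls with the time of a single call. The sketch below is an illustration, not a benchmark: `concurrency_ratio` is a hypothetical helper, and `task` stands for whatever sends one request to the API.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def concurrency_ratio(task, n_requests=4):
    """Return (wall time of n_requests simultaneous calls) / (time of one call).

    A ratio close to 1 suggests the calls overlap; a ratio close to
    n_requests suggests they are processed one after another.
    """
    # Baseline: a single call on its own.
    start = time.perf_counter()
    task()
    single = time.perf_counter() - start

    # The same call issued n_requests times from parallel threads.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_requests) as pool:
        list(pool.map(lambda _: task(), range(n_requests)))
    wall = time.perf_counter() - start

    return wall / single
```

The numbers are only indicative, since the first call may include warm-up cost and GPU work from different threads does not necessarily overlap.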