jantic / DeOldify

A Deep Learning based project for colorizing and restoring old images (and video!)
MIT License

Uneven color distribution in stable and artistic model outputs #503

Closed · dummyuser-123 closed this issue 3 months ago

dummyuser-123 commented 3 months ago

Hi, first of all, hats off to this amazing deep learning model. It generates fantastic results in most cases across a wide range of image categories. While trying the artistic and stable models on different categories of images, I noticed uneven color distribution in parts of the output from both models.

  1. Artistic Model

[Attached: nine artistic model outputs (artistic_model_35_*)]

  2. Stable Model

[Attached: nine stable model outputs (stable_model_10_*)]

In this artistic vs. stable model comparison, you can see that the artistic model gives better results on human images than on other categories, while the stable model does better on the other categories than on human images. Specifically, the stable model produces red patches around the lips and teeth, which is the only real problem in those outputs. So, is there any way I can get good results across all categories of images, human or otherwise? In other words, I am looking for a good balance across every category.

I have gone through the whole README and the issue tracker for this problem, but I have not found a solution. So I was thinking of fine-tuning the artistic model on the FFHQ dataset. Would that be a good way to get better results on human images, or am I missing some detail during inference?
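
For reference, here is roughly how I am running inference; this is just a minimal sketch using the standard `deoldify.visualize` helpers, and the image path and `render_factor` values are my own test settings:

```python
from deoldify import device
from deoldify.device_id import DeviceId

# Select the first GPU (use DeviceId.CPU if no GPU is available)
device.set(device=DeviceId.GPU0)

from deoldify.visualize import get_image_colorizer

# artistic=True loads the artistic weights, artistic=False the stable ones
colorizer = get_image_colorizer(artistic=True)

# 'test_images/portrait.jpg' and render_factor=35 are just my example values;
# compare=True also displays the source image next to the colorized result
result_path = colorizer.plot_transformed_image(
    path='test_images/portrait.jpg',
    render_factor=35,
    compare=True,
)
```

Am I supposed to tune `render_factor` per image category, or is there some other step I am missing here?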

@jantic it would be great if you could share some ideas or a possible solution for this problem.

jantic commented 3 months ago

As far as datasets that would help with good results across all categories of images, I'd say Google's Open Images dataset ( https://storage.googleapis.com/openimages/web/index.html ) is probably your best bet. It's huge compared to what DeOldify was originally trained on, and it's much more diverse. But the problems go beyond just data. I moved on from this particular training approach years ago, but if you were to improve upon it, my guess is that increasing the batch size would go a long way. I had very limited compute resources when training this (a 1080 Ti GPU with 11 GB of VRAM!). A lot has changed in the past 5 years!
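
Just to make the batch size point concrete, here is a generic PyTorch sketch with made-up tensors; it is not the actual DeOldify/fastai training pipeline, only an illustration of the knob I mean:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: grayscale inputs and color targets (purely illustrative)
fake_gray = torch.rand(256, 1, 192, 192)
fake_color = torch.rand(256, 3, 192, 192)
train_dataset = TensorDataset(fake_gray, fake_color)

# On the 11 GB 1080 Ti the batch size had to stay small; with more VRAM
# (or multiple GPUs) you can push batch_size much higher, which is the
# main change I'd expect to help.
train_loader = DataLoader(
    train_dataset,
    batch_size=64,  # hypothetical value; raise it until your GPU memory is full
    shuffle=True,
    num_workers=8,
    pin_memory=True,
)
```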

Since it has been 5 years, there are definitely better approaches out there, too. I've seen diffusion-based work, and I'd say that's probably the way to go now ( https://palette.fm ).