yinboc / liif

Learning Continuous Image Representation with Local Implicit Image Function, in CVPR 2021 (Oral)
https://yinboc.github.io/liif/
BSD 3-Clause "New" or "Revised" License

Extension to RGBA as well as RGB #12

Closed (lmmx closed 3 years ago)

lmmx commented 3 years ago

Hi there, I've been using LIIF on emoji glyphs and got some great results. However, I'd like to recover the transparency, which I had to remove* by simple alpha compositing (i.e. flattening the image) before passing the PNG inputs to LIIF.

* Each image was flattened onto a uniform grayscale background. The tone was chosen from the grayscale tones not present in any semitransparent pixel, taking the one with the greatest Euclidean distance from the median of the per-pixel mean in the image.
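For reference, the flattening step itself can be sketched as below. This is a minimal sketch, assuming straight (non-premultiplied) alpha and float values in [0, 1]; `flatten_rgba` is a hypothetical helper name and not part of LIIF, and the selection of the background tone described in the footnote is not shown.

```python
import numpy as np

def flatten_rgba(rgba: np.ndarray, background: float) -> np.ndarray:
    """Composite an (H, W, 4) RGBA image over a uniform grey tone.

    Assumes straight alpha and float values in [0, 1].
    Returns an (H, W, 3) RGB image.
    """
    rgb = rgba[..., :3]
    alpha = rgba[..., 3:4]  # keep the trailing axis so it broadcasts over RGB
    return alpha * rgb + (1.0 - alpha) * background

# Toy 1x2 image: an opaque red pixel and a half-transparent white pixel.
rgba = np.array([[[1.0, 0.0, 0.0, 1.0],
                  [1.0, 1.0, 1.0, 0.5]]])
flat = flatten_rgba(rgba, background=0.0)  # flatten against black
```

Once flattened, the alpha value and the foreground colour are mixed into a single RGB value per pixel, which is why the transparency has to be estimated afterwards rather than read back directly.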

I tried to "supervise" the estimation of transparency, but it was only a rough estimate, and the results it gives are not satisfactory (despite the high quality obtained from LIIF).

This subsection of an emoji glyph was flattened against a black background, then run through LIIF. The bottom-right plot shows the recovered transparency (RGBA image) flattened against a different background colour (white).

I was wondering if you think the code could be modified in some way for this, to supervise an estimate of the alpha channel?

It seems like it should be possible, but it's unclear to me how I might implement it. Any advice would be appreciated.

yinboc commented 3 years ago

Hi, thanks for your interest in our work! LIIF on emoji is really an interesting project.

I am not very familiar with the processing of the alpha channel. Since the model is only trained on RGB images, the alpha channel makes this an out-of-distribution task. If there is a large number of LR-HR pairs of images with an alpha channel, a straightforward method is to modify the code to work on 4-channel images and train a 4-channel SR model. Since all the code assumes 3 channels, many modifications may be necessary, such as in the encoder model part and the data normalization part.
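As an illustration of the data normalization part only, a 4-channel version might look like the sketch below. This is a minimal NumPy sketch under stated assumptions, not the repository's code: the names `N_CHANNELS`, `SUB`, `DIV`, `normalize` and `denormalize` are hypothetical, and the statistics (0.5 for every channel, alpha included) are placeholder values. The encoder's input channels and the model's output dimension would presumably also need to change from 3 to 4, which is not shown here.

```python
import numpy as np

N_CHANNELS = 4  # RGBA instead of RGB (assumption for this sketch)

# Placeholder statistics: map [0, 1] to [-1, 1] on every channel, alpha included.
# Shaped (1, C, 1, 1) so they broadcast over an (N, C, H, W) batch.
SUB = np.full(N_CHANNELS, 0.5).reshape(1, -1, 1, 1)
DIV = np.full(N_CHANNELS, 0.5).reshape(1, -1, 1, 1)

def normalize(batch: np.ndarray) -> np.ndarray:
    """Normalize an (N, C, H, W) batch of floats in [0, 1]."""
    if batch.ndim != 4 or batch.shape[1] != N_CHANNELS:
        raise ValueError(f"expected (N, {N_CHANNELS}, H, W), got {batch.shape}")
    return (batch - SUB) / DIV

def denormalize(pred: np.ndarray) -> np.ndarray:
    """Invert normalize() and clamp to the valid [0, 1] range."""
    return np.clip(pred * DIV + SUB, 0.0, 1.0)

# Round trip on a random RGBA batch.
batch = np.random.default_rng(0).random((2, N_CHANNELS, 8, 8))
restored = denormalize(normalize(batch))
```

The explicit shape check is there because a leftover 3-channel assumption elsewhere in a pipeline tends to fail silently through broadcasting or loudly far from its cause; failing early at the normalization step makes such mismatches easier to find.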

lmmx commented 3 years ago

No problem, thanks for the response! 😄

JerryX1110 commented 3 years ago

> No problem, thanks for the response! 😄

Have you tried this later? I am curious about the result.

lmmx commented 3 years ago

@JerryX1110 I am preparing a dataset of images with transparency here. I expect it will be ready soon, and it will contain many large images with a range of alpha values to train a network on.

I just found out there is a recent paper on joint implicit image functions, “JIIF” (code), which modifies LIIF for the task of upsampling a low-resolution depth image alongside the RGB image. I think this is comparable to the task I proposed (upsampling an RGBA image). I’m curious to see how this goes too!