lfranke / TRIPS

https://lfranke.github.io/trips/
MIT License

Possible typo in Eq. (5) of the paper? #2

Closed (deepshwang closed this 4 months ago)

deepshwang commented 5 months ago

First of all, thanks for the great work!

I was reading through your paper and did not quite follow Eq. (5).

From my understanding, each layer of the pyramid is defined on a log scale of s: layer L corresponds to s = 2^{L}, where L is the integer layer index.

From that, I thought Eq. (5) for the condition s > 1 would be 1 - |log2(s) - log2(s_i)|. This makes sense because when s is larger than 2^{L_i + 1}, the expression yields a negative opacity, which can serve as a flag to ignore that pyramid layer.
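As a quick check of this proposed expression, here is a minimal Python sketch. It assumes s_i = 2^{L_i}, as described above. The function name `log_scale_layer_weight` and the parameter `layer_index` are hypothetical and not taken from the TRIPS code base.

```python
import math


def log_scale_layer_weight(s: float, layer_index: int) -> float:
    """Return 1 - |log2(s) - log2(s_i)|, assuming s_i = 2**layer_index.

    A negative result means s is more than one octave away from the
    layer, which could be used as a flag to skip that layer.
    """
    if s <= 0:
        raise ValueError("s must be positive")
    # log2(s_i) equals layer_index under the assumption s_i = 2**layer_index.
    return 1.0 - abs(math.log2(s) - layer_index)


if __name__ == "__main__":
    for s in (4.0, 6.0, 8.0, 20.0):
        weights = {L: round(log_scale_layer_weight(s, L), 3) for L in range(2, 5)}
        print(f"s = {s}: {weights}")
```

Under this formulation the weights of the two layers enclosing s sum to 1, and any layer with s > 2^{L_i + 1} gets a negative value.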

Can you please elaborate on this?

Again, thank you for the great work!

lfranke commented 4 months ago

Yes, thank you! There is a typo in this equation. There are two ways to formulate it: either what you proposed (taking the log of the values) or dividing by the pixel range of the layers. We tested both, and the second provides slightly better results for us (about 0.01 in LPIPS or 0.1 dB in PSNR). I will update the paper soon; the formula should look like this:

[image: corrected Eq. (5)]
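The corrected formula is only available as an image in this thread, so the exact expression is not reproduced here. The sketch below shows only one possible reading of "dividing by the pixel range of the layers": a linear weight in s, normalized by the distance in pixels to the neighboring layer. It is an assumption for illustration, not the paper's equation. The name `linear_layer_weight` and the choice s_i = 2^{L_i} are hypothetical.

```python
def linear_layer_weight(s: float, layer_index: int) -> float:
    """Hypothetical linear weight: 1 - |s - s_i| / (pixel range to neighbor).

    Assumes s_i = 2**layer_index. This is NOT confirmed to be the
    corrected Eq. (5); it only illustrates normalizing by a pixel range
    instead of taking logarithms.
    """
    s_i = 2.0 ** layer_index
    if s >= s_i:
        # Neighboring coarser layer sits at 2 * s_i, so the range is s_i.
        pixel_range = s_i
    else:
        # Neighboring finer layer sits at s_i / 2, so the range is s_i / 2.
        pixel_range = s_i / 2.0
    return 1.0 - abs(s - s_i) / pixel_range


if __name__ == "__main__":
    for s in (4.0, 6.0, 8.0, 20.0):
        weights = {L: round(linear_layer_weight(s, L), 3) for L in range(2, 5)}
        print(f"s = {s}: {weights}")
```

As with a log-based weight, this variant gives the two layers enclosing s weights that sum to 1 and goes negative beyond the neighboring layer. The two differ in how the weight is distributed between those layers.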

deepshwang commented 4 months ago

Thank you! This is clear enough for me.