Justin-Tan / high-fidelity-generative-compression

PyTorch implementation of High-Fidelity Generative Image Compression + routines for neural image compression

Large overhead for vectorized ANS encoding #31

Closed xjh19971 closed 3 years ago

xjh19971 commented 3 years ago

Hi,

Thank you for this wonderful repo! I have a question about vectorized ANS encoding.

My images are 256x256. With plain ANS encoding, the compressed file is about 1 KB, but after switching to vectorized ANS encoding it grows to about 6 KB, which seems like a large overhead.

Do you have any idea why this happens? Are there any relevant parameters I need to fine-tune? Thank you again!

Justin-Tan commented 3 years ago

Hi,

IIRC, rANS incurs a constant overhead when encoding that becomes negligible for long sequences. Vectorized rANS runs many interleaved streams and pays one copy of this constant overhead per stream, so it is even more wasteful (see the discussion in Section 2 here [1] and this paper [2]). Again, this becomes negligible for long sequences, but not when working with small file sizes, which is what you seem to be doing. I was planning to fix this but unfortunately got distracted.
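
For intuition, here is a hedged back-of-the-envelope estimate; the state size and lane count below are illustrative assumptions, not values measured from this repo's coder. The point is that each rANS stream must flush its final coder state when encoding ends, so a coder with N interleaved lanes pays roughly N times the flush cost of a scalar one:

```python
# Illustrative only: the state size and lane counts are assumptions,
# not values taken from this repository's entropy coder.
STATE_BYTES = 8  # assume a 64-bit rANS state flushed per stream at the end

def flush_overhead(num_lanes: int, state_bytes: int = STATE_BYTES) -> int:
    """Constant overhead (bytes) from flushing the final state of every lane."""
    return num_lanes * state_bytes

print(flush_overhead(1))    # scalar rANS: ~8 B, invisible at any file size
print(flush_overhead(640))  # hypothetical 640-lane vectorized coder: ~5 KB,
                            # enough to swamp a ~1 KB payload
```

With numbers in that ballpark, a jump from ~1 KB to ~6 KB would be consistent with per-lane flush costs rather than anything wrong with the entropy model.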

If fast, efficient compression is important for you, I think your best bet is to write an arithmetic coder that uses the PDFs/CDFs learnt in this repository. Fabian Mentzer has a cool implementation in his torchac repo [3].
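
For reference, here is a minimal sketch of driving torchac with a float CDF. The uniform CDF and tensor shapes are placeholders; in practice you would evaluate the per-element CDFs learned by this repo's hyperprior instead:

```python
import torch
import torchac  # pip install torchac

L = 256  # number of discrete symbol values (placeholder)
# Stand-in for quantized latents; torchac expects int16 symbols.
sym = torch.randint(0, L, (1, 8, 16, 16), dtype=torch.int16)

# torchac wants a CDF with one extra entry per symbol: shape (..., L + 1),
# values in [0, 1], non-decreasing along the last dim. A uniform CDF is
# used here purely as a placeholder.
cdf = torch.linspace(0.0, 1.0, L + 1).expand(*sym.shape, L + 1).contiguous()

byte_stream = torchac.encode_float_cdf(cdf, sym, check_input_bounds=True)
print(f"compressed size: {len(byte_stream)} bytes")

decoded = torchac.decode_float_cdf(cdf, byte_stream)
assert decoded.equal(sym)
```

Unlike interleaved rANS lanes, an arithmetic coder like this has essentially no per-stream flush cost, so file sizes should track the learned entropy even for small images.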

xjh19971 commented 3 years ago

Thank you! That makes sense.

GimmeSomeJazz commented 1 year ago

Hey @xjh19971, did you manage to use ANS encoding (from this repo or the one @Justin-Tan mentioned)? Thank you all!