Orange-OpenSource / Cool-Chic

Low-complexity neural image & video codec.
https://orange-opensource.github.io/Cool-Chic/
BSD 3-Clause "New" or "Revised" License

Regarding decoding, have you considered using a GPU? #8

Closed by ReBenDish 2 months ago

ReBenDish commented 3 months ago

We greatly appreciate your company's work, but decoding still takes around 800 ms for a 720p image on a Mac or on other devices without AVX2. That is somewhat slow, especially for deployment on mobile devices. Have you considered using a GPU for decoding?

theoladune commented 3 months ago

Hi, thanks for your interest in our work :).

I imagine you've measured the runtime when decoding the provided bitstreams. Most of these bitstreams have been obtained using the high-complexity operating point (HOP) described in cfg/dec/hop.cfg, which gives the best compression performance while keeping the MAC / pixel reasonably low.

However, our current decoder implementation is not well optimized for this configuration, as shown in the following graphs. Scrolling down a bit to the plot of BD-rate vs. decoding runtime on CLIC20, we see that the mid-complexity operating point (MOP) is much faster than the HOP (350 ms vs. 1050 ms), at the cost of a few additional percent of rate.

So:

  1. If decoding time is important to you, you should use the MOP configuration: --dec_cfg=cfg/dec/mop.cfg
  2. Upcoming updates will improve the CPU implementation of the decoder (regardless of AVX2 availability), yielding a significant speed-up.
  3. A GPU implementation of the decoder is not our main priority right now. In fact, we would rather show that good decoding speed can be obtained without dedicated hardware such as GPUs.
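
For anyone who wants to compare the two operating points on their own device, here is a minimal Python sketch of a timing harness. It is an illustration only, not part of Cool-Chic: the helper names `time_command` and `speedup` are hypothetical, and the command that gets timed is a placeholder that you would replace with your actual decoding invocation. The only figures taken from this thread are the 350 ms and 1050 ms runtimes quoted for CLIC20.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Return the median wall-clock time (ms) over `runs` executions of `cmd`.

    `cmd` is an argument list, as accepted by subprocess.run.
    """
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command fails,
        # so a broken invocation is never silently timed.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings_ms)


def speedup(baseline_ms, candidate_ms):
    """Ratio of baseline runtime to candidate runtime (> 1 means faster)."""
    return baseline_ms / candidate_ms


if __name__ == "__main__":
    # Runtimes quoted in this thread for CLIC20: HOP ~1050 ms, MOP ~350 ms.
    print(f"MOP vs. HOP speed-up: {speedup(1050, 350):.1f}x")

    # Placeholder command (an assumption): swap in your own decoder call,
    # once per decoder configuration you want to compare.
    placeholder_cmd = [sys.executable, "-c", "pass"]
    print(f"Median runtime: {time_command(placeholder_cmd):.1f} ms")
```

Note that this measures the whole process, including interpreter start-up and file I/O, so it overstates pure decoding time. Taking the median over several runs reduces the influence of a slow first run.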