Orange-OpenSource / Cool-Chic

Low-complexity neural image & video codec.
https://orange-opensource.github.io/Cool-Chic/
BSD 3-Clause "New" or "Revised" License

Decoding time for a single image #6

Closed hytu99 closed 8 months ago

hytu99 commented 8 months ago

Thanks for your efforts! I used your script to decode the prepared bitstream kodim01-lmbda-0001.bin, and the reported decoding time is about 3 to 4 seconds. Is this normal? Decoding seems relatively slow to me, but I am not sure what the underlying reason is.

The CPU is an Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz, and I run the script inside a Docker container.

Below is the original output for reference.

Using 2 cpu cores

Content of the GOP header:
---------------------------
      n_bytes_header: 9
            img_size: (512, 768)
     frame_data_type: rgb
            bitdepth: 8
        intra_period: 0
            p_period: 0
         ---------------------
Intra period is 0 and P period is 0: all intra coding!
got -1: upsampling bias

Content of the frame header:
------------------------------
      n_bytes_header: 74
latent_n_resolutions: 7
    latent_n_2d_grid: 7
  n_bytes_per_latent: [34304, 3396, 556, 604, 276, 112, 40]
     n_ft_per_latent: [1, 1, 1, 1, 1, 1, 1]
 n_hidden_layers_arm: 2
             dim_arm: 24
upsampling_kernel_size: 8
static_upsampling_kernel: False
           flow_gain: 1
    layers_synthesis: ['40-1-linear-relu', '3-1-linear-relu', '3-3-residual-relu', '3-3-residual-none']
     q_step_index_nn: {'arm': {'weight': 1, 'bias': 3}, 'upsampling': {'weight': 6, 'bias': -1}, 'synthesis': {'weight': 6, 'bias': 5}}
      scale_index_nn: {'arm': {'weight': 26, 'bias': 22}, 'upsampling': {'weight': 30, 'bias': -1}, 'synthesis': {'weight': 40, 'bias': 40}}
          n_bytes_nn: {'arm': {'weight': 1020, 'bias': 40}, 'upsampling': {'weight': 64, 'bias': -1}, 'synthesis': {'weight': 716, 'bias': 72}}
       ac_max_val_nn: 2772
   ac_max_val_latent: 19
       display_index: 0
         ------------------------
Frame decoding time: 3.704 sec
Total decoding time: 3.880 sec
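
One way to check whether a figure like the one above is stable, rather than a one-off caused by a cold start inside the container, is to time the decode command several times from outside the script. The following is a minimal sketch, not part of Cool-Chic: `time_command` is a hypothetical helper, and the wall-clock time it measures includes Python start-up and imports, so it is expected to be somewhat larger than the "Frame decoding time" printed by the script.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, repeats=5):
    """Run `cmd` (a list of arguments) `repeats` times.

    Returns the wall-clock duration of each run in seconds.
    Raises subprocess.CalledProcessError if a run exits with an error.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard the command's standard output so printing does not dominate.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Format the min / median / max of a list of durations."""
    return (
        f"min {min(durations):.3f} s, "
        f"median {statistics.median(durations):.3f} s, "
        f"max {max(durations):.3f} s"
    )
```

For example, `print(summarize(time_command(["python3", "src/decode.py", "-i", "results/image/kodak/bitstreams/kodim01-lmbda-0001.bin", "-o", "kodim01.png"])))` would time the decode command quoted later in this thread, assuming it is run from the repository root. A median that stays well above the first run's time would suggest the slowdown is not just warm-up.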

theoladune commented 8 months ago

Hi,

Your decoding time is somewhat slower than expected. On my MacBook Pro (M2 Max), decoding is approximately 3 times faster, taking around 1 second per image (see log below).

We're aware that decoding is relatively slow because the decoder is implemented in Python. We're currently working on a fast C implementation of the decoder, which should bring the decoding time down to 100 ms on a single CPU core. This should be available in a few weeks :).

python3 src/decode.py -i results/image/kodak/bitstreams/kodim01-lmbda-0001.bin -o kodim01.png
Using 2 cpu cores

Content of the GOP header:
---------------------------
      n_bytes_header: 9
            img_size: (512, 768)
     frame_data_type: rgb
            bitdepth: 8
        intra_period: 0
            p_period: 0
         ---------------------
Intra period is 0 and P period is 0: all intra coding!
got -1: upsampling bias

Content of the frame header:
------------------------------
      n_bytes_header: 74
latent_n_resolutions: 7
    latent_n_2d_grid: 7
  n_bytes_per_latent: [34304, 3396, 556, 604, 276, 112, 40]
     n_ft_per_latent: [1, 1, 1, 1, 1, 1, 1]
 n_hidden_layers_arm: 2
             dim_arm: 24
upsampling_kernel_size: 8
static_upsampling_kernel: False
           flow_gain: 1
    layers_synthesis: ['40-1-linear-relu', '3-1-linear-relu', '3-3-residual-relu', '3-3-residual-none']
     q_step_index_nn: {'arm': {'weight': 1, 'bias': 3}, 'upsampling': {'weight': 6, 'bias': -1}, 'synthesis': {'weight': 6, 'bias': 5}}
      scale_index_nn: {'arm': {'weight': 26, 'bias': 22}, 'upsampling': {'weight': 30, 'bias': -1}, 'synthesis': {'weight': 40, 'bias': 40}}
          n_bytes_nn: {'arm': {'weight': 1020, 'bias': 40}, 'upsampling': {'weight': 64, 'bias': -1}, 'synthesis': {'weight': 716, 'bias': 72}}
       ac_max_val_nn: 2772
   ac_max_val_latent: 19
       display_index: 0
         ------------------------
Frame decoding time: 1.100 sec
Total decoding time: 1.235 sec
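
For reference, the two logs in this thread can be put side by side with a short calculation. The sketch below uses only figures reported here (frame decoding times of 3.704 s and 1.100 s, a 512 x 768 image, and the announced 100 ms target for the C decoder); the function and variable names are illustrative, and the 100 ms value is a stated goal, not a measurement.

```python
def speedup(slow_seconds, fast_seconds):
    """How many times faster the second run is than the first."""
    return slow_seconds / fast_seconds


def pixels_per_second(height, width, seconds):
    """Decoding throughput for one frame of the given size."""
    return height * width / seconds


# Frame decoding times taken from the two logs in this thread.
xeon_seconds = 3.704      # Xeon Gold 6248R, inside a Docker container
m2_seconds = 1.100        # MacBook Pro, M2 Max
target_seconds = 0.100    # announced goal for the C decoder (not measured)

height, width = 512, 768  # img_size from the GOP header

print(f"M2 Max vs Xeon:   {speedup(xeon_seconds, m2_seconds):.2f}x")
print(f"Target vs Xeon:   {speedup(xeon_seconds, target_seconds):.2f}x")
print(f"Target vs M2 Max: {speedup(m2_seconds, target_seconds):.2f}x")
print(f"Xeon throughput:   {pixels_per_second(height, width, xeon_seconds):,.0f} px/s")
print(f"M2 Max throughput: {pixels_per_second(height, width, m2_seconds):,.0f} px/s")
```

The first ratio comes out at roughly 3.4, consistent with the "approximately 3 times faster" estimate above, and the 100 ms target would correspond to a speedup of about 37 times over the Xeon run.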

hytu99 commented 8 months ago

Thanks for your quick reply! Looking forward to your faster implementation :)