Closed by hytu99 8 months ago
Hi,
Your decoding time seems a bit slower than expected. On my MacBook Pro M2 Max, it runs approximately 3 times faster, taking around 1 second to decode an image (see log below).
We're aware that decoding is relatively slow because of the Python implementation. We're currently working on a fast C implementation of the decoder, which should bring the decoding time down to 100 ms on a single CPU core. This should be available in a few weeks :).
python3 src/decode.py -i results/image/kodak/bitstreams/kodim01-lmbda-0001.bin -o kodim01.png
Using 2 cpu cores
Content of the GOP header:
---------------------------
n_bytes_header: 9
img_size: (512, 768)
frame_data_type: rgb
bitdepth: 8
intra_period: 0
p_period: 0
---------------------
Intra period is 0 and P period is 0: all intra coding!
got -1: upsampling bias
Content of the frame header:
------------------------------
n_bytes_header: 74
latent_n_resolutions: 7
latent_n_2d_grid: 7
n_bytes_per_latent: [34304, 3396, 556, 604, 276, 112, 40]
n_ft_per_latent: [1, 1, 1, 1, 1, 1, 1]
n_hidden_layers_arm: 2
dim_arm: 24
upsampling_kernel_size: 8
static_upsampling_kernel: False
flow_gain: 1
layers_synthesis: ['40-1-linear-relu', '3-1-linear-relu', '3-3-residual-relu', '3-3-residual-none']
q_step_index_nn: {'arm': {'weight': 1, 'bias': 3}, 'upsampling': {'weight': 6, 'bias': -1}, 'synthesis': {'weight': 6, 'bias': 5}}
scale_index_nn: {'arm': {'weight': 26, 'bias': 22}, 'upsampling': {'weight': 30, 'bias': -1}, 'synthesis': {'weight': 40, 'bias': 40}}
n_bytes_nn: {'arm': {'weight': 1020, 'bias': 40}, 'upsampling': {'weight': 64, 'bias': -1}, 'synthesis': {'weight': 716, 'bias': 72}}
ac_max_val_nn: 2772
ac_max_val_latent: 19
display_index: 0
------------------------
Frame decoding time: 1.100 sec
Total decoding time: 1.235 sec
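For anyone comparing their own run against this log, here is a rough sketch that adds up the byte counts printed in the headers and relates them to the image size and the reported frame decoding time. The numbers are copied from the log above. Two things are assumptions on my part and not confirmed by the decoder: that a value of -1 means the component is not transmitted, and that the header fields account for every byte of the bitstream. The variable and function names are made up for illustration.

```python
# Values copied from the decoder log above.
img_size = (512, 768)
n_bytes_gop_header = 9
n_bytes_frame_header = 74
n_bytes_per_latent = [34304, 3396, 556, 604, 276, 112, 40]
n_bytes_nn = {
    "arm": {"weight": 1020, "bias": 40},
    "upsampling": {"weight": 64, "bias": -1},
    "synthesis": {"weight": 716, "bias": 72},
}
frame_decoding_time_sec = 1.100


def total_nn_bytes(nn_bytes_by_module):
    """Sum the network byte counts.

    Assumption: a value of -1 marks a component that is not transmitted
    (here the upsampling bias), so it is skipped.
    """
    return sum(
        n_bytes
        for module in nn_bytes_by_module.values()
        for n_bytes in module.values()
        if n_bytes > 0
    )


latent_bytes = sum(n_bytes_per_latent)
nn_bytes = total_nn_bytes(n_bytes_nn)
# Assumption: the two headers, the networks and the latents are all there is.
total_bytes = n_bytes_gop_header + n_bytes_frame_header + nn_bytes + latent_bytes

n_pixels = img_size[0] * img_size[1]
bits_per_pixel = 8 * total_bytes / n_pixels
pixels_per_sec = n_pixels / frame_decoding_time_sec

print(f"latents: {latent_bytes} bytes, networks: {nn_bytes} bytes")
print(f"total: {total_bytes} bytes, {bits_per_pixel:.3f} bits per pixel")
print(f"throughput: {pixels_per_sec:.0f} pixels per second")
```

Replacing `frame_decoding_time_sec` with the time from your own run gives a throughput figure that can be compared across machines for the same bitstream.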
Thanks for your quick reply! Waiting for your faster implementation :)
Thanks for your efforts! I have used your script to decode the prepared bitstream kodim01-lmbda-0001.bin, and the reported time is about 3 to 4 s. Is this result normal? The decoding speed feels relatively slow, but I am unsure of the underlying reasons. The CPU in use is an Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz, and I run the script inside a Docker container.
Below is the original output for reference.