qingshi9974 / ICLR2024-FTIC

[ICLR2024] FTIC: Frequency-aware Transformer for Learned Image Compression

about the test #4

Closed: Duanener closed this issue 2 weeks ago

Duanener commented 1 month ago

What a wonderful work! However, I encountered some issues while running eval.py.

When I use the following command: python eval.py --data ./kodak --checkpoint ./ckpt_0483.pth --cuda --real, I get the following error:

Loading ./ckpt_0483.pth
Traceback (most recent call last):
  File "/home/xxduan/FTIC/eval.py", line 169, in <module>
    main(sys.argv[1:])
  File "/home/xxduan/FTIC/eval.py", line 117, in main
    out_enc = net.compress(x_padded)
  File "/home/xxduan/FTIC/models/flic.py", line 505, in compress
    pmf = self._likelihood(samples, scale[0][c_idx][h_idx][w_idx],means=mu[0][c_idx][h_idx][w_idx]+minmax)
  File "/home/xxduan/FTIC/models/flic.py", line 528, in _likelihood
    scales = lower_bound(scales)
  File "/home/xxduan/.conda/envs/TCM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/xxduan/.conda/envs/TCM/lib/python3.9/site-packages/compressai/ops/bound_ops.py", line 80, in forward
    return self.lower_bound(x)
  File "/home/xxduan/.conda/envs/TCM/lib/python3.9/site-packages/compressai/ops/bound_ops.py", line 75, in lower_bound
    return LowerBoundFunction.apply(x, self.bound)
  File "/home/xxduan/.conda/envs/TCM/lib/python3.9/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/home/xxduan/.conda/envs/TCM/lib/python3.9/site-packages/compressai/ops/bound_ops.py", line 51, in forward
    return lower_bound_fwd(x, bound)
  File "/home/xxduan/.conda/envs/TCM/lib/python3.9/site-packages/compressai/ops/bound_ops.py", line 37, in lower_bound_fwd
    return torch.max(x, bound)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

If I remove --cuda from the command line, the program runs correctly, but slowly. How can I modify it so that the compress and decompress functions run on the GPU?

qingshi9974 commented 2 weeks ago

Thanks for the reminder, I've fixed the problem. The test code now runs correctly with --cuda.
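
For reference, this class of error usually means the lower-bound tensor stayed on the CPU while the scales were moved to the GPU. Below is a minimal, hypothetical sketch of the kind of change that resolves such a device mismatch; it is illustrative only and may differ from the actual fix in this repository.

```python
import torch
import torch.nn as nn

class LowerBound(nn.Module):
    """Element-wise lower bound that keeps its threshold on the input's device."""

    def __init__(self, bound: float):
        super().__init__()
        # A registered buffer is moved together with the module by
        # .cuda()/.to(device), so model.cuda() keeps the bound and the
        # activations on the same device.
        self.register_buffer("bound", torch.tensor([float(bound)]))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Matching devices explicitly also guards against a bound tensor
        # that was created outside the module and never moved.
        return torch.max(x, self.bound.to(x.device))
```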

In addition, we use the one-pass range coder of [Cheng2020](https://github.com/ZhengxueCheng/Learned-Image-Compression-with-GMM-and-Attention/blob/master/network.py), which is slower than the two-pass range coder of compressai.

You can also run the test code without --real; the estimated results are almost the same as those of the real encoding and decoding process.
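
For context, when --real is removed, scripts built on compressai typically estimate the bitrate from the entropy model's likelihoods instead of measuring an actual bitstream. Here is a minimal sketch of that estimation, assuming the standard compressai forward output; the names are illustrative and not taken from eval.py.

```python
import math
import torch

def estimated_bpp(out_net: dict, num_pixels: int) -> torch.Tensor:
    # Sum the negative log2-likelihoods of every latent and normalize by the
    # number of pixels; this approximates the real bitstream size without
    # running the range coder.
    return sum(
        torch.log(likelihoods).sum() / (-math.log(2) * num_pixels)
        for likelihoods in out_net["likelihoods"].values()
    )
```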

Duanener commented 2 weeks ago

Thank you for your kind reply. However, I wonder why we can't use the two-pass range coder of compressai. Couldn't we use GaussianConditional to estimate the bpp of y_hat = Q(y)? As far as I know, the default is y_hat = Q(y - mu) + mu.

qingshi9974 commented 2 weeks ago

Most entropy models in compressai center the representation with the mean prediction during quantization, i.e., they encode Q(y - mu). This shifts the Gaussian distribution to zero, so the cumulative probabilities can be looked up directly in the precomputed scale table and range coding runs quickly. In our autoregressive entropy model, however, this poses a problem: we do not know the means before a pass through the model, so we would need multiple forward passes to obtain them. To simplify the setup, we follow [1,2,3] and drop the centering, encoding Q(y) directly, which is why we do not use the range coder of compressai. In an extension of this paper, we will try to modify the entropy coding process to make it faster.
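
To make the two conventions concrete, here is a minimal illustrative sketch (not code from this repository) of mean-centered quantization versus direct quantization:

```python
import torch

def quantize_centered(y: torch.Tensor, mu: torch.Tensor):
    # compressai-style: transmit the mean-removed symbols Q(y - mu); the
    # resulting distribution is zero-centered, so a precomputed scale table
    # can be used for fast range coding.
    symbols = torch.round(y - mu)
    y_hat = symbols + mu  # reconstruction adds the mean back
    return y_hat, symbols

def quantize_direct(y: torch.Tensor):
    # Convention followed here ([1,2,3]): transmit Q(y) directly; the mean is
    # only needed to evaluate each symbol's likelihood, not to form the symbol.
    symbols = torch.round(y)
    return symbols, symbols
```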

[1] Xiang, Jinxi, Kuan Tian, and Jun Zhang. "MIMT: Masked Image Modeling Transformer for Video Compression." The Eleventh International Conference on Learning Representations, 2023.
[2] Mentzer, Fabian, et al. "VCT: A Video Compression Transformer." Advances in Neural Information Processing Systems, 2022.
[3] Mentzer, Fabian, Eirikur Agustsson, and Michael Tschannen. "M2T: Masking Transformers Twice for Faster Decoding." Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023.

Duanener commented 2 weeks ago

Thanks a lot for your help!