uber-research / jpeg2dct


iDCT image reconstruction with scipy #2

Closed bdhammel closed 5 years ago

bdhammel commented 6 years ago

I'm trying to recover the original image starting with the returned DCT coefficients using scipy's idct method; however, I'm coming up with some discrepancies. I was hoping you could shed some light on how to correctly accomplish this? This is my attempt:


    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.fftpack import idct
    from jpeg2dct import numpy as jpeg2dct

    def idct_2d(arr, type_=2):
        return idct(idct(arr, axis=0, norm='ortho', type=type_), axis=1, norm='ortho', type=type_)

    def test_macroblock_decode(image_path):
        dct_y, *_ = jpeg2dct.load(image_path)
        rows, cols, _ = dct_y.shape
        reconstructed_img = np.empty(shape=(8*rows, 8*cols))

        for j in range(rows):
            for i in range(cols):
                spectrogram = zigzag_reshape(dct_y[j, i])  # Convert the flat run of 64 frequencies to an 8x8 block
                macroblock = idct_2d(spectrogram, type_=2)
                reconstructed_img[8*j: 8*(j+1), 8*i: 8*(i+1)] = macroblock + 128

        plt.figure()
        plt.imshow(reconstructed_img)
        plt.show()
gueguenster commented 5 years ago

The order in which the DCT coefficients come out and the order idct_2d expects are different; you need to match them somehow. One way is to pass the DCT weights themselves through JPEG compression and check which frequency gets most excited.
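
For anyone who wants to try that probing idea, here is a rough sketch (probe_ordering is a hypothetical helper, not part of jpeg2dct): excite a single DCT basis function, push the block through a JPEG round trip, and see which slot of the 64-vector returned by jpeg2dct lights up.

from io import BytesIO

import numpy as np
from PIL import Image
from scipy.fftpack import idct
from jpeg2dct.numpy import loads


def probe_ordering(k, amplitude=400.0):
    """Excite only coefficient k (row-major index into the 8x8 block),
    round-trip the block through JPEG, and return the index of the
    strongest coefficient jpeg2dct reports back."""
    coefs = np.zeros((8, 8))
    coefs[np.unravel_index(k, (8, 8))] = amplitude
    # spatial 8x8 block containing (mostly) that single frequency
    block = idct(idct(coefs, axis=0, norm='ortho'), axis=1, norm='ortho') + 128
    gray = np.clip(block, 0, 255).astype(np.uint8)
    rgb = np.stack([gray, gray, gray], axis=-1)  # luma of a gray image is the gray value
    with BytesIO() as f:
        Image.fromarray(rgb).save(f, 'JPEG', quality=100)
        dct_y, *_ = loads(f.getvalue())
    return int(np.argmax(np.abs(dct_y[0, 0])))


# e.g. [probe_ordering(k) for k in range(64)] maps row-major index -> returned slot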

augustelalande commented 5 years ago

It seems that jpeg2dct returns the frequencies in row-major order, not zigzag order. So replacing zigzag_reshape(dct_y[j, i]) with np.reshape(dct_y[j, i], (8, 8)) gives the expected results.
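
For completeness, a minimal sketch of the reconstruction with that fix applied (scipy's type-2 iDCT for the inverse transform; 'some.jpg' stands in for any JPEG path, and the coefficients are read with normalized=True so they are already dequantized):

import numpy as np
from scipy.fftpack import idct
from jpeg2dct.numpy import loads


def idct_2d(arr):
    return idct(idct(arr, axis=0, norm='ortho', type=2),
                axis=1, norm='ortho', type=2)


def reconstruct_luma(image_path):
    with open(image_path, 'rb') as f:
        dct_y, *_ = loads(f.read(), normalized=True)
    rows, cols, _ = dct_y.shape
    out = np.empty((8 * rows, 8 * cols))
    for j in range(rows):
        for i in range(cols):
            block = np.reshape(dct_y[j, i], (8, 8)).astype(np.float64)  # row-major, no zigzag
            out[8 * j: 8 * (j + 1), 8 * i: 8 * (i + 1)] = idct_2d(block) + 128
    return out


# e.g. plt.imshow(reconstruct_luma('some.jpg'), cmap='gray')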

As an aside, it would be nice to have a dct2rgb function as part of this package. This would be useful for generation tasks where someone might want to generate an image in dct space, but evaluate the results in rgb space.

HoracceFeng commented 4 years ago

Hi @bdhammel, I've recently been trying to reimplement this repo with either cv2/numpy or scipy, but I have no clue how to start: the result of cv2.dct() is totally different from this one. Have you done this with scipy, and if so, would you share the code?

Thank you!

gueguenster commented 4 years ago

Hi, no, we have not looked into cv2 or scipy. It might just be an ordering issue, as JPEG uses a non-trivial ordering of the coefficients.
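
For reference, a small sketch of the zigzag scan order JPEG uses when entropy-coding the 64 coefficients of a block; as noted above, jpeg2dct itself hands them back in plain row-major order:

import numpy as np


def zigzag_indices(n=8):
    """Position of each (row, col) entry in the JPEG zigzag scan."""
    order = sorted(((j, i) for j in range(n) for i in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))
    idx = np.empty((n, n), dtype=int)
    for k, (j, i) in enumerate(order):
        idx[j, i] = k
    return idx


print(zigzag_indices())             # zigzag positions
print(np.arange(64).reshape(8, 8))  # row-major positions, as jpeg2dct returns them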


bdhammel commented 4 years ago

Hey @HoracceFeng, I didn't have a problem getting it to match cv2 or scipy once I knew the unraveling order. It matches cv2's default cv2.dct(arr) and scipy's type-2 method (https://docs.scipy.org/doc/scipy/reference/generated/scipy.fftpack.dct.html).

I'd have to dig a bit to find the code; I'll report back here when I come across it.
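
In the meantime, a quick sketch of that equivalence check on a random 8x8 block (cv2.dct computes the 2D orthonormal DCT-II, i.e. scipy's type-2 transform with norm='ortho'):

import numpy as np
import cv2
from scipy.fftpack import dct

block = np.random.rand(8, 8).astype(np.float32)
via_cv2 = cv2.dct(block)
via_scipy = dct(dct(block, axis=0, norm='ortho', type=2),
                axis=1, norm='ortho', type=2)
print(np.allclose(via_cv2, via_scipy, atol=1e-4))  # True, up to float32 precision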

bdhammel commented 4 years ago

Here's some code to play with. You'll see that in a 1:1 comparison the arrays do differ, but if you plot the image the differences are around the edges. It looks like one of the methods is slightly lossy and you lose some high-frequency components.

Note: it's been over a year since I thought about this, so don't assume my code is correct.

Good luck.

from skimage import data
from skimage.color import rgb2ycbcr
from skimage.util import view_as_blocks
import matplotlib.pyplot as plt
from PIL import Image
from io import BytesIO
import numpy as np
# import numpy.testing as nptest
from jpeg2dct.numpy import loads
from scipy.fftpack import dct, idct  # noqa
import cv2

# Coefficients are _not_ in zigzag ordering
# https://github.com/kornelski/libjpeg/blob/master/libjpeg.doc#L2616
FLATTEN_KEY = np.arange(64).reshape((8, 8))

def get_data():
    img = data.astronaut()
    with BytesIO() as f:
        _img = Image.fromarray(img)
        _img.save(f, 'JPEG')
        dct_y, *_ = loads(f.getvalue(), normalized=True)

    return img, dct_y

def dct_2d(arr, type_=2):
    """A 2D DCT
    Convert a given spatial image to a 2d spectrogram using a dct transform
    """
    # return dct(dct(arr, axis=0, norm='ortho', type=type_), axis=1, norm='ortho', type=type_)
    return cv2.dct(arr)

def idct_2d(arr, type_=2):
    """A 2D iDCT
    Convert a given spatial image to a 2d spectrogram using a dct transform
    """
    # return idct(idct(arr, axis=0, norm='ortho', type=type_), axis=1, norm='ortho', type=type_)
    return cv2.idct(arr)

def flatten(arr):
    """Do a flattening on a 2d array"""

    arr = np.asarray(arr)
    if arr.shape != (8, 8):
        raise ValueError('Array needs to be macroblock of shape 8x8')

    rows, cols = (8, 8)
    flat_arr = np.empty(shape=(rows*cols))

    for j in range(rows):
        for i in range(cols):
            idx = FLATTEN_KEY[j, i]
            flat_arr[idx] = arr[j, i]

    return flat_arr

def reshape(arr):
    """Rebuild an 8x8 macroblock from a flat 64-vector (inverse of flatten)"""
    arr = np.asarray(arr)
    if arr.shape != (64,):
        raise ValueError('Array needs to be a flat vector of 64 coefficients')

    rows = cols = 8
    macroblock = np.empty(shape=(rows, cols))
    for j in range(rows):
        for i in range(cols):
            idx = FLATTEN_KEY[j, i]
            macroblock[j, i] = arr[idx]

    return macroblock

def flatten_to_dct_coef(dct_blocks):
    """For each macroblock spectrogram, flatten it to block_dims**2
    """
    rows, cols, *block_dims = dct_blocks.shape
    dct_coefs = np.zeros((rows, cols, np.prod(block_dims)))
    for j in range(rows):
        for i in range(cols):
            dct_coefs[j, i, :] = flatten(dct_blocks[j, i, :])

    return dct_coefs

def get_dct_blocks(arr):
    """Do DCT on macroblocks"""
    dct_arr = np.zeros_like(arr)
    rows, cols, *_ = arr.shape
    arr = arr - 128  # 0 mean center
    for j in range(rows):
        for i in range(cols):
            dct_arr[j, i, ...] = dct_2d(arr[j, i, ...])

    return dct_arr

def convert_to_macroblocks(image, block_size=(8, 8)):
    """Chop up the image into blocks of size `block_size`"""
    image = np.ascontiguousarray(image)
    return view_as_blocks(image, block_size)

def compress_to_dct(image):
    """Convert an image into DCT coefficients
    Split the given image into 8x8 macroblocks and do a 2D dct transform on
    each macroblock. Flatten each spectrogram such that the channels of the
    returned array have length 8**2
    i.e.
        image.shape == (h, w)
        dct_coefs.shape == (h//8, w//8, 64)

    NOTE
    ----
    Only supports images with h and w evenly divisible by 8
    """
    image = image.astype(np.float32)
    blocks_8x8 = convert_to_macroblocks(image)
    dct_blocks_8x8 = get_dct_blocks(blocks_8x8)
    dct_coefs = flatten_to_dct_coef(dct_blocks_8x8)
    return dct_coefs.astype(np.int16)

def test_view_spectrogram():
    img, tar_dct_y = get_data()
    ycbcr_ch1 = rgb2ycbcr(img)[..., 0]
    dct_y = compress_to_dct(ycbcr_ch1)

    plt.figure()
    spectrogram = reshape(dct_y[0, 0])
    plt.imshow(spectrogram)

    plt.figure()
    tar_spectrogram = reshape(tar_dct_y[0, 0])
    plt.imshow(tar_spectrogram)
    plt.show()

    # Will Fail
    # nptest.assert_array_equal(spectrogram, tar_spectrogram)

    plt.figure()
    plt.imshow(np.abs(tar_spectrogram - spectrogram))
    plt.show()

def test_macroblock_decode():

    img, dct_y = get_data()
    rows, cols, _ = dct_y.shape
    reconstructed_img = 1000 * np.ones(shape=(8 * rows, 8 * cols))  # sentinel fill so any missed block stands out

    for j in range(rows):
        for i in range(cols):
            spectrogram = reshape(dct_y[j, i])
            macroblock = idct_2d(spectrogram, type_=2) + 128
            reconstructed_img[8 * j: 8 * (j + 1), 8 * i: 8 * (i + 1)] = macroblock

    plt.figure()
    plt.title("Reconstruction from jpeg2dct")
    plt.imshow(reconstructed_img)
    plt.show()

    ycbcr_ch1 = rgb2ycbcr(img)[..., 0]
    # Will Fail
    # nptest.assert_array_equal(reconstructed_img, ycbcr_ch1)
    plt.figure()
    plt.title("Difference")
    plt.imshow(np.abs(reconstructed_img - ycbcr_ch1))
    plt.show()
HoracceFeng commented 4 years ago

Hi Ben, Thank you for your help, will try~

