samreenanjum / CTMC


Video decoding artifact #2

Closed (ziw-liu closed 6 months ago)

ziw-liu commented 7 months ago

Hi and thanks for making this dataset!

While using the video frames, I noticed that the JPEG blocking artifact (due to 8x8 px compression blocks) appears to be much stronger than in the original video:

import matplotlib.pyplot as plt
from torchvision.io import read_image, read_video

frame = 20

# video frame from the CTMC dataset; read_image returns a (C, H, W) tensor
jpg = read_image(f"CTMCV1/train/PL1Ut-run03/img1/{frame:06d}.jpg")
# uses ffmpeg to read the original video from the Nikon website; read_video returns (T, H, W, C) frames
mp4, _, _ = read_video("PL1Ut-run03.MP4")

# compare the top-left 80x80 px crop of one channel from each source
f, ax = plt.subplots(1, 2, figsize=(10, 5))
ax[0].imshow(jpg[0, :80, :80], cmap="gray")
ax[1].imshow(mp4[frame, :80, :80, 0], cmap="gray")
plt.show()

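[Image: side-by-side 80x80 px crops, CTMC JPEG frame (left) and the frame decoded from the original MP4 (right)]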

This will likely become a confounding factor for training vision models, since entire patches are smoothed out in the JPEG files. However, I do not see such severe artifacts in the paper. If training and testing are both done on the original MPEG-4 stream, is the test result still valid?
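To put a rough number on the visual difference, one option is a simple blockiness score: the mean absolute pixel jump across 8x8 block boundaries divided by the mean jump everywhere else. This is only a sketch and not something from the dataset or the paper; the function name `blockiness` is made up for illustration, and it assumes the block grid starts at the top-left corner of the image, which may not hold for cropped or rescaled frames.

```python
import numpy as np


def blockiness(gray: np.ndarray, block: int = 8) -> float:
    """Ratio of mean pixel jumps at block boundaries to mean jumps elsewhere.

    Values near 1 suggest no visible block grid; larger values suggest
    stronger blocking. Assumes the grid is aligned with the image origin.
    """
    img = gray.astype(np.float64)
    # absolute differences between horizontally / vertically adjacent pixels
    dx = np.abs(np.diff(img, axis=1))  # shape (H, W - 1)
    dy = np.abs(np.diff(img, axis=0))  # shape (H - 1, W)
    # difference index i compares pixels i and i + 1, so a block boundary
    # sits at i = block - 1, 2 * block - 1, ...
    col_edge = (np.arange(dx.shape[1]) % block) == block - 1
    row_edge = (np.arange(dy.shape[0]) % block) == block - 1
    edge = np.concatenate([dx[:, col_edge].ravel(), dy[row_edge, :].ravel()])
    inner = np.concatenate([dx[:, ~col_edge].ravel(), dy[~row_edge, :].ravel()])
    # small epsilon avoids division by zero on perfectly flat blocks
    return float(edge.mean() / (inner.mean() + 1e-12))
```

With the variables from the snippet above, the two sources could be compared with `blockiness(jpg[0].numpy())` and `blockiness(mp4[frame, :, :, 0].numpy())`; a noticeably higher score for the JPEG frame would be consistent with the stronger artifact described here.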

samreenanjum commented 7 months ago

Hi, thanks for your interest!

I believe the test result should still be valid, as the annotations are based on spatial coordinates in the image and do not depend on image quality. Please feel free to let me know if you would like to check further to suit your application. I hope that helps!