encode_jpeg generates noise when processing 4k image

pytorch / vision

Datasets, Transforms and Models specific to Computer Vision

https://pytorch.org/vision

BSD 3-Clause "New" or "Revised" License

encode_jpeg generates noise when processing 4k image #8587

Open Lily-Git-hub opened 1 month ago

Lily-Git-hub commented 1 month ago

Hi, I tried the latest torchvision 0.19.0 with PyTorch 2.4 and found that encode_jpeg has a problem when processing 4K images. For example, I have a 4K image tensor of size (3, 2160, 3840) and call torchvision.io.encode_jpeg() on it in a loop. In the first iteration, the tensor is encoded correctly, but in the following iterations the resulting JPEG contains only noise. Could you help with this please? Thanks!

NicolasHug commented 1 month ago

Hi @Lily-Git-hub can you please provide a minimal reproducing example? Thank you

Lily-Git-hub commented 1 month ago

Hi Nicolas,

Please try this example:

import torch
import torchvision
import torch.nn.functional as F

for i in range(2):
    image_data = torch.load('image_data.pt')
    # Resize the (3, H, W) tensor; F.interpolate expects a batch dimension.
    resized_image_tensor = F.interpolate(image_data.unsqueeze(0), size=(2160, 3820), mode='bilinear', align_corners=False)
    image_data_resized = resized_image_tensor[0]
    image_data_encoded = torchvision.io.encode_jpeg(image_data_resized.to(torch.uint8))
    data = image_data_encoded.cpu().numpy().tobytes()
    with open('1.jpg', 'wb') as f:
        f.write(data)

    # Without this line, the image saved in the second iteration is pure noise.
    del data, image_data_encoded, resized_image_tensor, image_data_resized, image_data

Without the last line of code, which deletes the used variables, the saved image is noise only. Please unzip 'image_data.zip' to get image_data.pt.
[image_data.zip](https://github.com/user-attachments/files/16656971/image_data.zip)

NicolasHug commented 1 month ago

Sorry @Lily-Git-hub, I cannot reproduce your issue.

Lily-Git-hub commented 1 month ago

del data, image_data_encoded, resized_image_tensor, image_data_resized, image_data

Hi Nicolas,

Did you remove the above line of code? The error only occurs when the used variables are not deleted. Thanks!

NicolasHug commented 1 month ago

Yes, I deleted that line. Can you please provide a more minimal reproducing example: without a for loop, without resizing, and starting from a normal image rather than from a .pt file (which I won't load on my machine for security reasons)?

glazhh commented 1 week ago

I encountered a similar issue. I resolved it by adding torch.cuda.synchronize() before using encode_jpeg. It seems there might be some synchronization problems between F.interpolate and torchvision.io.encode_jpeg.

    resized_image_tensor = F.interpolate(image_data.unsqueeze(0), size=(2160, 3820), mode='bilinear', align_corners=False)
    image_data_resized = resized_image_tensor[0]
    # Synchronize after modifying the image and before encoding it as JPEG.
    torch.cuda.synchronize()
    image_data_encoded = torchvision.io.encode_jpeg(image_data_resized.to(torch.uint8))