shimat / opencvsharp

OpenCV wrapper for .NET
Apache License 2.0

WarpAffine strange behaviour on NVIDIA #1586

Open laszloban opened 1 year ago

laszloban commented 1 year ago

Summary of your issue

I am trying to apply a WarpAffine transform, using UMats, to each frame of a video file and write the result to a new file. On an NVIDIA RTX 3060 I see very strange behaviour: many output frames are identical to the previous one. Displaying the source frame with ImShow shows that even the source frames are corrupted.

Environment

Laptop: Core i7 with Iris Xe, Windows 11 22H2
Desktop: Ryzen 5 (no integrated GPU) with RTX 3060, Windows 11 22H2

What did you do when you faced the problem?

My code works perfectly on my laptop, which uses the Intel CPU's built-in Iris Xe GPU. On another machine with an NVIDIA RTX 3060 the code runs without errors, but many frames come out identical to the previous one. Displaying the source frame with ImShow shows that even the source frames are corrupted.

I investigated a bit: if I do not dispose the original UMat (by removing the using statement), I get perfect results on both machines, but it leaks GPU memory.

For testing, I also replaced the WarpAffine call with a simple copy, and that worked perfectly on both machines.

Example code:

using OpenCvSharp;
using OpenCvSharp.XImgProc;

string inputFile  = "test.mp4";
string outputFile = "result.mp4";

using (VideoCapture videocapture = new VideoCapture(inputFile, VideoCaptureAPIs.FFMPEG))
{
    if (!videocapture.IsOpened()) { return; }

    VideoWriter videoWriter = new VideoWriter(outputFile,
        VideoCaptureAPIs.FFMPEG,
        VideoWriter.FourCC("mp4v"),
        videocapture.Fps,
        new OpenCvSharp.Size(videocapture.FrameWidth, videocapture.FrameHeight));

    int idx = 1;
    int frameCount = videocapture.FrameCount;
    int frameNumLength = frameCount.ToString("D").Length;

    // 2x3 affine matrix: rotation by 0.1 rad plus a translation of (100, -50)
    Mat appliedTranslation = new(2, 3, MatType.CV_64FC1);
    appliedTranslation.At<double>(0, 0) = Math.Cos(0.1);
    appliedTranslation.At<double>(0, 1) = -Math.Sin(0.1);
    appliedTranslation.At<double>(1, 0) = Math.Sin(0.1);
    appliedTranslation.At<double>(1, 1) = Math.Cos(0.1);
    appliedTranslation.At<double>(0, 2) = 100;
    appliedTranslation.At<double>(1, 2) = -50;
    UMat translated = new UMat();

    using (Mat srcFrame = new Mat())
    {
        Mat frame; 
        while (videocapture.Read(srcFrame))
        {
            Console.WriteLine($"Processing {idx} from {videocapture.FrameCount}...");
            Cv2.ImShow("Result", srcFrame);

            //Doesn't work on NVidia, works perfectly on Intel
            using (UMat uImg = srcFrame.GetUMat(AccessFlag.READ, UMatUsageFlags.DeviceMemory))
            {
                Cv2.WarpAffine(uImg, translated, appliedTranslation, uImg.Size());
                frame = translated.GetMat(AccessFlag.RW);
            }

            //Works on both, but leaks gpu memory
            //UMat uImg = srcFrame.GetUMat(AccessFlag.READ, UMatUsageFlags.DeviceMemory);
            //Cv2.WarpAffine(uImg, translated, appliedTranslation, uImg.Size());
            //frame = translated.GetMat(AccessFlag.RW);

            //Works perfectly on both as expected
            //using (UMat uImg = srcFrame.GetUMat(AccessFlag.READ, UMatUsageFlags.DeviceMemory))
            //{
            //    uImg.CopyTo(translated);
            //    frame = translated.GetMat(AccessFlag.RW);
            //}

            videoWriter.Write(frame);
            Cv2.ImShow("Result", frame);
            Cv2.WaitKey(1);
            frame.Dispose();
            ++idx;
        }
        videocapture.Release();
        videoWriter.Release();
    }
}

laszloban commented 1 year ago

After experimenting a bit more, I found a workaround. It works well on both Intel and NVIDIA GPUs, but it needs an extra, theoretically unnecessary copy of the image: I copy the source frame's UMat to a temporary UMat so that I can dispose the source frame's UMat before calling WarpAffine. I now suspect the issue may be related to VideoCapture.

UMat temp = new();
UMat translated = new UMat();

using (Mat srcFrame = new Mat())
{
    Mat frame; 
    while (videocapture.Read(srcFrame))
    {
        Console.WriteLine($"Processing {idx} from {videocapture.FrameCount}...");
        Cv2.ImShow("Result", srcFrame);

        using (UMat uImg = srcFrame.GetUMat(AccessFlag.READ, UMatUsageFlags.DeviceMemory))
            uImg.CopyTo(temp);
        Cv2.WarpAffine(temp, translated, appliedTranslation, temp.Size());
        frame = translated.GetMat(AccessFlag.RW);

        videoWriter.Write(frame);
        Cv2.ImShow("Result", frame);
        Cv2.WaitKey(1);
        frame.Dispose();
        ++idx;
    }
    videocapture.Release();
    videoWriter.Release();
}