NewChromantics / PopH264

Low-level, minimal H264 decoder & encoder library for windows, hololens/uwp, ios/tvos/macos, linux, android/quest/magic leap. CAPI for use with c#, unreal, swift
http://poph264.com/
Mozilla Public License 2.0

Performance and Color Issues with PopH264 Decoder in Unity - Windows+Hololens2 #83

Open Birkenpapier opened 8 months ago

Birkenpapier commented 8 months ago

Environment
Unity Version: 2021.3.8f1
Platform: Win 11, HoloLens 2
PopH264 Version: prebuilt unitypackage 1.9

Description
We have integrated the PopH264 decoder into our Unity project to decode and display H264 buffers received from a gRPC server. The setup involves decoding buffers of 20kB each, which may contain one or more frames.

We've encountered several issues:

(screenshots of the decoded output attached)

We have confirmed that the incoming H264 data is correctly encoded, as it has been successfully decoded using an alternative custom decoder based on DirectX. This suggests the issue lies within the PopH264 decoder or our usage of it.

In Unity, we've added a GameObject with a MeshRenderer that uses a custom unlit shader to display the decoded texture. The H264Decoder script attached to this GameObject is responsible for decoding the frames and updating the texture.

The decoder integration, attached to the plane GameObject:

using UnityEngine;
using System.Collections.Generic;

public class H264Decoder : MonoBehaviour
{
// Decoder instance
private PopH264.Decoder decoder;

// Texture to display the decoded frame
private Texture2D decodedTexture;

// Use this for initialization
void Start()
{
    // Initialize the decoder with parameters (if needed)
    // PopH264.DecoderParams decoderParams = new PopH264.DecoderParams();

    PopH264.DecoderParams decoderParams = new PopH264.DecoderParams
    {
        AllowBuffering = false,
        DoubleDecodeKeyframe = true,
        DropBadFrames = true,
        VerboseDebug = true, // to get more logs which might help in debugging
    };

    // Customize decoderParams as needed
    decoder = new PopH264.Decoder(decoderParams, true);

    // Initialize the texture
    decodedTexture = new Texture2D(1920, 1088, TextureFormat.R8, false);
    // Attach the texture to a GameObject in your scene to display it
    // GetComponent<Renderer>().material.mainTexture = decodedTexture;

    var renderer = GetComponent<Renderer>();
    if (renderer != null)
    {
        renderer.material.SetTexture("_MainTex", decodedTexture);
    }
}

// This method should be called with incoming H264 buffer data
public void DecodeAndDisplayFrame(byte[] h264Data, int frameNumber)
{
    // Push frame data to the decoder
    decoder.PushFrameData(h264Data, frameNumber);

    // Prepare lists to get decoded frame data
    List<Texture2D> planes = new List<Texture2D>();
    List<PopH264.PixelFormat> pixelFormats = new List<PopH264.PixelFormat>();

    // Attempt to get the next frame
    var frameMeta = decoder.GetNextFrameAndMeta(ref planes, ref pixelFormats);

    if (frameMeta.HasValue)
    {
        // Update the texture with the decoded frame
        // Check and reinitialize the texture if necessary
        var metaPlane = frameMeta.Value.Planes[0];
        /*
        if (decodedTexture.width != metaPlane.Width || decodedTexture.height != metaPlane.Height || decodedTexture.format != TextureFormat.RGB24)
        {
            Debug.Log($"Recreating texture: width={metaPlane.Width}, height={metaPlane.Height}");
            decodedTexture = new Texture2D(metaPlane.Width, metaPlane.Height, TextureFormat.RGB24, false);
            GetComponent<Renderer>().material.mainTexture = decodedTexture;
        }
        */

        Debug.Log($"Recreating texture: width={metaPlane.Width}, height={metaPlane.Height}, format={metaPlane.Format}, pixelFormat={metaPlane.PixelFormat}");

        Graphics.CopyTexture(planes[0], decodedTexture);
    }
}

void OnDestroy()
{
    // Clean up the decoder
    if (decoder != null)
    {
        decoder.Dispose();
    }
}

}

Receiving the buffers from gRPC:

private async UniTask ChangeFrameLoop(int frameNumber)
{
if (IsCancellationRequested)
    return;

if (_streamerClient == null)
{
    StreamFailed?.Invoke();
    return;
}

byte[] buffer;
try
{
    buffer = await _streamerClient.ReadFrameData(CancellationTokenNullable ?? CancellationToken.None);
}
catch
{
    StreamFailed?.Invoke();
    Dispose();
    return;
}

var frameLength = buffer.Length;
if (frameLength <= 10)
{
    return;
}

try
{
    if (IsCancellationRequested)
        return;

    h264Decoder.DecodeAndDisplayFrame(buffer, frameNumber);
}
catch
{
    StreamFailed?.Invoke();
    Dispose();
}

}

The shader:

Shader "Grayscale"
{
    Properties { _MainTex ("Texture", 2D) = "white" {} }
    SubShader
    {
        Pass
        {
            Cull Off
            CGPROGRAM
            #pragma vertex vertex_shader
            #pragma fragment pixel_shader
            #pragma target 2.0

            sampler2D _MainTex;

            struct custom_type
            {
                float4 vertex : SV_POSITION;
                float2 uv : TEXCOORD0;
            };

            custom_type vertex_shader(float4 vertex : POSITION, float2 uv : TEXCOORD0)
            {
                custom_type vs;
                vs.vertex = UnityObjectToClipPos(vertex);
                vs.uv = uv;
                return vs;
            }

            float4 pixel_shader(custom_type ps) : COLOR
            {
                float3 color = tex2D(_MainTex, ps.uv.xy).xyz;
                float grayscale = dot(color, float3(0.2126, 0.7152, 0.0722));
                return float4(grayscale, grayscale, grayscale, 1.0);
            }
            ENDCG
        }
    }
}


Additional Context
We've tried various settings for the decoder parameters, such as disabling buffering, double-decoding keyframes, and verbose debugging. The Unity shader is set up to convert from greyscale to RGB, assuming the issue might be related to color space conversion. The planes[0] array is already returning grayscale without specifying the color space in the decoder setup. Attached are the H264Decoder script and the cleaned-up snippet from our gRPC frame handling logic.

We would appreciate any insights or guidance you can provide to help resolve these issues.

SoylentGraham commented 8 months ago

Have you tried decoding the same stream on windows or mac or any other platform? Is the performance similar?

Any chance you have a dump of the stream?

Birkenpapier commented 8 months ago

No, not yet with your decoder. I have only used ffmpeg for decoding in other environments/systems, besides the custom implementation in Unity.

Here is a dump of ~13 seconds saved in a .bytes file: demofileRC1.zip

SoylentGraham commented 8 months ago

Oh, as for colour: the decoder is probably outputting 2 or 3 planes (the output is almost always YUV) and you're just using the luma/greyscale plane. Plane[1] will contain chromaU or interleaved chromaUV. I'll link to some yuv->rgb shaders. PopH264 (C++ side) is actually outputting the colourspace (the colour conversion for YUV), but the C# code doesn't use it at the moment.

I'm aware the new simple demo doesn't do conversion to RGB yet.

Birkenpapier commented 8 months ago

Thank you very much for the information! I'll change that to the other plane element.

Understood. We also have a shader from our custom decoder that converts yuv->rgb, so no worries on that side.

Is everything so far good with the dump on your side? Do you need more information from us?

Perhaps we did something wrong in the implementation, besides the greyscale colourspace, that explains the poor latency and the artifacts?

Thank you very, very much in advance and of course for the already listed tips!

SoylentGraham commented 8 months ago

It doesn't look too bad at a glance, but you're only checking for frames when you push data in....(and you're throwing away small buffers? why?) The library is more asynchronous than that.

Push data whenever you have it, and check for new frames GetNextFrameAndMeta every update

I won't have time to look at this for a while probably, but you could follow through this conversation https://github.com/NewChromantics/PopH264/issues/80 to get an idea of how to test what happens with your data; it's quite easy to test your data with the integration tests in the c++ app (which might also work on hololens, but an easy first step is to just try on desktop https://github.com/NewChromantics/PopH264/commit/0452ffbba65eb45f3ba5c056b4cf065171682417 )
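As a rough complement to the C++ test-app route, a minimal Unity-side sketch could replay the attached dump through the same C# calls used elsewhere in this thread, just to see whether frames come out on desktop. The dump file name, Resources loading, chunk size and frame numbering below are illustrative assumptions, not anything PopH264 prescribes:

using UnityEngine;
using System.Collections.Generic;

// Sketch: replay a dumped .bytes H264 stream through PopH264 and log what comes back.
// Assumes the dump (e.g. demofileRC1.bytes) has been placed in a Resources folder.
public class PopH264DumpReplay : MonoBehaviour
{
    PopH264.Decoder decoder;
    byte[] dump;
    int offset = 0;
    int frameNumber = 0;
    const int ChunkSize = 20000;    // same 20kB chunking as the gRPC stream (assumption)

    // hold onto the plane lists for re-use
    List<Texture2D> planes = new List<Texture2D>();
    List<PopH264.PixelFormat> pixelFormats = new List<PopH264.PixelFormat>();

    void Start()
    {
        decoder = new PopH264.Decoder(new PopH264.DecoderParams { VerboseDebug = true }, true);
        dump = Resources.Load<TextAsset>("demofileRC1").bytes;
    }

    void Update()
    {
        // push one chunk per Update...
        if (offset < dump.Length)
        {
            int size = Mathf.Min(ChunkSize, dump.Length - offset);
            byte[] chunk = new byte[size];
            System.Array.Copy(dump, offset, chunk, 0, size);
            decoder.PushFrameData(chunk, frameNumber++);
            offset += size;
        }

        // ...and poll for decoded frames every Update
        var frameMeta = decoder.GetNextFrameAndMeta(ref planes, ref pixelFormats);
        if (frameMeta.HasValue)
            Debug.Log($"Decoded frame: planes={frameMeta.Value.PlaneCount}");
    }

    void OnDestroy()
    {
        if (decoder != null)
            decoder.Dispose();
    }
}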

Birkenpapier commented 8 months ago

Thank you very much once again for your quick response.

After some tweaking of my integration and digging deeper into your C# interface I now get this output (screenshot attached).

This is the adjusted H264Decoder.cs file, which uses your PopH264.cs:

using UnityEngine;
using System.Collections.Generic;

public class H264Decoder : MonoBehaviour
{
    // Textures for Y and Chroma
    private Texture2D yTexture;
    private Texture2D chromaUVTexture;

    // Decoder instance
    private PopH264.Decoder decoder;

    // Texture to display the decoded frame
    private Texture2D decodedTexture;
    private Renderer rendererGO;

    // Use this for initialization
    void Start()
    {
        PopH264.DecoderParams decoderParams = new PopH264.DecoderParams
        {
            AllowBuffering = false,
            DoubleDecodeKeyframe = true,
            DropBadFrames = true,
            VerboseDebug = true,

            Decoder = "",
    };

        decoder = new PopH264.Decoder(decoderParams, true);

        // Initialize the texture
        decodedTexture = new Texture2D(1920, 1088, TextureFormat.R8, false);

        yTexture = new Texture2D(1920, 1088, TextureFormat.R8, false);
        chromaUVTexture = new Texture2D(960, 544, TextureFormat.RG16, false);

        rendererGO = GetComponent<Renderer>();
        if (rendererGO != null)
        {
            rendererGO.material.SetTexture("_YPlane", yTexture);
            rendererGO.material.SetTexture("_UVPlane", chromaUVTexture);
        }

    }

    // This method should be called with incoming H264 buffer data
    public void DecodeAndDisplayFrame(byte[] h264Data, int frameNumber)
    {
        // Push frame data to the decoder
        decoder.PushFrameData(h264Data, frameNumber);

        // Prepare lists to get decoded frame data
        List<Texture2D> planes = new List<Texture2D>();
        List<PopH264.PixelFormat> pixelFormats = new List<PopH264.PixelFormat>();

        // Attempt to get the next frame
        var frameMeta = decoder.GetNextFrameAndMeta(ref planes, ref pixelFormats);

        if (frameMeta.HasValue)
        {
            // Update the texture with the decoded frame
            // Check and reinitialize the texture if necessary
            var metaPlane = frameMeta.Value.Planes[1];

            Debug.Log($"Recreating texture: width={metaPlane.Width}, height={metaPlane.Height}, format={metaPlane.Format}, pixelFormat={metaPlane.PixelFormat} hwaccel={frameMeta.Value.HardwareAccelerated}" +
                $" planecount={frameMeta.Value.PlaneCount}");

            Graphics.CopyTexture(planes[0], yTexture);

            if (planes.Count > 1 && frameMeta.Value.Planes.Count > 1)
            {
                var chromaMeta = frameMeta.Value.Planes[1];
                if (chromaUVTexture.width != chromaMeta.Width || chromaUVTexture.height != chromaMeta.Height)
                {
                    chromaUVTexture = new Texture2D(chromaMeta.Width, chromaMeta.Height, TextureFormat.RG16, false);
                    var renderer = GetComponent<Renderer>();
                    rendererGO.material.SetTexture("_UVPlane", chromaUVTexture);
                }
            }

        }
    }

    void OnDestroy()
    {
        // Clean up the decoder
        if (decoder != null)
        {
            decoder.Dispose();
        }
    }
}

Unfortunately your shader did not work, so I copied the shader from the MS WebRTC repo:

// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License. See LICENSE in the project root for license information.

Shader "Video/YUVFeedClipped (unlit)"
{
    Properties
    {
        [Toggle(MIRROR)] _Mirror("Horizontal Mirror", Float) = 0
        [HideInEditor][NoScaleOffset] _YPlane("Y plane", 2D) = "white" {}
        [HideInEditor][NoScaleOffset] _UVPlane("UV plane", 2D) = "white" {}
        //[HideInEditor][NoScaleOffset] _VPlane("V plane", 2D) = "white" {}
        _Alpha("Alpha", Range(0.0, 1.0)) = 1.0
    }
    SubShader
    {
        Tags { "RenderType"="PostDepth" "Queue"="Transparent+1"}

        Pass
        {
            ZWrite On
            Blend SrcAlpha OneMinusSrcAlpha

            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #pragma multi_compile_instancing
            #pragma multi_compile __ MIRROR

            #include "UnityCG.cginc"

            #define EPSILON       0.02
            #define BLACK_H264_Y  0.063
            #define BLACK_H264_UV 0.502

            struct appdata
            {
                float4 vertex : POSITION;
                float2 uv : TEXCOORD0;
                UNITY_VERTEX_INPUT_INSTANCE_ID
            };

            struct v2f
            {
                float2 uv : TEXCOORD0;
                float4 vertex : SV_POSITION;
                UNITY_VERTEX_OUTPUT_STEREO
            };

            v2f vert(appdata v)
            {
                v2f o;
                UNITY_SETUP_INSTANCE_ID(v);
                UNITY_INITIALIZE_OUTPUT(v2f, o);
                UNITY_INITIALIZE_VERTEX_OUTPUT_STEREO(o);
                o.vertex = UnityObjectToClipPos(v.vertex);
                o.uv = v.uv;
#if UNITY_UV_STARTS_AT_TOP
                o.uv.y = 1 - v.uv.y;
#endif
#ifdef MIRROR
                o.uv.x = 1 - v.uv.x;
#endif
                return o;
            }

            sampler2D _YPlane;
            sampler2D _UVPlane;
            //sampler2D _VPlane;
            float _Alpha;

            half3 yuv2rgb(half3 yuv)
            {
                // The YUV to RBA conversion, please refer to: http://en.wikipedia.org/wiki/YUV
                // Y'UV420p (I420) to RGB888 conversion section.
                half y_value = yuv[0];
                half u_value = yuv[1];
                half v_value = yuv[2];
                half r = y_value + 1.370705 * (v_value - 0.5);
                half g = y_value - 0.698001 * (v_value - 0.5) - (0.337633 * (u_value - 0.5));
                half b = y_value + 1.732446 * (u_value - 0.5);
                return half3(r, g, b);
            }

            inline fixed isBlackH264(half x, half y) {
                return (x < (y + EPSILON) && x >(y - EPSILON));
            }

            fixed isBlackH264_yuv(half3 yuv) {
                return isBlackH264(yuv.x, BLACK_H264_Y) && isBlackH264(yuv.y, BLACK_H264_UV) && isBlackH264(yuv.z, BLACK_H264_UV);
            }

            fixed luminance(half3 x) {
                return max(x.r, max(x.b, x.g));
            }

            fixed4 frag(v2f i) : SV_Target
            {
                //half2 uv = i.vertex.xy / _ScreenParams.xy;
                half3 yuv;
                yuv.x = tex2D(_YPlane, i.uv).r;
                yuv.yz = tex2D(_UVPlane, i.uv).rg;
                //yuv.z = tex2D(_VPlane, i.uv).r;
                half3 rgb = yuv2rgb(yuv);
                fixed isBlack = isBlackH264_yuv(yuv);
                rgb = rgb * !isBlack;
                return fixed4(rgb.rgb, !isBlack * _Alpha);
            }
            ENDCG
        }
    }
}

But this is not my biggest concern. I also logged whether the HW-accelerated decoder is being used, and it is reported as false:

Recreating texture: width=960, height=544, format=ChromaUV_88, pixelFormat=ChromaUV_88 hwaccel=False planecount=2

I assume this is the reason for the big latency on the decoder side.

The other one is that there is no output at all on the HoloLens 2. The DLL gets loaded successfully, but there is no output from the DLL itself or my debug messages. Is the DLL perhaps built as debug?

May I kindly ask if you have any other suggestions on how we can proceed? Unfortunately this is blocking us from switching completely to your decoder implementation.

BR,

Kevin

SoylentGraham commented 8 months ago

May I kindly ask if you have any other suggestions on how we can proceed? Unfortunately this is blocking us from switching completely to your decoder implementation.

You're still not looking for new frames in quite the right way (I don't know how often you're getting data in, so not sure if this is the source of any problems...

This is what I wrote before

It doesn't look too bad at a glance, but you're only checking for frames when you push data in....(and you're throwing away small buffers? why?) The library is more asynchronous than that. Push data whenever you have it, and check for new frames GetNextFrameAndMeta every update

You're still only checking for frames when you push, instead of checking for new frames all the time

SoylentGraham commented 8 months ago

if (frameMeta.HasValue)
        {
            // Update the texture with the decoded frame
            // Check and reinitialize the texture if necessary
            var metaPlane = frameMeta.Value.Planes[1];

            Debug.Log($"Recreating texture: width={metaPlane.Width}, height={metaPlane.Height}, format={metaPlane.Format}, pixelFormat={metaPlane.PixelFormat} hwaccel={frameMeta.Value.HardwareAccelerated}" +
                $" planecount={frameMeta.Value.PlaneCount}");

            Graphics.CopyTexture(planes[0], yTexture);

            if (planes.Count > 1 && frameMeta.Value.Planes.Count > 1)
            {
                var chromaMeta = frameMeta.Value.Planes[1];
                if (chromaUVTexture.width != chromaMeta.Width || chromaUVTexture.height != chromaMeta.Height)
                {
                    chromaUVTexture = new Texture2D(chromaMeta.Width, chromaMeta.Height, TextureFormat.RG16, false);
                    var renderer = GetComponent<Renderer>();
                    rendererGO.material.SetTexture("_UVPlane", chromaUVTexture);
                }
            }

        }

You don't really need any of this... you're doing...

Just do this (although it still isn't handling 3 planes, in this case on this device it's okay):

// hold onto textures for re-use
List<Texture2D> planes = new List<Texture2D>();
List<PopH264.PixelFormat> pixelFormats = new List<PopH264.PixelFormat>();

void Update()
{
    // look for new frames
    var frameMeta = decoder.GetNextFrameAndMeta(ref planes, ref pixelFormats);
    if (frameMeta.HasValue)
    {
        rendererGO.material.SetTexture("_YPlane", planes[0]);
        rendererGO.material.SetTexture("_UVPlane", planes[1]);
    }
}
Birkenpapier commented 8 months ago

May I kindly ask if you have any other suggestions on how we can proceed? Unfortunately this is blocking us from switching completely to your decoder implementation.

You're still not looking for new frames in quite the right way (I don't know how often you're getting data in, so not sure if this is the source of any problems...

This is what I wrote before

It doesn't look too bad at a glance, but you're only checking for frames when you push data in....(and you're throwing away small buffers? why?) The library is more asynchronous than that. Push data whenever you have it, and check for new frames GetNextFrameAndMeta every update

You're still only checking for frames when you push, instead of checking for new frames all the time

Indeed, thank you very much for pointing that out again. I'll switch the logic and update you again when I've tested it.

Birkenpapier commented 8 months ago

Thank you very much! It worked with the colourspace (screenshot attached).

I switched the logic as you described: we now push frames to the decoder directly after they arrive, but read the decoded frames in the Update method:

using UnityEngine;
using System.Collections.Generic;
using API.Utils;

public class H264Decoder : MonoBehaviour
{
    // Textures for Y and Chroma
    private Texture2D yTexture;
    private Texture2D chromaUVTexture;

    // Decoder instance
    private PopH264.Decoder decoder;

    private Renderer rendererGO;

    // Use this for initialization
    void Start()
    {
        // Initialize the decoder with parameters (if needed)
        // PopH264.DecoderParams decoderParams = new PopH264.DecoderParams();

        PopH264.DecoderParams decoderParams = new PopH264.DecoderParams
        {
            AllowBuffering = false,
            DoubleDecodeKeyframe = true,
            DropBadFrames = true,
            VerboseDebug = true, // to get more logs which might help in debugging

            Decoder = "",
        };

        // Customize decoderParams as needed
        decoder = new PopH264.Decoder(decoderParams, true);

        // Initialize the textures for Y and ChromaUV
        yTexture = new Texture2D(1920, 1088, TextureFormat.R8, false);
        chromaUVTexture = new Texture2D(960, 544, TextureFormat.RG16, false); // Assuming ChromaUV is half resolution

        rendererGO = GetComponent<Renderer>();
        if (rendererGO != null)
        {
            rendererGO.material.SetTexture("_YPlane", yTexture);
            rendererGO.material.SetTexture("_UVPlane", chromaUVTexture); // This should be the ChromaUV texture
        }
    }

    void Update()
    {
        DisplayFrame();
    }

    // This method should be called with incoming H264 buffer data
    public void DecodeAndDisplayFrame(byte[] h264Data, int frameNumber)
    {
        // Push frame data to the decoder
        decoder.PushFrameData(h264Data, frameNumber);
    }

    public void DisplayFrame()
    {
        try
        {
            // Prepare lists to get decoded frame data
            List<Texture2D> planes = new List<Texture2D>();
            List<PopH264.PixelFormat> pixelFormats = new List<PopH264.PixelFormat>();

            // Attempt to get the next frame
            var frameMeta = decoder.GetNextFrameAndMeta(ref planes, ref pixelFormats);

            if (frameMeta.HasValue)
            {
                // Update the texture with the decoded frame
                // Check and reinitialize the texture if necessary
                var metaPlane = frameMeta.Value.Planes[1];

                Debug.Log($"Recreating texture: width={metaPlane.Width}, height={metaPlane.Height}, format={metaPlane.Format}, pixelFormat={metaPlane.PixelFormat} hwaccel={frameMeta.Value.HardwareAccelerated}" +
                    $" planecount={frameMeta.Value.PlaneCount}");

                rendererGO.material.SetTexture("_YPlane", planes[0]);
                rendererGO.material.SetTexture("_UVPlane", planes[1]);
            }
        }
        catch
        {
            ApoDebug.LogError("[PopH264] DisplayFrame failed!");
        }
    }

    void OnDestroy()
    {
        // Clean up the decoder
        if (decoder != null)
        {
            decoder.Dispose();
        }
    }
}

Thank you very much for your help! Unfortunately it did not change anything visible in terms of latency, and hardware acceleration is still disabled for some reason. Is there something we forgot to set?

The other one is that there is no output at all on the HoloLens 2. The DLL gets loaded successfully, but there is no output from the DLL itself or my debug messages. Is the DLL perhaps built as debug?

Regarding the usage on the HL2, may I kindly point you to the question above.

SoylentGraham commented 8 months ago

The other one is that there is no output at all on the HoloLens 2.

And no errors? Just never get a new frame? (It's possible there are errors in the frame, and no planes, so could you be accessing Planes[0] when the array is empty?)
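As an illustrative guard (sketched from the fields the H264Decoder script above already uses, not a new API), the frame-handling code could check the plane counts before indexing:

if (frameMeta.HasValue)
{
    var meta = frameMeta.Value;
    // Only touch the plane textures if the decoder actually produced them
    if (meta.Planes != null && meta.Planes.Count > 0 && planes.Count > 0)
    {
        rendererGO.material.SetTexture("_YPlane", planes[0]);
        if (meta.Planes.Count > 1 && planes.Count > 1)
            rendererGO.material.SetTexture("_UVPlane", planes[1]);
    }
    else
    {
        // A frame arrived but carries no planes, i.e. an error/empty frame
        Debug.LogWarning("[PopH264] Frame arrived with no planes");
    }
}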

Is the DLL perhaps built as debug?

I'm not sure you'd be able to load it if that were the case. But you can easily build your own (open the project and build! :) to check :) Does the code log the version number anywhere? (That'll confirm the DLL is loading.)

Unfortunately it did not change anything visible in terms of latency, and hardware acceleration is still disabled for some reason. Is there something we forgot to set?

Without trying it myself I can't really guess; it could be a lot of things. If this is Windows, maybe the decoder doesn't like the format of your H264. What resolution, profile, and feature level are you encoding?

Does it run in the test app? https://github.com/NewChromantics/PopH264/issues/83#issuecomment-1911821147

Unless you run from the debugger/visual studio and are seeing a lot of error/debug messages in the output. (If you are, attach them here!) Is the app running slowly, or just decoding slow? (Are you talking about windows here, or HL2? They're not identical...)

For Hardware... it depends on MF providing a hardware decoder, what device are you decoding with? The frame meta should say the decoder being used.

Birkenpapier commented 8 months ago

We haven't tested it with your test app yet. We'll do it now and update you with the results.

Unless you run from the debugger/visual studio and are seeing a lot of error/debug messages in the output. (If you are, attach them here!)

Is this referring to your testapp or to the unity integration?

The app itself is running pretty decently, between 60-80 FPS (screenshot attached). We are referring to the Unity editor, because as of now we don't get any images on the HL2. No error messages whatsoever. The DLL is being loaded without any problems.

The decoder is the MediaFoundation one:

decoder=Microsoft H264 Video Decoder MFT

SoylentGraham commented 8 months ago

Unless you run from the debugger/visual studio and are seeing a lot of error/debug messages in the output. (If you are, attach them here!)

Is this referring to your testapp or to the unity integration?

Unity-built-app

SoylentGraham commented 8 months ago

The decoder is the MediaFoundation one: decoder=Microsoft H264 Video Decoder MFT

Okay, that's the software one, what GPU/H264 decoders do you have? (some nvidia cards don't provide a hardware decoder to mediafoundation) But I wouldn't worry about it, it'll still be faster than you need (Unless you have some whacky encoding it can't cope with :)

Birkenpapier commented 8 months ago

Unless you run from the debugger/visual studio and are seeing a lot of error/debug messages in the output. (If you are, attach them here!)

Is this referring to your testapp or to the unity integration?

Unity-built-app

I ran the built .sln from Unity directly in VS2022 to have a debugger attached at runtime on the HL2. The only message I get regarding PopH264 is this one:

'VSI_apoQlar.exe' (Win32): Loaded 'U:\Users\DefaultAccount\AppData\Local\DevelopmentFiles\VSIHoloLensVS.Release_ARM64.kpeivareh\PopH264.Uwp.dll'.

Nothing else that could pinpoint why it is not showing/executing anything.

On the Win11 machine I have an NVIDIA GeForce RTX 4080 Laptop GPU.

The encoding process is being executed with this script:

#============================================================
# import packages
#============================================================
from grpc_prebuilt_files import streamerbackend_pb2_grpc as sbb_pb2_grpc, streamerbackend_pb2 as sbb_pb2
from grpc_prebuilt_files import stream_pb2_grpc, stream_pb2
from utils.detect_stream_sources import getAllCams
from utils.timestamp import get_current_time_with_milliseconds
from logging.handlers import RotatingFileHandler
import subprocess as sp
import threading
import logging
import signal
import grpc
import uuid
import time
import sys
import os

#============================================================
# class
#============================================================

#============================================================
# properties
#============================================================
# logging
log_formatter = logging.Formatter('%(asctime)s %(levelname)s %(funcName)s(%(lineno)d) %(message)s')
logFile = '/home/apoqlar/logs/gRPC_stream.log'

my_handler = RotatingFileHandler(logFile, mode='a', maxBytes=5*1024*1024, 
                                 backupCount=2, encoding=None, delay=0)
my_handler.setFormatter(log_formatter)
my_handler.setLevel(logging.INFO)

app_log = logging.getLogger('root')
app_log.setLevel(logging.INFO)

app_log.addHandler(my_handler)
# sys.stdout.write = app_log.info
# sys.stderr.write = app_log.error
# end config logging

CHANGE_DEVICE = False

SECOND_SIZE = 0
TIC_REFERENCE = 0

STREAMING_STATUS = None

GUID = str(uuid.uuid4()) # generate uuid for angular app

IMG_W = 1280
IMG_H = 720

time.sleep(2) # let the Jetson boot

# automatic detection if USB video source is connected
cams = getAllCams()
print(f"allCams: {cams}")

# get USB device link
# video_source = '/home/apoqlar/Downloads/output_long.mp4'
""""""
video_source = ''
if (len(cams["links"]) > 1):
    for i, cam in enumerate(cams['name']):
        if 'usb' in (cams['bus'][i]):
            video_source = cams['links'][i]
            print(f"USB cam: {cam, i} and adress: {video_source}")
            CHANGE_DEVICE = True
else:
    for i, cam in enumerate(cams['name']):
        if 'PCI' in (cams['bus'][i]):
            video_source = cams['links'][i]
            print(f"PCI cam: {cam, i} and adress: {video_source}")
            CHANGE_DEVICE = False
""""""

# video_source = '/dev/video1'

FFMPEG_BIN = "/home/apoqlar/apoqlar/libs/ffmpeg_6_0/ffmpeg"
# FFMPEG_BIN = "ffmpeg"
ffmpeg_cmd = [ FFMPEG_BIN,
            '-i', video_source,
            '-r', '30',
            '-pix_fmt', 'yuv420p',
            '-c:v', 'h264_nvmpi',
            # '-c:v', 'hevc_nvmpi',
            '-force_key_frames',
            'expr:gte(t,n_forced*0.5)',
            '-preset', 'ultrafast',
            '-threads', '16',
            '-vf', 'scale=1920:1080', # 1280x720
            # '-loglevel', 'error',
            '-an','-sn', # disable audio processing
            '-f', 'image2pipe', '-']

pipe = sp.Popen(ffmpeg_cmd, stdout = sp.PIPE, bufsize=10)
time.sleep(2) # let ffmpeg warm up

#============================================================
# functions
#============================================================
def detect_devices():
    """detecting video devices on host system"""
    global video_source, CHANGE_DEVICE, ffmpeg_cmd, pipe

    cams = getAllCams()
    print(f"detected cams: {cams}")

    # if (len(cams) > 1) and (CHANGE_DEVICE is False):
    if (len(cams["links"]) > 1) and (CHANGE_DEVICE is False):
        pipe.kill()
        os.kill(os.getpid(),signal.SIGKILL)

        video_source = ''
        for i, cam in enumerate(cams['name']):
            if 'usb' in (cams['bus'][i]):
                video_source = cams['links'][i]
                print(f"USB cam: {cam, i} and adress: {video_source}")
                CHANGE_DEVICE = True
        # video_source = '/dev/video1'

        pipe.stdout.flush()

        ffmpeg_cmd = ffmpeg_cmd

        pipe = sp.Popen(ffmpeg_cmd, stdout = sp.PIPE, bufsize=10)
        time.sleep(2) # let ffmpeg warm up
        CHANGE_DEVICE = True
    # elif (len(cams) <= 1) and (CHANGE_DEVICE is True):
    elif (len(cams["links"]) <= 1) and (CHANGE_DEVICE is True):
        pipe.kill()
        os.kill(os.getpid(),signal.SIGKILL)

        # TODO: remove the for loop due to len <= 1
        video_source = ''
        for i, cam in enumerate(cams['name']):
            if 'PCI' in (cams['bus'][i]):
                video_source = cams['links'][i]
                print(f"PCI cam: {cam, i} and adress: {video_source}")
                CHANGE_DEVICE = False
        # video_source = '/dev/video0'

        pipe.stdout.flush()

        ffmpeg_cmd = ffmpeg_cmd

        pipe = sp.Popen(ffmpeg_cmd, stdout = sp.PIPE, bufsize=10)
        time.sleep(2) # let ffmpeg warm up
        CHANGE_DEVICE = False

def stream():
    global SECOND_SIZE, TIC_REFERENCE, STREAMING_STATUS, GUID, REC
    while True:
        tic = time.perf_counter()

        h265_buffer = pipe.stdout.read(20000) # 1280 * 720 * 3

        message_metadata = []
        meta_object = stream_pb2.VideoData.Metadata(type = '.JPEG', width = 1280, height = 720, fps = 30.0)
        message_metadata.append(meta_object)

        # data = stream_pb2.VideoData(buffer = h265_buffer, id = GUID, meta = iter(message_metadata))
        # timestamp integration; TODO: add extra field in stream.proto
        t_stamp = get_current_time_with_milliseconds()
        data = stream_pb2.VideoData(buffer = h265_buffer, id = t_stamp, meta = iter(message_metadata))

        # print("Sending data")
        yield data

        # calculate B/s
        old_tic = int(tic)
        try:
            SECOND_SIZE = SECOND_SIZE + sys.getsizeof(h265_buffer)
        except Exception as e:
            print(f"error: {e}")
            app_log.error(f"error: {e}")

        # SECOND_SIZE = SECOND_SIZE + sys.getsizeof(buf.tobytes())

        if old_tic > TIC_REFERENCE:
            print(f"size of one second: {SECOND_SIZE} bytes")
            app_log.info(f"size of one second: {SECOND_SIZE} bytes")
            TIC_REFERENCE = old_tic
            SECOND_SIZE = 0

            """"""
            if (TIC_REFERENCE % 10) == 0: # 10 seconds to wait/check
                app_log.info(f"looking for other video ressources")
                t_detect = threading.Thread(target=detect_devices)
                t_detect.daemon = True
                t_detect.start()
            """"""

def stream_info():
    # global STREAMING_STATUS

    data = sbb_pb2.StreamingStatusInfo(streaming = "yes", nclients = "2")
    # STREAMING_STATUS = data.order

    return data

def run():
    global STREAMING_STATUS

    channel = grpc.insecure_channel('127.0.0.1:50052')
    backend_channel = grpc.insecure_channel('127.0.0.1:11952')

    stub = stream_pb2_grpc.StreamerServiceStub(channel)
    backend_stub = sbb_pb2_grpc.StreamerBackendServerStub(backend_channel)
    print('SenderClient start')
    app_log.info(f"SenderClient start")

    try:
        responses = stub.LiveStreamVideo(stream())
        print("1")
        for res in responses:
            backend_responses = backend_stub.GetStreamingStatus(stream_info()) # to check every time the status and not once
            STREAMING_STATUS = backend_responses.order
            # print(res)

            if res is None:
                run()

            continue
    except grpc.RpcError as e:
        print("ERROR: ")
        print(e.details())
        app_log.error(f"error: {e.details()}")
        run() # if python backend server is down 

#============================================================
# Awake
#============================================================

#============================================================
# main
#============================================================
if __name__ == '__main__':
    run()

#============================================================
# after the App exit
#============================================================
# nothing to do here

May I provide you with more information? Please ask for anything you think might be of interest for debugging.

SoylentGraham commented 8 months ago

Try setting a profile & level for the h264 encoding to see if it makes any difference: ffmpeg -profile:v main -level:v 3.0
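(In the encoder script above, these would presumably be added to the ffmpeg_cmd list as '-profile:v', 'main', '-level:v', '3.0' alongside the other video options; the exact placement is an assumption.)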

Birkenpapier commented 8 months ago

Unfortunately it did not change anything. But thank you very much for passing on this information.

SoylentGraham commented 8 months ago

Okay, well, when I get some free time I'll take your stream dump, run it through the test app on a Windows machine and see what it does, as per https://github.com/NewChromantics/PopH264/issues/83#issuecomment-1911821147

I don't have a HoloLens 2 to hand any more (it went missing when I lent it to another company), so I cannot try it on hardware, but running the UWP test app on Windows does behave differently to the proper win32 version.

Birkenpapier commented 8 months ago

If I can help you with anything, please let me know. I have the proper hardware with me and can test anything you would want me to.

SoylentGraham commented 8 months ago

Yes, as I said a few times, just modify the test app! Like https://github.com/NewChromantics/PopH264/commit/0452ffbba65eb45f3ba5c056b4cf065171682417 put your data in there, then run it on windows, windows UWP, and on hololens

Birkenpapier commented 8 months ago

Yes, as I said a few times, just modify the test app! Like 0452ffb put your data in there, then run it on windows, windows UWP, and on hololens

Understood. It's already in the making and I will let you know. I deeply appreciate all your help!