Closed kpatil001 closed 4 years ago
The application which I am building requires video and audio to be streamed from the frontend to the backend (which consists of various AWS services). At the backend, video frames should be fed to Rekognition to detect the user's emotion.
I am using amazon-kinesis-video-streams-webrtc-sdk-c to create the viewer of the signaling channel. The Rekognition piece is exposed as a REST service written in Java, where I am using the DetectFaces API to detect emotions, and the viewer written in C will send the image data as a payload to the Rekognition service.
So how can I convert the frame data from the C WebRTC Kinesis SDK viewer into a data format that can be consumed by Rekognition for emotion detection?
I am currently referring to kvsWebRTCClientViewer.c from the samples and using the same to create the viewer.
Also, please throw some light on how to extract audio from the signaling channel.
@kpatil001 your question is more of a solution architecture question and has multiple solutions depending on your application and requirements.
Just to note that most WebRTC clients use some sort of standard elementary stream underneath. Out of the box we support an H.264 elementary stream in Annex-B format. The viewer's frame callback will then receive each frame in Annex-B format.
Your application will then need to use the Rekognition APIs to send the frames. I am not very familiar with it, so others might comment. You could also look at some of the samples in Java and try to model it in your app.
The signaling channel is not used for the data/media streaming - it's for signaling. To receive audio you need to use an audio transceiver in WebRTC.
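In the C sample viewer, hooking a frame callback on a transceiver looks something like this (a sketch based on the sample code that appears later in this thread; for audio you would use the audio transceiver member instead of the video one):

    CHK_STATUS(transceiverOnFrame(pSampleStreamingSession->pVideoRtcRtpTransceiver, (UINT64) pSampleStreamingSession, sampleFrameHandler));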
Hi @MushMal Thanks for your response.
I am using amazon-kinesis-video-streams-webrtc-sdk-js at the frontend to stream video and audio. When I write the viewer using the C SDK, will the frame data again contain H.264 Annex-B data? I am asking because I am not doing any H.264 encoding at the frontend.
At the Rekognition end we need png or jpg image data as a ByteArray or ByteBuffer, so currently I am struggling to convert the H.264 Annex-B frame data to a png or jpg byte array. Can you provide some sample or a few directions on how this can be achieved?
If your video is H.264 and Rekognition requires a png image, then you need to decode the H.264 into a set of bitmaps, then encode each bitmap you need to forward into png format.
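For example, if you concatenate the received frames into a single Annex-B file (frames.h264 is just a placeholder name), something like this should let ffmpeg do both steps in one pass - -f h264 forces the raw H.264 demuxer, and the %04d pattern writes one png per decoded frame:

ffmpeg -f h264 -i frames.h264 frame_%04d.png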
Okay. Let me try that out.
Also, I was working with kvsWebRTCClientViewer.c provided in the samples of this SDK and I am using kvsWebRTCClientMaster.c to stream data. The viewer file is calling the sampleFrameHandler(UINT64 customData, PFrame pFrame) function on receiving each frame. Inside the same function I am writing the data obtained from pFrame using the writeFile function:
VOID sampleFrameHandler(UINT64 customData, PFrame pFrame)
{
    UNUSED_PARAM(customData);
    STATUS retStatus = STATUS_SUCCESS;
    DLOGV("Frame received. TrackId: %" PRIu64 ", Size: %u, Flags %u", pFrame->trackId, pFrame->size, pFrame->flags);
    printf("\n Frame received. TrackId: %" PRIu64 ", Size: %u, Flags %u", pFrame->trackId, pFrame->size, pFrame->flags);
    PSampleStreamingSession pSampleStreamingSession = (PSampleStreamingSession) customData;
    if (pSampleStreamingSession->firstFrame) {
        pSampleStreamingSession->firstFrame = FALSE;
        pSampleStreamingSession->startUpLatency = (GETTIME() - pSampleStreamingSession->offerReceiveTime) / HUNDREDS_OF_NANOS_IN_A_MILLISECOND;
        printf("Start up latency from offer to first frame: %" PRIu64 "ms\n", pSampleStreamingSession->startUpLatency);
    }
    retStatus = writeFile("../outputImages/image.h264", TRUE, FALSE, pFrame->frameData, (UINT64) pFrame->size);
    printf("Trying to write file");
    if (retStatus != STATUS_SUCCESS) {
        printf("[KVS Master] writeFile(): operation returned status code: 0x%08x \n", retStatus);
    }
}
Then I am using this image.h264 file as an input to ffmpeg and trying to convert it to png using
ffmpeg -i image.h264 -c:v h264 output1.png
OR
ffmpeg -i image.h264 -c:v libx264 output1.png
It raises following error
[h264 @ 0x7fd87f803800] Format h264 detected only with low score of 1, misdetection possible!
[h264 @ 0x7fd880000600] missing picture in access unit with size 160
[extract_extradata @ 0x7fd87ec03a00] No start code is found.
image.h264: could not find codec parameters
which indicates that NALU headers are not written to the file. So is it possible that the NALU headers are not present in the frame data and hence not correctly written? Or am I doing something wrong?
Hard to tell. You should try to open it in a binary editor and ensure what you are getting is valid Annex-B format in the first place. I am not very familiar with ffmpeg, but it seems that it's able to parse out the NALUs. It's choking on a NALU with size 160 - not sure what this is - perhaps some RBSP or non-VCL NALU. However, it later complains that it can't find the start code - perhaps the Annex-B start code?
Try to dump out a single frame - a key frame for example. Run it under some tools to ensure it's valid. There must be some combination of ffmpeg parameters that could validate it.
Also, think about whether you need to concatenate the frames into a single Annex-B file.
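One thing to check in the handler above: the third argument to writeFile looks like an append flag and is FALSE, so every frame overwrites the previous one. Assuming the utils signature is indeed writeFile(path, binMode, append, buffer, size) - worth confirming in the SDK headers - flipping it to TRUE would accumulate all frames into one Annex-B stream:

    retStatus = writeFile("../outputImages/stream.h264", TRUE, TRUE, pFrame->frameData, (UINT64) pFrame->size);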
@kpatil001 ,
Any updates?
Hi, it is resolved. Actually, in the sample viewer provided in your SDK, pAudioRtcRtpTransceiver is used initially, so the data which I was receiving at the viewer end was actually Opus audio frame data and not H.264 image frame data. When I changed pAudioRtcRtpTransceiver to the video transceiver, I was able to receive H.264 frames and I was also able to convert key frames to proper jpg and png files using ffmpeg.
However, converting P-frames (delta frames) to proper jpg or png files is still something I need to figure out.
That is great! Since the original issue is resolved, I will close this ticket. Feel free to reach out if you have any questions.
@kpatil001 How are you able to retrieve the images? While using ffmpeg -i image.h264 -c:v h264 output1.png or ffmpeg -i image.h264 -c:v libx264 output1.png,
I am getting the error: Invalid data found when processing input
Hi @jadhu22, I think the reason for this error is that not all the H.264 frames which you receive are key frames. If you use any of the above commands on a key frame, you will not get this error. If you use them on a P-frame (delta frame), they won't work, because a P-frame does not contain the data of the entire image.
I remember using the following commands on key frames:
ffmpeg -i image.h264 -c:v h264 output1.png
and ffmpeg -i image.h264 outputImage.jpg
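To pick key frames out in the frame handler, one option is to check pFrame->flags for FRAME_FLAG_KEY_FRAME (the samples set this flag on the send side; whether it is set on the receive path is something to verify). Another is to scan the Annex-B data itself for an IDR NALU. A rough, untested sketch of the latter:

    // Helper (not part of the SDK): scan an Annex-B buffer for an IDR NALU
    // (nal_unit_type 5), i.e. a key frame.
    static BOOL containsIdrNalu(PBYTE pData, UINT32 size)
    {
        UINT32 i, hdr;
        for (i = 0; i + 4 < size; i++) {
            // look for a 00 00 01 or 00 00 00 01 start code
            if (pData[i] == 0 && pData[i + 1] == 0 &&
                (pData[i + 2] == 1 || (pData[i + 2] == 0 && pData[i + 3] == 1))) {
                hdr = i + (pData[i + 2] == 1 ? 3 : 4);
                if ((pData[hdr] & 0x1F) == 5) { // low 5 bits = NAL unit type
                    return TRUE;
                }
                i = hdr;
            }
        }
        return FALSE;
    }

    // In sampleFrameHandler, only dump frames that contain an IDR:
    // if (containsIdrNalu(pFrame->frameData, pFrame->size)) { ... writeFile ... }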
@kpatil001 I get this, how do we get the key frame? Or what is a key frame here? Thank you for your prompt response!!
Hi @kpatil001,
I am trying to use the kvsWebRTCClientViewer.c to receive camera frames and I have added:
CHK_STATUS(transceiverOnFrame(pSampleStreamingSession->pVideoRtcRtpTransceiver, (UINT64) pSampleStreamingSession, sampleFrameHandler));
to my kvsWebRTCClientViewer.c, and I have also changed the sampleFrameHandler function like yours, but writeFile is not working. This is the log that I'm getting:
[KVS Master] writeFile(): operation returned status code: 0x00000009
Is there any chance you know what the issue is?
Hi @setareh-soltanieh, the error code 0x00000009 is STATUS_OPEN_FILE_FAILED. It is coming from here. Not sure how you're using files with the viewer though.
Hey @niyatim23, thanks for getting back to me so quickly. I'm currently working on publishing the received camera frames to a rostopic. Here's what I've been up to:
Hey @setareh-soltanieh, you're welcome. From what I understand, you're trying to write whatever you receive from the viewer into a file and seem to be facing issues with writeFile. The webrtc SDK uses writeFile from a lower layer maintained by KVS. It can be found here. I would recommend making sure that the folder you're trying to write to exists and that this application has the correct permissions to write to it. Try running it without the sample to make sure it works as expected, if that helps.
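A minimal standalone check could look something like this (a sketch - the header path and the writeFile(path, binMode, append, buffer, size) signature are assumptions based on the public SDK headers):

    #include <stdio.h>
    #include <com/amazonaws/kinesis/video/webrtcclient/Include.h>

    int main(void)
    {
        BYTE data[] = {0x00, 0x00, 0x00, 0x01}; // a few dummy bytes
        STATUS status = writeFile((PCHAR) "temp_frame.h264", TRUE, FALSE, data, (UINT64) sizeof(data));
        printf("writeFile returned 0x%08x\n", (unsigned int) status);
        return status == STATUS_SUCCESS ? 0 : 1;
    }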
Hi @niyatim23, I am trying to save the .h264 camera frames to my computer and convert them to .jpg using an ffmpeg command. Here is the code that I am using for this:
VOID setFrameHandler(UINT64 customData, PFrame pFrame)
{
    UNUSED_PARAM(customData);
    STATUS retStatus = STATUS_SUCCESS;
    PSampleStreamingSession pSampleStreamingSession = (PSampleStreamingSession) customData;
    if (pSampleStreamingSession->firstFrame) {
        pSampleStreamingSession->firstFrame = FALSE;
        printf("Start up latency from offer to first frame: %" PRIu64 "ms\n", pSampleStreamingSession->startUpLatency);
    }
    retStatus = writeFile("temp_frame.h264", TRUE, FALSE, pFrame->frameData, (UINT64) pFrame->size);
    if (retStatus != STATUS_SUCCESS) {
        printf("[KVS Master] writeFile(): operation returned status code: 0x%08x \n", retStatus);
    }
    // Constructing the ffmpeg command
    char ffmpegCommand[128];
    snprintf(ffmpegCommand, sizeof(ffmpegCommand), "ffmpeg -i temp_frame.h264 outputImage_%u.jpg", pFrame->index);
    // Executing the ffmpeg command
    system(ffmpegCommand);
    // Removing the temporary file
    remove("temp_frame.h264");
}
The problem here is that I am only able to convert the first frame to jpg because that's the key frame. Does anybody know how I can convert other frames (p-frames) to jpg files?
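A P-frame only encodes the difference from earlier frames, so it cannot be decoded in isolation - the decoder needs the stream from the previous key frame onward. One way around this (an untested sketch, assuming the third writeFile argument is the append flag, as suggested earlier in the thread) is to accumulate every received frame into one growing Annex-B file instead of overwriting a temp file per frame:

    // In the frame handler: append each frame to a single Annex-B stream
    retStatus = writeFile("stream.h264", TRUE, TRUE, pFrame->frameData, (UINT64) pFrame->size);

and then run ffmpeg once over the whole stream so it can decode key frames and P-frames alike:

ffmpeg -f h264 -i stream.h264 frame_%04d.jpg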