Consistent error "Failed to submit frame to Kinesis Video client. status: 0x32000002" on Raspberry Pi

Shumakriss commented 6 years ago

Hello and thank you for this awesome tool, it's helping my project come to life!

When running the GST video demo, I receive an error continuously on my Raspberry Pi. "Failed to submit frame to Kinesis Video client. status: 0x32000002". I wasn't sure if I would file a ticket for this but I am stumped and I know the code was just published in the last few months so maybe I am not alone. Any recommendations would be appreciated!

Edit: I should mention I am using this with Rekognition and hopefully the web preview tool in the web console. It looks like the RPi is still using h264 but I'm not familiar with the specific encoder selected by the code for RPi.

I have gone through some steps to fix heap allocation size and to make sure /dev/video0 is available. I have also built with debug logs but not since this issue. I can rerun with debug info soon (need to recompile on the RPi, takes a while). I have this demo working fine on two Macs but recently got it compiling on the RPi.

I am running with this command: AWS_ACCESS_KEY_ID= AWS_SECRET_ACCESS_KEY= AWS_DEFAULT_REGION= ./kinesis_video_gstreamer_sample_app from the kinesis-video-native-build directory.

I believe this error message matches from this page: https://docs.aws.amazon.com/kinesisvideostreams/latest/dg/producer-sdk-errors.html

That page says this is an MKVGen error "STATUS_MKV_INVALID_FRAME_TIMESTAMP"

Raspberry Pi B+
<512MB RAM, tweaked kinesis_video_gstreamer_sample_app.cpp to use 256MB

PRETTY_NAME="Raspbian GNU/Linux 9 (stretch)"
NAME="Raspbian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"

Linux raspberrypi 4.14.18+ #1093 Fri Feb 9 15:07:36 GMT 2018 armv6l GNU/Linux

dpkg --get-selections | grep -i x264
libx264-148:armhf install
libx264-dev:armhf install
x264 install

dpkg --get-selections | grep -i gstream
gir1.2-gstreamer-1.0 install
gstreamer1.0-alsa:armhf install
gstreamer1.0-libav:armhf install
gstreamer1.0-omx install
gstreamer1.0-omx-rpi install
gstreamer1.0-omx-rpi-config install
gstreamer1.0-plugins-bad:armhf install
gstreamer1.0-plugins-base:armhf install
gstreamer1.0-plugins-base-apps install
gstreamer1.0-plugins-good:armhf install
gstreamer1.0-plugins-ugly:armhf install
gstreamer1.0-rtsp:armhf install
gstreamer1.0-tools install
gstreamer1.0-x:armhf install
libgstreamer-plugins-bad1.0-0:armhf install
libgstreamer-plugins-bad1.0-dev install
libgstreamer-plugins-base1.0-0:armhf install
libgstreamer-plugins-base1.0-dev install
libgstreamer1.0-0:armhf install
libgstreamer1.0-dev install

Thanks again!

MushMal commented 6 years ago

@Shumakriss this error is caused when you have frame timeline overlap. For example: Dts(N) + Duration(N) > Dts(N+1). This means that the Nth frame's decoding timestamp plus that frame's duration is greater than the next frame's decoding timestamp.
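As a rough illustration of the overlap condition, here is a minimal sketch. It is not SDK code: `FrameTiming`, `overlaps_next`, and `first_overlap` are hypothetical names, and the values are assumed to be in 100 ns units.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the timing fields of a frame (assumed 100 ns units).
struct FrameTiming {
    std::uint64_t dts;       // decoding timestamp
    std::uint64_t duration;  // frame duration
};

// True when frame `cur` runs past the start of frame `next`,
// i.e. Dts(N) + Duration(N) > Dts(N+1).
inline bool overlaps_next(const FrameTiming &cur, const FrameTiming &next) {
    return cur.dts + cur.duration > next.dts;
}

// Index of the first frame that overlaps its successor, or -1 if none does.
inline long first_overlap(const std::vector<FrameTiming> &frames) {
    for (std::size_t i = 0; i + 1 < frames.size(); ++i) {
        if (overlaps_next(frames[i], frames[i + 1])) {
            return static_cast<long>(i);
        }
    }
    return -1;
}
```

With a 20 ms duration (200000 units of 100 ns), two frames overlap only when their decoding timestamps are less than 20 ms apart, so running a check like this over logged timestamps should show whether the encoder ever emits frames that close together.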

The demo application is configured to have frame duration of 20 milliseconds. While I am not sure why this is an issue in your case (perhaps an unstable encoder issue), could I ask you to lower the value of the duration to say 10 milliseconds and retry? Please include a little more information whether this happens with every frame or whether this is an intermittent issue which can be an indication of encoder stability. Could you also dump out the timestamps?

The function you need is

    void create_kinesis_video_frame(Frame *frame, const nanoseconds &pts, const nanoseconds &dts,
                                    FRAME_FLAGS flags, void *data, size_t len) {
        frame->flags = flags;
        frame->decodingTs = static_cast<UINT64>(dts.count()) / DEFAULT_TIME_UNIT_IN_NANOS;
        frame->presentationTs = static_cast<UINT64>(pts.count()) / DEFAULT_TIME_UNIT_IN_NANOS;
        frame->duration = 20 * HUNDREDS_OF_NANOS_IN_A_MILLISECOND;
        frame->size = static_cast<UINT32>(len);
        frame->frameData = reinterpret_cast<PBYTE>(data);
    }

Please include some instrumentation like: LOG_DEBUG("frame dts: " << frame->decodingTs << " pts: " << frame->presentationTs);

Regards, Mushegh

MushMal commented 6 years ago

Forgot to mention, GStreamer might return a sentinel value for an invalid timestamp in which case the frame timestamps will be the same causing the error you've mentioned. Dumping out the values of the timestamps will help us to debug it further.
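One way to guard against such a sentinel before the timestamp is converted is sketched below. This is only an illustration under assumptions: `kClockTimeNone` is meant to mirror GStreamer's `GST_CLOCK_TIME_NONE` (all bits set), `kTimeUnitInNanos` is meant to mirror the SDK's `DEFAULT_TIME_UNIT_IN_NANOS`, and `to_ticks` is a hypothetical helper, not part of either library.

```cpp
#include <cstdint>

// Assumption: mirrors GStreamer's GST_CLOCK_TIME_NONE, i.e. (GstClockTime)-1.
constexpr std::uint64_t kClockTimeNone = UINT64_MAX;
// Assumption: mirrors the SDK's DEFAULT_TIME_UNIT_IN_NANOS (100 ns per tick).
constexpr std::uint64_t kTimeUnitInNanos = 100;

// Converts a nanosecond timestamp to 100 ns ticks and stores it in `out`.
// Returns false, leaving `out` untouched, when the input is the
// "no timestamp" sentinel, so the caller can skip or repair the frame.
inline bool to_ticks(std::uint64_t ns, std::uint64_t &out) {
    if (ns == kClockTimeNone) {
        return false;
    }
    out = ns / kTimeUnitInNanos;
    return true;
}
```

A caller could log and drop the frame when `to_ticks` returns false instead of passing a meaningless value on to the SDK.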

unicornss commented 6 years ago

Hi @Shumakriss are you able to send streams from your PI device?

Shumakriss commented 6 years ago

Hi @MushMal and @unicornss thank you for your replies! Sorry for the slow response.

I had to make some hardware changes for my project and cannot repeat this exact issue now. Currently, I am working through issue #18. Looks like my camera is not registering correctly on the new install.

If I get the same error, I will try out the function @MushMal sent as well as add some debug logs for the timestamps.

unicornss commented 6 years ago

Looks like my camera is not registering correctly

Could you elaborate on this? Adding logs will help us to debug your issue.

Can you also try the steps https://github.com/awslabs/amazon-kinesis-video-streams-producer-sdk-cpp/issues/16 in case if the driver is not loaded properly in your custom Raspberry PI hardware?

Shumakriss commented 6 years ago

Hi @unicornss, it turns out that the SUNNY connector on my picamera was loose. What I meant by that statement was that the commands listed for troubleshooting:

vcgencmd get_camera
ls /dev/video*

were showing that the camera was not detected, even after a firmware update via rpi-update and a full system update with apt.

I did some more reading and fixed the SUNNY connector and the camera is detected. When I reported this issue, it was detected on that setup as well. I am trying to reproduce this issue but I am stuck on #18. I have recompiled with the debug logs as well, so if I do run into this after fixing #18, I will have them ready.

Thanks for your help!

Shumakriss commented 6 years ago

Recompiled with debug logging but I am still reading through them myself. I will upload them for expediency. Error occurs on line 284.

output.txt

Shumakriss commented 6 years ago

I have fixed one problem which wasn't really related to issue 18 after all, just the same symptoms. However, that makes this two boards/installations that I get this error. Digging into the logs, I can only see this error printed one time. I don't see anything in the AWS console stream monitor or in the parser library sample app (which works when I use the gstreamer sample on OSX instead of RPi).

Edit: I only recompiled with the HEAP_DEBUG and LOG_STREAMING definitions. I don't have the function you sent nor do I have any timestamps logged. I will do that shortly.

Shumakriss commented 6 years ago

Actually, @MushMal, I am not sure which value you are recommending for the frame duration: 10 milliseconds or 20? You said the default was 20 and I should try lowering to 10 but the function you sent was 20. I assume you were just pointing to the relevant function but when I looked it up, I found it was already set to 10. I am going to give 20 a shot, just to see what happens. Still need to add timestamp logs, will do soon!

Shumakriss commented 6 years ago

@MushMal I tried both 10 and 20 with the same result. However, it looks like this only happens once shortly after startup. The application runs fine and even logs that it's writing 16372 bytes at a time to Kinesis Video. So, perhaps that is not the issue. However, my console and the parser-library demo app remain black screens with the gstreamer demo on raspberry pi unlike on OSX.

That seems like a separate symptom/issue to me. output-2.txt

Shumakriss commented 6 years ago

Added debugging for timestamps. output-3.txt

masda70 commented 6 years ago

Hello, I've been encountering this error as well. I keep the application constantly running and the error seems to occur every business day around a particular time.

I'm using the latest version (https://github.com/awslabs/amazon-kinesis-video-streams-producer-sdk-cpp/commit/4b47f63f636cff6106c6079e319e9d0c3248c087).

Since the log grows fast, I haven't kept it all and I can't tell you what exactly is logged when the application breaks, but I know that once it breaks it keeps showing the following error after displaying getKinesisVideoMetrics:

ERROR - Failed to submit frame to Kinesis Video client. status: 0x32000002 decoding timestamp: 652886581679 presentation timestamp: 652885914643
Dropped frame!

Followed by a sequence of the following:

viewItemRemoved(): Reporting a dropped frame/fragment.
WARN - Reporting dropped frame. Frame timecode 651685460000

MushMal commented 6 years ago

Sorry for the delay - have been out.

@Shumakriss, going through the logs.

The first log, output.txt:

- You have a timeout which was recovered after an automatic retry. Please make sure your network is properly configured and that you have adequate bandwidth.
- You are getting 0x32000011. This is an indication that your media pipeline produces a key frame (an Idr frame) with different presentation timestamp and decoding timestamp. This is indicative of an encoder issue. We will try to look further into it.
- You are getting a MAX_FRAGMENT_DURATION_REACHED error ACK returned. This is again an indication that your encoder is not operating properly.

output-2.txt:

- Same 0x32000011 error.
- The logs are short, so I don't see any ACKs coming after the buffering.

output-3.txt:

- Same 0x32000011 error.
- As above, the logs are short, so I don't see any further ACKs.

In conclusion, your encoder doesn't produce proper key frames (Idr frames) so the SDK fails to submit the key frame.

We can do some research and get back to you on what could potentially be wrong and some suggestions to try shortly

MushMal commented 6 years ago

@masda70 could I ask you to open a separate issue to track this? Yours seems to be a different issue.

Please include the full logs and some description of your scenario - what are you trying to achieve after the stream has been ingested.

masda70 commented 6 years ago

@MushMal Ok: https://github.com/awslabs/amazon-kinesis-video-streams-producer-sdk-cpp/issues/42

Shumakriss commented 6 years ago

@MushMal @unicornss

I want to stream audio, video, and sensor data from my Raspberry Pi. I just recently found this thread and now I am questioning my choice of Kinesis Video. I am using KVS with Rekognition though.

I could probably handle sensor data on Kinesis data streams but not sure what to do about audio. I don't need ABR or multi-choice support. Any recommendations?

MushMal commented 6 years ago

@Shumakriss could you please elaborate. What's your scenario - what are you trying to accomplish? As I mentioned in that thread, the modeled behavior we are trying to push is to use separate streams for different types of data. You could create a video stream and an audio stream. Those are entirely independent but the frame timestamps are aligned. This is the key point to any A/V sync or sensor fusion solutions.

As I mentioned in the thread, if you for some reason need each video frame to be accompanied by an audio frame and a sensor sample then you can package those in your own custom frame format. Please bear in mind that the KVS console playback will only work for h264 streams.

Shumakriss commented 6 years ago

@MushMal Thanks for your help. I'm building a bot to interact with developers based on build status. I use Rekognition to know when I have found the developer who broke the build. It is a fun way to promote the "visualize the build" concept. I plan to make it more general purpose, make it a telepresence device, and eventually work on other AI tasks. I can handle my original use case with just video but now I am ready to move to the next steps.

The old version was actually a testbed for me to learn Kafka: https://github.com/Shumakriss/build_butler-2.0 and I switched to AWS to learn about the AWS platform and improve the "product."

MushMal commented 6 years ago

Sounds quite interesting. I do strongly encourage the architectural pattern I suggested - use separate streams to carry different types of data. Here are a few advantages:

- Different frame rate/settings/tolerances/priorities of different streams.
- Different retention periods.
- Some devices/clients can produce some streams while others can produce all of them - your solution will check for presence of the stream data on the consumer end.
- Consumer stream prioritization - in low-bandwidth scenario your consumer application could choose to stop pulling the high-density or low-priority stream. For example, in case of Audio/Video playback, Audio has far higher priority than video and it's far less dense. The playback heuristics engine would prioritize the audio fragments over the video.
- Forward compatibility of your solution.
- Downstream stream specific consumers would just consume the streams they are interested in only.

...

Shumakriss commented 6 years ago

I may have more information. I am able to run the stream parser library demo. When I uncomment @Ignore and stream from my Mac, I can see the image in the demo app as well as in the AWS console. However, when I stream from the RPi, I catch a Throwable in GetMediaWorker. It appears there is an issue decoding the stream, which I think adds merit to your theory about the encoder setup.

INFO: {kinesisvideo, us-east-1} was not found in region metadata, trying to construct an endpoint using the standard pattern for this region: 'kinesisvideo.us-east-1.amazonaws.com'.
java.lang.ArrayIndexOutOfBoundsException
[ERROR] 2018-03-06 21:10:31.033 [pool-2-thread-1] GetMediaWorker - Failure in GetMediaWorker for streamName my_stream
java.lang.ArrayIndexOutOfBoundsException

at org.jcodec.codecs.h264.decode.SliceDecoder.putMacroblock(SliceDecoder.java:198)
at org.jcodec.codecs.h264.decode.SliceDecoder.decodeMacroblocks(SliceDecoder.java:104)
at org.jcodec.codecs.h264.decode.SliceDecoder.decodeFromReader(SliceDecoder.java:73)
at org.jcodec.codecs.h264.H264Decoder$FrameDecoder.decodeFrame(H264Decoder.java:152)
at org.jcodec.codecs.h264.H264Decoder.decodeFrameFromNals(H264Decoder.java:103)
at com.amazonaws.kinesisvideo.parser.examples.KinesisVideoRendererExample$ParsingVisitor.visit(KinesisVideoRendererExample.java:183)
at com.amazonaws.kinesisvideo.parser.mkv.MkvDataElement.accept(MkvDataElement.java:127)
at com.amazonaws.kinesisvideo.parser.mkv.visitors.CompositeMkvElementVisitor.visitAll(CompositeMkvElementVisitor.java:63)
at com.amazonaws.kinesisvideo.parser.mkv.visitors.CompositeMkvElementVisitor.visit(CompositeMkvElementVisitor.java:54)
at com.amazonaws.kinesisvideo.parser.mkv.MkvDataElement.accept(MkvDataElement.java:127)
at com.amazonaws.kinesisvideo.parser.examples.GetMediaWorker.run(GetMediaWorker.java:73)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)

There is a Raspberry Pi tutorial.

I have attached my libraries for reference. libraries.txt

Update 1: The stack trace is not 100% consistent. It is frequent and seems to be more likely to happen the more recently the producer was started and decreases in frequency over time.

Update 2: I tried using x264enc gst element factory instead of omxh264enc but got the same results.

Update 3: Found that the Java parser demo gets EOF after running the gstreamer demo for a while. The gstreamer demo starts to drop frames consistently after 5-10 minutes or so (dropped-frames.txt). The parser does seem to fail consistently after the first simple block it tries to decode. I turned up the log level to debug and extended jcodec with some extra logs (parser-logs.txt). Lastly, I have found that gstreamer does log two warnings:

0:00:03.281817108 4807 0xe46860 WARN v4l2bufferpool gstv4l2bufferpool.c:748:gst_v4l2_buffer_pool_start:<source:pool:src> Uncertain or not enough buffers, enabling copy threshold
0:00:03.691384799 4807 0xe46860 WARN v4l2bufferpool gstv4l2bufferpool.c:1201:gst_v4l2_buffer_pool_dqbuf:<source:pool:src> Driver should never set v4l2_buffer.field to ANY

I don't really know too much about encoding/decoding but I will keep you posted when I figure something out.

Shumakriss commented 6 years ago

I reconstructed a gst-launch test after tracing the code. I have verified that I am using the correct elements in the following command, which produces a valid MKV that I can view in VLC:

gst-launch-1.0 -v -e v4l2src do-timestamp=true device=/dev/video0 ! videoconvert ! 'video/x-raw,width=640,height=480,framerate=15/1,format=I420' ! omxh264enc control-rate=1 target-bitrate=5120000 ! h264parse ! 'video/x-h264,stream-format=avc,alignment=au,width=640,height=480,framerate=15/1,profile=baseline' ! matroskamux ! filesink location=test.mkv

To me that validates that the pipeline is valid and functional and that the code should be generating that pipeline.

Any recommendations on how to validate the discrepancy between the pipeline running in the app versus on the CLI are appreciated!

Shumakriss commented 6 years ago

Out of curiosity, I stripped out the Kinesis libraries from the demo app and replaced the appsink with a filesink and added the matroskamux element. I wanted to see if there was a difference between gst-launch and the code. It wrote a .mkv file which I could play in VLC and I was able to stream it using the demo in the parser library (I swapped clusters.mkv in the unit test from that project). Afterward, I was able to see the video in the web console.

I don't know enough to understand why I seem to be able to generate a valid MKV file but cannot put the data into the Kinesis stream properly. The network is good, I even switched to a wired connection on the Pi just to be sure. I am no longer receiving "0x32000002" errors but I still receive one "0x32000011" error early on and then later lots of dropped frames. I do see that a key frame is logged with differing DTS and PTS values. Am I correct to assume that DTS and PTS should stay relatively equal for things to run perfectly without any kind of code to correct the stream pressure?

MushMal commented 6 years ago

@Shumakriss MKV is a very flexible format. There are a couple of reasons we use MKV as the underlying packaging format - most notably that it's more or less a standard format and it can provide for streaming by declaring sizes as "unknown". That said, the MKV standard can be used or interpreted in many ways, and VLC accepts many of them. For streaming (and later indexing) we put certain restrictions on it - one of the biggest is that no frame may reference, or be referenced from, anything outside of its fragment (a cluster in MKV terms), and that the key frame must have PTS == DTS.

If you have stability issues in your pipeline, you can try to experiment with setting pts = dts for the key-frame being passed in.
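The experiment could look roughly like the sketch below. It is an illustration under assumptions, not the sample app's code: `FrameTimes` is a hypothetical stand-in for the timing fields of the SDK's frame structure, and `isKeyFrame` stands in for whatever flag check the real code uses to identify a key frame.

```cpp
#include <cstdint>

// Hypothetical stand-in for the SDK frame's timing fields (assumed 100 ns units).
struct FrameTimes {
    std::uint64_t decodingTs;
    std::uint64_t presentationTs;
    bool isKeyFrame;  // assumption: derived from the frame flags in real code
};

// For key frames only, overwrite PTS with DTS so the two match.
// Returns true when the frame was changed, which is worth logging.
inline bool align_key_frame(FrameTimes &f) {
    if (f.isKeyFrame && f.presentationTs != f.decodingTs) {
        f.presentationTs = f.decodingTs;
        return true;
    }
    return false;
}
```

Non-key frames are left alone here, since only the key frame's timestamps are described as needing to match. Logging each correction would show how often the encoder actually produces a mismatch.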

Please resolve this issue if we don't see this anymore. Open separate issues so the discussions can be more focused around the topic.

ghost commented 6 years ago

[@Shumakriss] Hi Chris,

Sorry this issue has dragged on so long. I wanted to try to clarify the situation.

You are getting 0x32000011. This means that the SDK is seeing a DTS and PTS which are different. They must be identical or the SDK will hard reject them as invalid. There are many flavors of valid MKV and playable video which are not supported by our SDK. If you want the KVS SDK to work, you are going to have to change your encoder settings to guarantee DTS and PTS match. Our error codes are documented here: https://docs.aws.amazon.com/kinesisvideostreams/latest/dg/producer-sdk-errors.html
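For quick log triage, the two status codes discussed so far in this thread can be mapped to the explanations given above. This is a minimal sketch: `describe_status` is a hypothetical helper, the descriptions are paraphrased from this thread rather than copied from the SDK headers, and the linked error page remains the authoritative list.

```cpp
#include <cstdint>
#include <string>

// Maps the status codes discussed in this thread to the explanations given
// here. Anything else falls through to a pointer at the documentation.
inline std::string describe_status(std::uint32_t status) {
    switch (status) {
        case 0x32000002:
            return "STATUS_MKV_INVALID_FRAME_TIMESTAMP: frame timeline overlap, "
                   "Dts(N) + Duration(N) > Dts(N+1)";
        case 0x32000011:
            return "key frame submitted with different PTS and DTS";
        default:
            return "not discussed in this thread; see the producer SDK error page";
    }
}
```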

Alternately, if you have valid MKV which is incompatible with our SDK, you could upload it to our putMedia API directly: https://docs.aws.amazon.com/kinesisvideostreams/latest/dg/API_dataplane_PutMedia.html

If you have suggestions about how we can improve our documentation or integration guide, we take pull requests on GitHub and are also happy to hear suggestions.

thanks, Aaron

masda70 commented 6 years ago

Hi. I've started encountering this error since switching to the omxh264enc encoder on a Raspberry Pi Zero. Up to now, I was using the native x264 encoding provided by v4l2src. (For this, I modified the sample code to force the use of omxh264enc.) Here is a preview of the logs when recording at 30 fps. I added a log line showing the DTS and PTS for each frame. It appears that the dropped frame systematically occurs just before a key frame. To my current understanding of this issue, not much data is lost since the DTS interval between the frames around the dropped frame is consistent with the number of frames per second (see the table at the end).

DEBUG - frame dts: 125861171 pts: 125861171
DEBUG - streamDataAvailableHandler invoked for stream: xxxx and stream upload handle: 0
DEBUG - Note data received: duration(100ns): 100000 bytes: 2050 for stream handle: 0
DEBUG - Wrote 2050 bytes to Kinesis Video. Upload stream handle: 0
DEBUG - postBodyStreamingReadFunc (curl callback) invoked
DEBUG - frame dts: 126194504 pts: 184467440737095516
ERROR - Failed to submit frame to Kinesis Video client. status: 0x3200000b decoding timestamp: 126194504 presentation timestamp: 184467440737095516
Dropped frame!
DEBUG - Key frame!
DEBUG - frame dts: 126194321 pts: 126194321
DEBUG - streamDataAvailableHandler invoked for stream: xxxx and stream upload handle: 0
DEBUG - Note data received: duration(100ns): 100000 bytes: 42154 for stream handle: 0
DEBUG - Wrote 16372 bytes to Kinesis Video. Upload stream handle: 0
DEBUG - postBodyStreamingReadFunc (curl callback) invoked

dts	difference
125528021
125861171	333150
126194504	333333
126194321	-183
126527481	333160
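The table can be reproduced from the logged DTS values with a small helper. This is a sketch with hypothetical names (`dts_deltas`, `implied_fps`), and it assumes the logged values are in 100 ns units, as the "duration(100ns)" log line suggests.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Differences between consecutive DTS values. Signed, so a timestamp that
// goes backwards shows up as a negative delta.
inline std::vector<std::int64_t> dts_deltas(const std::vector<std::uint64_t> &dts) {
    std::vector<std::int64_t> out;
    for (std::size_t i = 1; i < dts.size(); ++i) {
        out.push_back(static_cast<std::int64_t>(dts[i]) -
                      static_cast<std::int64_t>(dts[i - 1]));
    }
    return out;
}

// Frame rate implied by one delta, assuming 100 ns units (1e7 units per second).
inline double implied_fps(std::int64_t delta_ticks) {
    return 1e7 / static_cast<double>(delta_ticks);
}
```

A delta of 333333 units is about 33.3 ms, which matches 30 fps. The -183 delta shows the key frame's DTS falling slightly before the DTS of the dropped frame.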

MushMal commented 6 years ago

@masda70 This is a very interesting observation and I haven't seen this before. Some of my thoughts:

1) PTS can be != DTS for non-key frames.
2) 0x3200000b is an error returned from the packager which indicates that the provided timestamps are not within a valid range. This is likely due to an encoder error or something of that nature.

In this case, I believe GStreamer omx encoder returns a bad timestamp which should be checked by a pair of GST_BUFFER_PTS_IS_VALID and GST_BUFFER_DTS_IS_VALID (we should add this to the samples).
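The logged PTS looks consistent with that explanation. The arithmetic below is a sketch under two assumptions that the thread does not confirm: that the invalid timestamp is GStreamer's `GST_CLOCK_TIME_NONE` (all 64 bits set), and that the sample divides nanoseconds by a `DEFAULT_TIME_UNIT_IN_NANOS` of 100. The constant and function names are hypothetical.

```cpp
#include <cstdint>

// Assumption: mirrors GST_CLOCK_TIME_NONE, i.e. 18446744073709551615.
constexpr std::uint64_t kInvalidClockTime = UINT64_MAX;
// Assumption: mirrors DEFAULT_TIME_UNIT_IN_NANOS.
constexpr std::uint64_t kNanosPerTick = 100;
// PTS value copied from the ERROR line in the log above.
constexpr std::uint64_t kLoggedPts = 184467440737095516ULL;

// True when the logged PTS is exactly the invalid sentinel after the
// nanosecond-to-tick integer division.
constexpr bool logged_pts_matches_sentinel() {
    return kInvalidClockTime / kNanosPerTick == kLoggedPts;
}

static_assert(logged_pts_matches_sentinel(),
              "logged PTS equals the invalid sentinel divided by 100");
```

If those assumptions hold, the dropped frame carried no valid PTS at all, and a validity check before the conversion would catch it.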

While I am not entirely sure, I believe that some of the encoders can indicate an end of sequence for a sub-frame encoding - don't quote me on it.

I will update the thread if I get more info

MushMal commented 6 years ago

Folks, can I ask you to close this issue and open a separate one if the root cause or symptoms are different.

Shumakriss commented 6 years ago

I have gotten it to work for me though I am not 100% certain as to why. I do have DTS==PTS but I am not sure that is the only thing working in my favor. I can continue with my project at this point but will eventually return to this. Since the issue must be solved outside the SDK, I am closing the issue.

Thank you for all your help @MushMal and @unicornss!

awslabs / amazon-kinesis-video-streams-producer-sdk-cpp

Consistent error "Failed to submit frame to Kinesis Video client. status: 0x32000002" on Raspberry Pi #33