Open: ryantheseer opened this issue 7 years ago

I have created an RTSP/RTCP/RTP client written in Java, communicating with a remote video server serving H.264-encoded MPEG video through RTP. I do not have any audio tracks to stream. I see in the FFmpeg source code that in libavformat/rtpdec.c, rtp_parse_packet_internal(RTPDemuxContext *s, AVPacket *pkt, const uint8_t *buf, int len) reads out the RTP timestamp from the RTP packet. This timestamp, when combined with the NTP timestamp sent in the RTCP Sender Reports, can be used to determine the exact time the image frame was sampled and sent by the server. Then, in finalize_packet(RTPDemuxContext *s, AVPacket *pkt, uint32_t timestamp), this timestamp is added to the RTPDemuxContext object and used to calculate some other timestamp values.

The question is: how do I get access to this RTPDemuxContext->timestamp from the Java side of the JNI? I DON'T want the time provided by javacpp.avutil.av_frame_get_best_effort_timestamp(avFrame), because that is simply a calculation of the time elapsed since the streaming started, based on the frame rate. How can I get the server's RTP time from within Java?
This is not part of the API of FFmpeg. We would first need to update that...
Is this something that you think others would appreciate? As in, should we request this API change in the FFmpeg project? Or should I attempt to create my own patch to the FFmpeg library and then build it and JavaCV around it?
You would have to modify FFmpeg in any case, and when that's done, you might as well try to submit the patch upstream to the developers of FFmpeg. In the meantime, we could always have a branch of the presets with your patch, sure.
To include the modified FFmpeg into JavaCV, should I just replace the downloaded tar.gz of the FFmpeg source code with my own tar.gz of the patched FFmpeg? Then I have to build javacpp and javacpp-presets before building JavaCV, correct?
Just add a patch in the directory here:
https://github.com/bytedeco/javacpp-presets/tree/master/ffmpeg
And apply it at build time from inside the cppbuild.sh script.
How do you calculate the image creation time based on the RTCP NTP timestamp and the RTP timestamp? I currently have the same problem.
The first packet of RTCP should contain both an NTP timestamp (64-bit) and the RTP timestamp (middle 32 bits of the 64-bit NTP timestamp) representing the same moment in time. From then on, every RTP packet contains just the RTP timestamp, and you use the synchronization from the first RTCP packet to calculate the full NTP timestamp from that. Make sense?
Thanks for the explanation. Does it make sense? Yes and no; maybe I did not understand the RFC correctly. The RFC mentions that the RTP timestamp of the RTCP packet is not the same as the one in the RTP data packet. So I see a gap: how do I get a real reference from the RTCP packet to the RTP data packets, or am I missing something?
Every RTCP packet should come with the NTP timestamp and the RTP timestamp. The RTCP packets come at the beginning of the video stream (Source Description packet) and regularly during the stream (Sender Reports). The RTP timestamp in the RTCP packet is the same as in the RTP packets.
When I compare the RTP timestamp of the RTCP packet and the RTP timestamp of the RTP data packet in Wireshark, I see a difference between the two. So how do I synchronize the RTP data and RTCP packets?
The packets arrived at different times, so their timestamps should be slightly different. But you use the difference between the NTP and RTP timestamps in the RTCP packets to determine the offset between RTP timestamps and the NTP time. In Wireshark, you can see the "Timestamp, MSW" and "Timestamp, LSW" - these are the NTP timestamp at the time the RTCP packet was sent. You can also see the "RTP timestamp". Now, to determine the RTP offset, just shift the RTP timestamp left 16 bytes and then subtract from the NTP timestamp. Every RTP packet that comes later will have the RTP timestamp with the same offset from the NTP timestamp. Here is the next RTP timestamp that I got in Wireshark after the RTCP packet shown above:
If you're using Java, I suggest using the Apache Commons TimeStamp object: https://commons.apache.org/proper/commons-net/apidocs/src-html/org/apache/commons/net/ntp/TimeStamp.html
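To make the arithmetic concrete, here is a minimal sketch of that mapping in Java (my own illustration, not code from this thread). It assumes a 90 kHz video clock and uses the commons-net TimeStamp class linked above to convert the 64-bit NTP value from the Sender Report into Java epoch milliseconds; the class and method names are arbitrary.

```java
import org.apache.commons.net.ntp.TimeStamp;

public class RtpNtpSync {
    private final long srNtpValue;      // raw 64-bit NTP value from the Sender Report (MSW << 32 | LSW)
    private final long srRtpTimestamp;  // 32-bit RTP timestamp from the same Sender Report
    private final double clockRate;     // media clock rate, e.g. 90000.0 for video

    public RtpNtpSync(long srNtpValue, long srRtpTimestamp, double clockRate) {
        this.srNtpValue = srNtpValue;
        this.srRtpTimestamp = srRtpTimestamp;
        this.clockRate = clockRate;
    }

    /** Map the RTP timestamp of an RTP packet to wall-clock (epoch) milliseconds. */
    public long rtpToWallClockMillis(long rtpTimestamp) {
        // Signed circular difference in media clock ticks; the int cast handles
        // a single 32-bit wraparound in either direction.
        long deltaTicks = (int) (rtpTimestamp - srRtpTimestamp);
        double deltaMillis = deltaTicks / clockRate * 1000.0;
        // commons-net converts the 64-bit NTP value to Java epoch milliseconds.
        long srWallClockMillis = new TimeStamp(srNtpValue).getTime();
        return srWallClockMillis + Math.round(deltaMillis);
    }
}
```

The Sender Report fields and this RTP-to-NTP mapping are described in RFC 3550 (sections 5.1 and 6.4.1).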
@ryantheseer:
How can I get the server's RTP time? Is this something that you think others would appreciate? As in, should we request this API change in the FFmpeg project?
This is something we're very interested in too.
Your support issue https://github.com/caprica/vlcj/issues/536 nearly exactly describes what we're looking to do, but we're doing it in C++. We too have looked into OpenCV, FFmpeg, and libVLC, and there wasn't a clear solution. We also just started looking into live555.
Our preference would be to stick with OpenCV and add the ability to get the server RTP time for the current frame position using something like cv::VideoCapture.get(cv::CAP_PROP_POS_SERVER_RTP_MSEC). Since OpenCV wraps FFmpeg, it would require modifying FFmpeg too, as you and @saudet already pointed out.
Did you end up modifying FFmpeg to do what you want?
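For illustration, here is roughly how that could look through the OpenCV Java bindings if such a property existed. CAP_PROP_POS_SERVER_RTP_MSEC below is the hypothetical constant proposed above (it does not exist in OpenCV today), and the RTSP URL is a placeholder; only CAP_PROP_POS_MSEC, the stream-relative position, is a real property.

```java
import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.videoio.VideoCapture;
import org.opencv.videoio.Videoio;

public class RtspCaptureSketch {
    // Hypothetical property id, shown only to illustrate the API addition
    // discussed in this thread; OpenCV does not define it.
    static final int CAP_PROP_POS_SERVER_RTP_MSEC = 9999;

    public static void main(String[] args) {
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
        VideoCapture cap = new VideoCapture("rtsp://camera.example/stream"); // placeholder URL
        Mat frame = new Mat();
        while (cap.isOpened() && cap.read(frame)) {
            // Existing property: milliseconds since the start of the stream.
            double streamMs = cap.get(Videoio.CAP_PROP_POS_MSEC);
            // Desired property: server-side RTP/NTP time of the current frame.
            double serverMs = cap.get(CAP_PROP_POS_SERVER_RTP_MSEC);
            System.out.printf("stream=%.1f ms, server=%.1f ms%n", streamMs, serverMs);
        }
        cap.release();
    }
}
```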
One problem still remains: since the first RTCP Sender Report is sent after about 5 seconds, there is no reference NTP time for the first images. How can one overcome this?
@tandresen77 : When you set up the stream using RTCP handshakes, you should request an RTCP Description packet. That will provide the NTP timestamp before you issue the start command to the stream.
@jrobble Yes, at one point I modified FFmpeg, OpenCV, and the OpenCV Java wrappers in order to extract the NTP timestamp through all the layers. We found that OpenCV was very slow with TCP (buffering up to 30 seconds after a minute of streaming RTSP), and UDP streaming was very low quality for some reason, so I ended up modifying libVLC and live555 instead (and Java wrappers for libVLC) in order to accomplish the same thing. Either way, it's a bit of effort. When we went the LibVLC direction, the fix to FFmpeg wasn't necessary, because it ended up using Live555 for RTSP streaming of H264 encoded MP4s. You might be in a different situation, though!
Do you mean the SDES packet? That does not contain a timestamp as far as I can see, and the RTCP SR packets are typically sent after the first RTP packet has been sent.
@tandresen77 Sorry, I was wrong about the SDES packet. If you don't want to display or record video until you receive an NTP-RTP sync packet (in a Sender Report), you could just buffer the video and throw away frames until the first Sender Report is received.
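A rough sketch of that buffering approach in Java (again my own illustration, not code from the thread): frames are queued with their RTP timestamps and only released once the first Sender Report provides the NTP reference. It reuses the RtpNtpSync class from the earlier sketch, and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class FrameBufferUntilSync {
    // A decoded frame together with the RTP timestamp it arrived with.
    record PendingFrame(long rtpTimestamp, byte[] frameData) {}

    private final Deque<PendingFrame> pending = new ArrayDeque<>();
    private RtpNtpSync sync;   // null until the first Sender Report arrives

    /** Called for every decoded frame; buffers it if we cannot timestamp it yet. */
    public void onFrame(long rtpTimestamp, byte[] frameData) {
        if (sync == null) {
            pending.addLast(new PendingFrame(rtpTimestamp, frameData));
            return;
        }
        deliver(frameData, sync.rtpToWallClockMillis(rtpTimestamp));
    }

    /** Called when the first RTCP Sender Report is parsed; flushes the buffered frames. */
    public void onFirstSenderReport(long srNtpValue, long srRtpTimestamp, double clockRate) {
        sync = new RtpNtpSync(srNtpValue, srRtpTimestamp, clockRate);
        while (!pending.isEmpty()) {
            PendingFrame f = pending.removeFirst();
            deliver(f.frameData(), sync.rtpToWallClockMillis(f.rtpTimestamp()));
        }
    }

    private void deliver(byte[] frameData, long wallClockMillis) {
        // Hand the frame and its wall-clock capture time to the application
        // (or drop it here instead, if late frames are not wanted).
    }
}
```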
@ryantheseer:
Yes, at one point I modified FFmpeg, OpenCV, and the OpenCV Java wrappers in order to extract the NTP timestamp through all the layers
If you would direct us to those FFmpeg and OpenCV modifications we would greatly appreciate it!
so I ended up modifying libVLC and live555 instead
If you would direct us to those modifications too it would save us a lot of work with our OpenMPF project. We too are trying to determine which solution is right for us.
@jrobble Sorry it's taking so long to respond. I have not submitted any of this to be pulled into the git repositories, so I have to look through the changes I made and compile them somehow. Here's a start for the FFmpeg/OpenCV modifications (use at your own risk: this is a hack and I make no guarantees that it doesn't break some other aspect of the open source software): OpenCV FFmpeg Changes.docx. And here's a document outlining my changes to VLC and VLCJ (and some dependencies) to use VLC with Java: VLC Changes.docx
@ryantheseer, very much appreciated! I can see that you put some time into preparing these docs. I'll look them over in the next few weeks to a month and get back to you if we have any questions. If we end up going with one of these solutions, we may want to consider a way to make them publicly available to the rest of the world via a fork, branch, or pull request to the respective code bases.
@jrobble That would be awesome if you could make an official patch! I would love to hear about it, if you go that route.
Now, to determine the RTP offset, just shift the RTP timestamp left 16 bytes and then subtract from the NTP timestamp. Every RTP packet that comes later will have the RTP timestamp with the same offset from the NTP timestamp.
@ryantheseer : why shift the RTP timestamp by 16 bytes?
In case I have a video packet and the clock is at 90 kHz, with the same RTP and NTP timestamps mentioned in the example above: aren't the units of the RTP timestamp different from those of the NTP timestamp? In such a case, would just shifting work? How would the conversion of RTP to NTP be done here?
@venkatesh-kuppan : You should check out the RTP Standard, section 5.1.
Is this method still viable for FFmpeg 4 and the Python bindings for OpenCV? I am in a bit of a predicament trying to obtain the timestamps through OpenCV's VideoCapture, like @jrobble.
You could try it with that toolchain, for sure. The concepts should be applicable. The Python versus Java bindings would be different, of course, but it's probably a similar idea. Another totally different approach that has worked well for me is to use gstreamer instead of FFmpeg and OpenCV. There's a bit of a learning curve, but gstreamer is much more modular and allows you access to any section of the data pipeline from your application.
Yes, I am finding it a bit difficult to work out how the Python bindings interact differently. I found the correct version of OpenCV serendipitously. I went the opposite way though, from gstreamer to ffmpeg. Actually, the funny thing is I was trying to get some answers related to the DTS, PTS, and the buffer; I couldn't quite understand how they interact with each other when I tried designing my own plugin, and failed with gstreamer. Mind if I ask you a gstreamer question (unrelated to this issue)? I just wanted to know how you would actually calculate the offsets, e.g. when the camera actually started to record vs. when the frames are being captured.
I think we normally only worry about the PTS, because that's when you want to "show" the video with metadata streams at the right time. The DTS would only be important if you wanted to inject something into the decoding module. Your question on timestamps in gstreamer depends on how the server-side gstreamer code is written. Usually I think the camera would write the timestamp right before it writes it to the RTP packets, but it may be written before that. I might not quite understand your question, though.
Oh, this makes perfect sense. One factor would be that my camera isn't hooked up to an NTP server or any other time server (it keeps failing to connect), so the time isn't that accurate. As for my question: apparently when the camera turns on, there is a slight delay before it actually captures frames. I was just wondering if you could give any advice or point me to tutorials I can read to learn about coding such a thing in gstreamer. (I am basically a noob.)
Edit: Managed to resolve the problem! I had to fix the calculation a bit here and there, but it worked! Thanks a lot @ryantheseer.
Would love to get this timestamp as well. What's currently the most usable implementation to get this RTC timestamp? ffmpeg in C++? Or has this not yet been integrated officially into any library?
@gerardsimons As far as I know, the best option is gstreamer, because you can enter the pipeline at any point and write your own logic. Ffmpeg only uses the RTC timestamp to align the packets, and then replaces the NTP/UTC time stamp with the "time since the beginning of the stream". I was able to hack ffmpeg to do it, but I had to learn how to compile and build ffmpeg from source myself. It does not support this as-is.
Edit: Managed to resolve the problem! I had to fix the calculation a bit here and there, but it worked! Thanks a lot @ryantheseer.
No problem! Sorry I wasn't able to provide more on gstreamer development. It's a bit of a learning curve, for sure! Very powerful, though.
@gerardsimons As far as I know, the best option is gstreamer, because you can enter the pipeline at any point and write your own logic.
Thanks @ryantheseer ! Greatly appreciate it. An OpenCV solution would be the nicest for our system right now. If you would make the changes in ffmpeg (I guess that's what you outlined in your docx?) would it be trivial to then have OpenCV access these new ffmpeg attributes or is that not easy at all? Was there no interest from the ffmpeg community to integrate your changes somehow?
If you would make the changes in ffmpeg (I guess that's what you outlined in your docx?) would it be trivial to then have OpenCV access these new ffmpeg attributes or is that not easy at all? Was there no interest from the ffmpeg community to integrate your changes somehow?
Yes, I successfully accessed the RTP timestamps using the OpenCV VideoIO component with ffmpeg after I made those changes. I did not try to submit the changes to ffmpeg, because 1) it's a hack and 2) I chose not to go that route once I found that OpenCV VideoIO was very slow and laggy over TCP/IP to the point of not being usable. UDP might have worked, but gstreamer and VLC were still better.
Sorry I wasn't able to provide more on gstreamer development. It's a bit of a learning curve, for sure! Very powerful, though.
No problem @ryantheseer. I was actually in a pickle when I was testing the feasibility of your hack; it turned out to be a no-go, actually. I used gstreamer and got the timestamps from the buffer instead, but I don't know if the buffer gives arbitrary timestamps or timestamps from some clock. Might have to explore that more.
I chose not to go that route once I found that OpenCV VideoIO was very slow and laggy over TCP/IP to the point of not being usable. UDP might have worked, but gstreamer and VLC were still better.
It really lagged when I was using an OCR to read timestamps from a monitor to actually see if I could get the actual time. Do you mean using libVLC with gstreamer in the other document you posted? I thought it was with java opencv and libVLC? I am really interested in trying to sync some cameras using these RTP timestamps. Sounds like an interesting task to do.
Do you mean using libVLC with gstreamer in the other document you posted? I thought it was with java opencv and libVLC? I am really interested in trying to sync some cameras using these RTP timestamps. Sounds like an interesting task to do.
No, sorry, I meant that I tried both gstreamer and VLC separately, not combined together. In one effort, I was able to hack libVLC (and some open source Java wrappers for it) to get the RTP time stamps and successfully synchronize using them. When I realized that gstreamer allowed access to the RTP time stamps without hacking, I stopped using libVLC and switched to gstreamer. Part of the reason I tried hacking VLC after OpenCV/ffmpeg was that I found that the gstreamer java bindings available online were not up-to-date at the time. Since then, the java bindings were updated and I was able to use the latest gstreamer with them. Let me know if that makes more sense.
Oh, it does make sense. The problem I have with gstreamer mainly is: how do you know what clock gstreamer uses/initializes when you're accessing the buffer? I could not really correlate it to the exact absolute time at which the frame was captured in gstreamer, which really confused me.
the problem I have with Gstreamer mainly is how do you know what clock gstreamer uses/initializes when you're accessing the buffer?
In my case, the GStreamer client was easier to develop because we (as a product team) are in charge of both the camera/server-side implementation and the app/client-side implementation. In the camera GStreamer code, we are replacing the GST pipeline time with the system time, and in fact we're using a separate KLV payload stream to give the system clock time instead of the RTP time. If you don't have access to the server side, you may have to find another way to match up the RTP time to NTP time.
@ryantheseer : that sounds interesting, any chance you could share something about your changes to the GStreamer code / the pipeline?
I can't provide all the details, but here is some of the GStreamer server-side pipeline. The time stamp is being written into a KLV payload to be sent as a separate RTP stream. In our case, we're using something called the V4L2 source in our Linux environment, which just provides the image frames. And part of the client-side pipeline, which decodes the KLV payload to get the timestamp.
In the camera GStreamer code, we are replacing the GST pipeline time with the system time
Oh wow, this is very interesting. Might I ask how you would put the system time in the GST pipeline? Or can you hint at what function I could use to accomplish this?
in fact we're using a separate KLV payload stream to give the system clock time instead of the RTP time. If you don't have access to the server side, you may have to find another way to match up the RTP time to NTP time.
Do you mind explaining what a KLV payload stream is?
It's a type of RTP payload that is defined as a "Key-length-value" payload. We used the "Precision Time Stamp" key defined by "MISB ST 1603.1" https://tools.ietf.org/id/draft-ietf-avt-rtp-klv-01.html https://gwg.nga.mil/misb/st_pubs.html
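For anyone unfamiliar with the format, here is a rough sketch in Java (my own illustration) of how such a KLV triplet could be built and parsed. The 16-byte Universal Label key below is a placeholder rather than the real Precision Time Stamp key, and the value is assumed to be an 8-byte microseconds-since-epoch timestamp.

```java
import java.nio.ByteBuffer;

public class KlvTimestamp {
    // Placeholder 16-byte Universal Label key; the real Precision Time Stamp
    // key is defined in the MISB standards referenced above.
    static final byte[] KEY = new byte[16];

    /** Encode a KLV triplet: 16-byte key, 1-byte length, 8-byte value (microseconds since epoch). */
    static byte[] encode(long microsSinceEpoch) {
        ByteBuffer buf = ByteBuffer.allocate(16 + 1 + 8);
        buf.put(KEY);
        buf.put((byte) 8);              // short-form BER length: value is 8 bytes
        buf.putLong(microsSinceEpoch);  // big-endian 64-bit value
        return buf.array();
    }

    /** Decode the timestamp back out of a KLV triplet produced by encode(). */
    static long decode(byte[] klv) {
        ByteBuffer buf = ByteBuffer.wrap(klv);
        buf.position(16);               // skip the key
        int length = buf.get() & 0xFF;  // short-form length
        if (length != 8) throw new IllegalArgumentException("unexpected value length: " + length);
        return buf.getLong();
    }
}
```

If I remember correctly, recent GStreamer releases also ship rtpklvpay/rtpklvdepay elements that handle the RTP framing for KLV, so the receiver only has to parse the triplet itself.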
Might I ask how you would put the system time in the GST pipeline? Or can you hint at what function I could use to accomplish this?
You should be able to add a KLV RTP stream as shown in the example pipeline, and put whatever extra information you want into the key-length-value triplet packets.
@bradh Some more talk about KLV here, if you have any ideas, please jump in!
@ryantheseer I am facing an audio/video synchronization issue. In my scenario, audio comes first and video comes later with a delay of 2-3 seconds. I would like to understand the RTP timestamp and NTP time better, so that I can introduce the delay in the rtp_timestamp field of the RTCP Sender Report itself and the receiver side will then synchronize audio and video.
From this comment: "Now, to determine the RTP offset, just shift the RTP timestamp left 16 bytes and then subtract from the NTP timestamp. Every RTP packet that comes later will have the RTP timestamp with the same offset from the NTP timestamp."
1) 16 bytes or 16 bits? 2) How can I confirm that subsequent RTP packets have the same offset?
If anyone has any suggestions, please throw them out here for discussion.
@Baka7
@ryantheseer Thank you so much for the reply, it means a lot, Ryan. I am not using FFmpeg; WebRTC takes care of generating the RTP timestamps for audio and video.
If there is an issue, I would like to understand how we can map between the NTP and RTP timestamps so that I can change the audio/video timestamps as needed.
In my case, I am seeing that the rtp_timestamp in the outgoing RTCP Sender Reports is not always increasing. Is this expected, or can it cause issues for audio/video synchronisation?
When do we have to send the audio and video RTCP Sender Reports? In my scenario, the RTCP Sender Report for audio goes out 5 seconds after the video Sender Report. Is there anything I am missing?
@Baka7 I think we're past my area of expertise; I never even had audio streams to work with on the project I did 4 years ago... Good luck!
@ryantheseer Okay, Ryan. Got it, but I have a few questions on RTCP. I need to understand how the RTCP Sender Report is used in synchronising the audio/video, so please share any details about it if you have anything in mind. The RTCP rtp_timestamp should always be in increasing order, right?
Yes, timestamps should be monotonically increasing. The reason the Sender Report is needed is that the RTP timestamp doesn't have as many bits, so you need to adjust accordingly using the RTCP timestamps (which are built on the same clock, just with more bits). All of these details should be available in the RTP/RTCP specifications online.
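Since the 32-bit RTP timestamp wraps around after roughly 13 hours at a 90 kHz clock, one common adjustment is to extend it to a 64-bit counter on the receiver before applying the Sender Report offset. A minimal sketch in Java (my own illustration; the class name is arbitrary):

```java
public class RtpTimestampUnwrapper {
    private long lastExtended = -1;   // last unwrapped value, -1 = none seen yet

    /** Extend a 32-bit RTP timestamp to a 64-bit value that keeps counting across wraparounds. */
    public long unwrap(long rtpTimestamp32) {
        if (lastExtended < 0) {
            lastExtended = rtpTimestamp32;
            return lastExtended;
        }
        // Signed circular difference: positive if the new timestamp is ahead,
        // negative for out-of-order packets, correct across a single wraparound.
        long delta = (int) (rtpTimestamp32 - (lastExtended & 0xFFFFFFFFL));
        lastExtended += delta;
        return lastExtended;
    }
}
```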