CroatianMeteorNetwork / RMS

RPi Meteor Station
https://globalmeteornetwork.org/
GNU General Public License v3.0

Enhanced Timestamping Using GStreamer Presentation Timestamp (pts) #257

Closed Cybis320 closed 3 months ago

Cybis320 commented 5 months ago

To increase the accuracy of timestamps, BufferedCapture.py was refactored to utilize GStreamer Presentation Timestamp (pts) feature, which is unaffected by client latency.

PyGObject is now required in vRMS, which has been added to requirements.txt. Alternatively, for testing purposes, execute pip install PyGObject within vRMS.

OpenCV continues to be utilized for processing video files (though this remains untested) or if force_v4l2 is set to true in .config (also untested). If desired, a force_cv2 flag could similarly be added. Otherwise, a GStreamer pipeline utilizing software decoding through avdec_h264 is employed. The code has been tested with GStreamer 1.0 v1.22; compatibility with v1.14.4 remains untested.

Ad-hoc latency adjustments specific to IMX291 IP cameras are incorporated in BufferedCapture.py. Cameras of other types have not been evaluated. If this implementation progresses, it might be prudent to relocate the latency correction factors to .config for better manageability.
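The idea behind pts-based timestamping can be sketched as follows. This is an illustration only, not the actual BufferedCapture.py code; stream_start_time and camera_latency are assumed names for this sketch.

```python
# Illustrative sketch (not the actual RMS code): converting a GStreamer
# presentation timestamp (pts, in nanoseconds since stream start) to an
# absolute wall-clock timestamp. 'stream_start_time' and 'camera_latency'
# are assumed names for this example.

def frame_wallclock_time(stream_start_time, pts_ns, camera_latency=0.0):
    """Map a buffer's pts to an absolute UNIX time.

    stream_start_time -- UNIX time (s) when the pipeline clock started
    pts_ns            -- buffer presentation timestamp in nanoseconds
    camera_latency    -- fixed sensor-to-stream delay to subtract, in seconds
    """
    return stream_start_time + pts_ns/1e9 - camera_latency

# Example: a frame with pts 2.5 s into a stream that started at t=1000.0,
# with an assumed 0.1 s camera latency, lands at roughly t=1002.4:
t = frame_wallclock_time(1000.0, 2_500_000_000, camera_latency=0.1)
```

Because the pts is assigned when the frame enters the pipeline, this value is unaffected by how late the client loop actually retrieves the frame.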

dvida commented 5 months ago

Amazing work! Before we proceed to testing and merging, would you mind doing the following:

satmonkey commented 5 months ago

The code runs fine as usual (branch=dev). What modifications to the .config are expected? What about the device pipeline?

satmonkey commented 5 months ago

ok, I see the PR is not merged yet on origin. When merging locally, the capture on the Pi5 4gb is not consistent, and most of the time I get: image

Cybis320 commented 5 months ago

Does it work sometimes and not other times? Do you have live_maxpixel_enable: true in config?

satmonkey commented 5 months ago

Yes, I do, but without live preview it is the same: it does not even start. The issue seems to be a non-working pipeline. Could you indicate the OpenCV version and a working pipeline string for gst-launch?

Cybis320 commented 5 months ago

Thanks for trying it out. I just tried it and it does work for me with Live view enabled.

I use opencv 4.1.2 with Gstreamer 1.22.0 on Raspberry Pi 64-bit (Debian 12) on a RPi4 and 5. What versions do you use?

I changed opencv4_install.sh to build it:

    #!/bin/sh

    # The script takes the path to the virtual environment directory as the argument
    if [ "$#" -ne 1 ] || ! [ -d "$1" ]; then
        echo "Usage: $0 VIRTUALENV_PATH" >&2
        exit 1
    fi

    # Configuration
    HOME_DIR=$HOME
    VERSION=4.1.2

    # Installation
    sudo apt-get update
    sudo apt-get upgrade -y
    sudo apt-get install -y build-essential cmake pkg-config
    sudo apt-get install -y libjpeg-dev libtiff5-dev libjasper-dev libpng12-dev
    sudo apt-get install -y libavcodec-dev libavformat-dev libswscale-dev libv4l-dev
    sudo apt-get install -y libxvidcore-dev libx264-dev
    sudo apt-get install -y gstreamer1.0-tools
    sudo apt-get install -y libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev
    sudo apt-get install -y libgtk-3-dev
    sudo apt-get install -y libatlas-base-dev gfortran

    cd ${HOME_DIR}
    wget -O opencv.zip https://github.com/Itseez/opencv/archive/${VERSION}.zip
    unzip -o opencv.zip
    wget -O opencv_contrib.zip https://github.com/Itseez/opencv_contrib/archive/${VERSION}.zip
    unzip -o opencv_contrib.zip

    cd ${HOME_DIR}/opencv-${VERSION}/
    mkdir build
    cd build
    cmake -D CMAKE_BUILD_TYPE=DEBUG \
        -D CMAKE_INSTALL_PREFIX=${1} \
        -D INSTALL_C_EXAMPLES=OFF \
        -D OPENCV_EXTRA_MODULES_PATH=${HOME_DIR}/opencv_contrib-${VERSION}/modules \
        -D PYTHON_DEFAULT_EXECUTABLE=$HOME_DIR/vRMS/bin/python3.7 \
        -D PYTHON_INCLUDE_DIR=/usr/include/python3.7 \
        -D WITH_opencv_python3=ON \
        -D HAVE_opencv_python2=OFF \
        -D ENABLE_NEON=ON \
        -D ENABLE_VFPV3=ON \
        -D WITH_OPENMP=ON \
        -D BUILD_TIFF=ON \
        -D WITH_TBB=ON \
        -D BUILD_TBB=ON \
        -D BUILD_TESTS=OFF \
        -D WITH_EIGEN=OFF \
        -D WITH_V4L=ON \
        -D WITH_LIBV4L=ON \
        -D WITH_VTK=OFF \
        -D OPENCV_EXTRA_EXE_LINKER_FLAGS=-latomic \
        -D OPENCV_ENABLE_NONFREE=ON \
        -D INSTALL_PYTHON_EXAMPLES=OFF \
        -D BUILD_NEW_PYTHON_SUPPORT=ON \
        -D BUILD_opencv_python3=TRUE \
        -D WITH_GSTREAMER=ON \
        -D WITH_FFMPEG=ON ..

    # Check the cmake output, it should include Python 3
    # For more information check: http://www.pyimagesearch.com/2016/10/24/ubuntu-16-04-how-to-install-opencv/

    make -j4
    sudo make install
    sudo ldconfig

    cd ${HOME_DIR}
    rm -rf opencv-${VERSION} opencv_contrib-${VERSION} opencv.zip opencv_contrib.zip

satmonkey commented 5 months ago

In the Bookworm Pi5 pre-version, we are using v4.9.0 or v4.8.1 (v4.1.2 is outdated); in the Bullseye 64-bit, we use v4.5, I believe. I am afraid I am not able to make it work with the generated pipeline, also when using gst-launch:

    rtspsrc protocols=tcp tcp-timeout=5000000 retry=5 location="rtsp://192.168.150.14:554/user=admin&password=&channel=1&stream=0.sdp/" ! rtph264depay ! h264parse ! avdec_h264 ! videoconvert ! video/x-raw,format=BGR ! appsink max-buffers=25 drop=true sync=1 name=appsink

Cybis320 commented 5 months ago

Argh, you are right, I was looking at the wrong branch in my fork. I’m using OpenCV 4.9.0 built with GStreamer 1.22.0 on Raspberry Pi OS 64-bit, with the following install script (it should be the correct one now). Also, if you just type this on the command line (with the trailing slash), do you get anything?

    gst-launch-1.0 rtspsrc location='rtsp://192.168.42.14:554/user=admin&password=&channel=1&stream=0.sdp/'

satmonkey commented 5 months ago

Finally got it working on Buster, the current RMS mainstream OS. On Bookworm, the GST version is 1.22, and the Wayland environment makes it even more complex for video sinks to work. Line 237 should read (add a space before the 4th "):

    f"location=\"{device_url}\" ! "

Then, I ran StartCapture for 6 minutes:

    python -m RMS.StartCapture -d 0.1

The log is attached. log_CZ0001_20240123_065803.590313.log

The camera is IMX307 + Hi3516C V300, and I can easily provide the same for the IMX291.

satmonkey commented 5 months ago

The IMX307 camera was in CBR mode. Below you find the log for the IMX291 + Hi3516C V300 in VBR mode:

log_CZ0001_20240123_071647.581237.log

satmonkey commented 5 months ago

Now Bookworm and the Pi5 also started running fine. It looks like GST 1.22 requires the slash after "sdp" in the location string; this is what confused me when running with gst-launch. Again, the camera is an IMX291 + Hi3516C V300 in VBR mode: log_CZ0001_20240123_092901.656577.log It seems that the delay is pretty consistent here.

Cybis320 commented 5 months ago

Excellent! Thank you for sharing the logs.

And yes, there was a change between GStreamer 1.14 and 1.22 in how it appends instructions to the URI. It turns out our IP cameras are non-compliant with the RTSP standard. I filed a bug report with GStreamer, but essentially the issue lies with the cameras. The workaround is to ensure the URL ends with a "/". I believe my code automatically adds one if it’s missing in the .config deviceID.

Link to the bug report: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/3222

Regarding line 237:

    f"location=\"{device_url}\" !"
    "rtph264depay ! h264parse ! avdec_h264")

Good catch, it should be:

    f"location=\"{device_url}\" ! "
    "rtph264depay ! h264parse ! avdec_h264")

I noticed in the first two logs you provided (presumably on a RPi Buster?) that the system is not keeping up, as evidenced by the increasing frame age. I’ve added this line to the log:

    Frame is 2.081 seconds old. Total dropped frames: 0

The code compares the current time to the timestamp of the retrieved frame. Increasingly older frames indicate the system is not keeping pace. Could you try reducing the GPU memory allocation to see if it makes a difference?

    sudo nano /boot/config.txt
    # set: gpu_mem=16
    sudo reboot

For checking whether the timestamps are in the right ballpark, there's an app called ‘AtomicClock’ on both Apple and Android. It displays a somewhat accurate time, suitable for placing in front of the camera. It’s nowhere near millisecond accurate, more like 200 ms, but it’s a start. The app is always at least 60 ms behind the phone time due to not accounting for display latency. The phone display also has a slow refresh rate. Moreover, it uses NTP time, not GPS.

Cheers, Luc

satmonkey commented 5 months ago

Hi Luc, yes, the first two logs were from Buster, followed by one from Bookworm. Now I have reduced gpu_mem to 16 MB and did another run on Buster. The delay was again increasing, up to 42 s after 6 minutes. Milan

Cybis320 commented 5 months ago

I added a force_cv2 flag in .config which, when set to true, forces the code to use cv2 as before. Also, I have hopefully fixed the logging of dropped_frame for real this time by using multiprocessing.Value. Simple integer attributes are not shared between processes in Python's multiprocessing, which meant it was always reporting zero dropped frames. Now we should be able to accurately compare the number of dropped frames between methods. One caveat is that the dropped frame detection for the cv2 method might be affected by the inaccurate cv2 timestamps.

satmonkey commented 5 months ago

I am attaching the log from Buster, having force_cv2: true. Let me know if the pi5 Bookworm log is needed. log_CZ0001_20240129_070959.377111.log

Cybis320 commented 5 months ago

> I am attaching the log from Buster, having force_cv2: true. Let me know if the pi5 Bookworm log is needed. log_CZ0001_20240129_070959.377111.log

Thank you. Setting force_cv2 to true should revert the code's behavior to its original state regarding the method of retrieving frames and timestamps. However, dropped frames are still being reported even if report_dropped_frames is set to false:

    Frame is 0.000 seconds old. Total dropped frames: 974

The code uses the same method as before to determine if frames are dropped:

    (frame_timestamp - last_frame_timestamp) >= self.time_for_drop

This method would be accurate if the timestamps were reliable. It works well with GStreamer PTS timestamps, but it's unclear whether it has ever been a valid method for checking dropped frames with cv2. It seems to indicate that the loop is slow and not always retrieving frames on time, even though all frames may eventually be retrieved. At a minimum, it suggests that the timestamps of these 'dropped' or delayed cv2 frames are even more inaccurate.

Maybe the wording of the logging should be adjusted for the cv2 case? Total late frames: 974?
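The interval test discussed above can be sketched as follows; this is a minimal illustration with assumed names, not the RMS code.

```python
# Sketch of the dropped/late frame check: a frame is flagged when the gap
# to the previous timestamp exceeds threshold/fps seconds. Names are
# assumed for illustration only.

def count_late_frames(timestamps, fps, threshold=1.5):
    """Count inter-frame gaps that meet or exceed threshold/fps seconds."""
    time_for_drop = threshold/fps
    late = 0
    for prev, curr in zip(timestamps, timestamps[1:]):
        if (curr - prev) >= time_for_drop:
            late += 1
    return late

# At 25 fps the nominal period is 40 ms; with threshold=1.5, any gap of
# 60 ms or more is flagged. Here one 70 ms gap is present:
n = count_late_frames([0.00, 0.04, 0.08, 0.15, 0.19], fps=25)   # -> 1
```

As noted above, with cv2's coarse timestamps this counts "late" as well as truly dropped frames, so the same counter reads very differently depending on the backend.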

Cybis320 commented 5 months ago

The source of the slow GStreamer performance is video/x-raw,format=GRAY8 When setting it to video/x-raw,format=BGR, the code is running great. I'll push a fix today.

Cybis320 commented 5 months ago

I've updated the handling of color channels. The GStreamer pipeline now always converts I420 to video/x-raw,format=BGR. The read method drops two channels if the initVideoDevice method determines that all three color channels contain identical data (indicating a grayscale image).

Even if the camera's DayNightColor setting is set to 2 (Night), the camera still outputs I420 YUV data (where the U and V channels contain zeros). The pipeline now converts it to BGR (since converting to gray using videoconvert is significantly slower); initVideoDevice then detects the absence of color information, prompting the read method to drop the two redundant channels.

If the camera's DayNightColor is set to 1 (Day), I preserved the option to utilize color information as long as possible in the code. This data is eventually converted to grayscale just before compression. For instance, this allows for the potential to save colored JPEGs at a certain stage if desired.

With this revised approach, GStreamer can now handle 1080p at 25 fps with DayNightColor set to either day or night without dropping frames.
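The grayscale detection described above could look something like this; a sketch with assumed names, not the actual initVideoDevice code.

```python
# Sketch (assumed names) of the grayscale check described above: if all
# three BGR channels carry identical data, keep only one channel.
import numpy as np

def is_grayscale(frame_bgr):
    """True if the B, G and R planes are identical (no colour information)."""
    b, g, r = frame_bgr[:, :, 0], frame_bgr[:, :, 1], frame_bgr[:, :, 2]
    return np.array_equal(b, g) and np.array_equal(g, r)

def maybe_drop_channels(frame_bgr, grayscale):
    """Return a single-channel view when the stream carries no colour."""
    return frame_bgr[:, :, 0] if grayscale else frame_bgr

# A frame whose three channels are identical is treated as grayscale:
gray = np.dstack([np.ones((4, 4), np.uint8)*7]*3)
```

In practice the check would be run once at init time on a sample frame, so the per-frame read path only pays for the cheap channel slice.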

Additionally, I've refined the logging logic. It now logs different timestamp-related data for the cv2 and GStreamer methods and keeps track of the poorest performance in a loop cycle:

# For cv2, show elapsed time since frame read to assess loop performance:
"Cycle max frame interval: 2.015 (normalized). Run late frames: 194"
# For GStreamer, show elapsed time since frame capture to assess sink fill level:
"Cycle max frame age: 3.015 seconds. Run dropped frames: 0"

For cv2, the frame interval is normalized to 1/fps, so a number greater than 1 indicates that the loop is too slow. For GStreamer, the log shows the time interval between when the frame was captured by the camera sensor and when it arrived in the loop. This duration should typically be less than 5 seconds. Times increasing beyond that indicate that the loop is lagging.

I've compared the reported dropped frames between the master branch and this PR with force_cv2: true, and the results are similar.
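The two per-cycle metrics described above can be sketched as follows (assumed names, not the actual logging code): for cv2, the largest inter-read gap normalized to 1/fps; for GStreamer, the largest capture-to-arrival age.

```python
# Sketch of the per-cycle metrics (assumed names, illustration only):
# cv2: max inter-frame interval in units of 1/fps (>1 means the loop is slow);
# GStreamer: max age of a frame (arrival time minus capture time).

def cycle_max_normalized_interval(read_times, fps):
    """Largest gap between successive frame reads, in units of 1/fps."""
    gaps = [b - a for a, b in zip(read_times, read_times[1:])]
    return max(gaps)*fps

def cycle_max_frame_age(arrival_times, capture_times):
    """Largest delay between sensor capture and loop arrival, in seconds."""
    return max(a - c for a, c in zip(arrival_times, capture_times))

# A loop reading at exactly 25 fps scores 1.0; a stalled read stands out
# (here the second gap is two frame periods, so the score is ~2.0):
score = cycle_max_normalized_interval([0.00, 0.04, 0.12], fps=25)
```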

satmonkey commented 5 months ago

I made an additional 6 min test. No idea why, but the number of dropped frames gets even higher. The camera is in B/W mode.

log_CZ0001_20240130_072231.501357.log

satmonkey commented 5 months ago

Setting force_cv2=false does the trick with the late frames, obviously. On the other hand, the 'cycle max frame age' lags considerably. log_CZ0001_20240130_074216.466353.log

Cybis320 commented 5 months ago

Thanks for testing. It's clear it's not keeping up in both cases. It's possible something escaped my attention, but force_cv2=true should revert to the behavior seen in the master branch, except for logging. So, it's likely it wouldn't keep up on the master branch either. Could you test the master branch on the same SD card with report_dropped_frames: true? (The initial reason I realized there was an issue was due to a degrading SD card. There were no other signs that anything was off.)

satmonkey commented 5 months ago

Yes, using the master branch also shows dropped frames (10-15 per FF file). I recall the Pi5 does the same, but I will confirm tomorrow for both Pi5 and Pi4.

dvida commented 5 months ago

Hi guys - great work on this! One short note - even though the frames may be late, they are not technically dropped in the traditional sense as with analog cameras. If we had dropped frames, most meteors and satellites would show gaps. This is not the case for any operational cameras. I did see some that were not configured well or had cabling issues. That's why the FF files operate under the assumption that the relative time between frames is always the same and follows 1/FPS, but the absolute time of individual frames is less well defined. I'd love to discuss this more, but I'm unsure that tracking these "dropped frames" is informative anymore, unless analog cameras and video digitizers are used.

Cybis320 commented 5 months ago

Thank you for giving context on the dropped frame reporting. I agree that the frames are most likely not dropped, which is the reason I rephrased the log entry in the PR to 'dropped or late'. However, I believe monitoring these numbers is still valuable.

The methodology for determining a dropped frame by checking the interval length is indeed ambiguous. Frames are timestamped late either because of a slow loop - resulting in delayed and inaccurate timestamping for cv2 - or because the loop is waiting on a frame from the camera - likely a dropped frame. In either case, it's not great news.

In the case of cv2, with the current interval threshold of 1.5/fps, a frame is reported if it's more than 20 ms late. That's probably overly sensitive for cv2, which lacks this precision level. Perhaps adjusting the threshold to 2.0/fps, corresponding to a 40ms delay at 25fps, would be more appropriate?

The log above indicates that cv2 errors exceeded 3/fps, marking frames over 80ms late at 720p. Something is off, no?

GStreamer is capable of high precision and is not affected by slow loops; timestamps are right on time (except for the first 2-3 frames). So there is no ambiguity that long intervals correspond to missing frames. A threshold of 2.0/fps would also be appropriate, I think.

Do we still need 1.5/fps for analog cameras?

FFs take the timestamp at iteration 0 and assume 1/fps for the next 255 frames. In the extremely rare case that frames do get dropped during a meteor event, it would be obvious from gaps in the streaks, and it sounds like that's never happened. However, from FF file to FF file, the timestamp at frame 0 will be affected by cv2's variable timestamping. That relative error is easy to quantify by measuring the interval between FF timestamps. That could be logged. (I'm doing that automatically on my machine, let me know if you'd like a PR with it.) That variability would be random from camera to camera.

Then there's the issue of absolute accuracy. How do you feel about the latency offset? Right now, this PR leaves the cv2 method unchanged, with timestamps on average around 150 ms behind (fast loop waiting on frames) to 300 ms behind (slow loop, 4/fps behind). That would be relative to UTC and to the GStreamer machines. I would imagine that consistency between cameras is a concern.

Here's an example of gaps in aircraft light streaks caused by dropped frames due to a bad SD card:

[Screenshot 2024-01-30 at 12.44.33 PM]

dvida commented 5 months ago

That's a great point. So the way I usually detect dropped frames and framerate issues is by looking at the difference in timestamps between neighbouring FF files. It should always be 256/FPS seconds (10.24 for 25 FPS). An FF is saved once 256 frames are collected. You've done something similar, just on a 128-frame basis, in the email you sent. This approach tends to smooth out any short-term variability in the frames being late. Perhaps that might be the best measure of actual frame drops? Those plots you sent are an amazing diagnostic tool. Perhaps we can make a script which takes a list of FF (or FS) files, extracts the timestamps, and makes a plot of the differences, also showing the expected difference? We could then include this in the normal RMS workflow. We can of course also keep the dropped frame logs, but I found that they generate a lot of questions from operators who don't understand that the frames are not necessarily dropped when they're reported as dropped.
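The proposed diagnostic could be sketched as below. The FF filename pattern (FF_STATION_YYYYMMDD_HHMMSS_mmm_frameno.fits) is an assumption for this illustration and may differ from the actual RMS naming.

```python
# Sketch of the diagnostic proposed above: extract timestamps from FF file
# names and compare neighbouring differences against the expected 256/FPS.
# The filename pattern below is assumed for illustration.

from datetime import datetime

def ff_timestamp(ff_name):
    """Parse the timestamp out of an FF file name (assumed pattern)."""
    parts = ff_name.split('_')
    # parts: ['FF', station, YYYYMMDD, HHMMSS, mmm, frameno...]
    return datetime.strptime(parts[2] + parts[3] + parts[4], '%Y%m%d%H%M%S%f')

def ff_interval_residuals(ff_names, fps=25.0, n_frames=256):
    """Differences between successive FF timestamps minus the expected
    n_frames/fps seconds; near-zero residuals mean no frame drops."""
    expected = n_frames/fps
    times = sorted(ff_timestamp(n) for n in ff_names)
    return [(b - a).total_seconds() - expected for a, b in zip(times, times[1:])]

# Two FF files exactly 10.24 s apart (256 frames at 25 FPS) give a
# residual of ~0, i.e. no evidence of dropped frames:
files = ['FF_CZ0001_20240123_065803_590_0000000.fits',
         'FF_CZ0001_20240123_065813_830_0000256.fits']
residuals = ff_interval_residuals(files)
```

A plotting wrapper around this would reproduce the diagnostic plots discussed in the thread.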

Cybis320 commented 5 months ago

Agreed, I'll put something together. I was saving a JPG every 128 frames. The script just feeds on whatever folder it's pointed at. So, if aimed at the captured folder, it should plot on a 256-frame basis, add a 'timestamp variability score' to the log, and save the plot to the tar/archive.

dvida commented 5 months ago

If I may suggest, the FS*.bz2 file contains the list of all files recorded (with full timestamps), regardless of whether the CapturedFiles directory is present or not. This way the plot can always be generated even if only the ArchivedFiles archive is available.

markmac99 commented 5 months ago

I have not read the code yet, but I noticed f-strings being used in the snippets above. Note that f-strings are not compatible with Python 2.7, so this would prevent the code from installing on a Pi3. Worth double-checking, as I know there are many Pi3-based systems still in operation.

Cybis320 commented 5 months ago

Argh, sorry about that, will fix.

Cybis320 commented 5 months ago

f-strings are gone

markmac99 commented 5 months ago

@Cybis320 thanks - one day we will be able to drop Pi3 support but not quite yet!

Cybis320 commented 5 months ago

Updated requirements.txt to conditionally install PyGObject only for Python versions 3.6 and above:

    PyGObject; python_version >= '3.6'

In BufferedCapture.py, I wrapped the gi import statement in a try-except block:

try:
    import gi
    gi.require_version('Gst', '1.0')
    from gi.repository import Gst
except ImportError as e:
    log.info('Could not import gi: {}. Using OpenCV.'.format(e))

Additionally, I enclosed the entire GStreamer initialization process in a try-except block. This ensures a fallback to OpenCV if the GStreamer initialization fails.

Cybis320 commented 4 months ago

I added a queue to the GStreamer pipeline before the videoconvert element. It almost doubles the performance. I'm curious if it will help on machines having issues keeping up. With this pipeline I can reduce the cpu clock from 1.5GHz to 800MHz and the frame buffer is not increasing (RPi4 Bookworm 1080p @ 25fps).

Cybis320 commented 4 months ago

Fixed the buffer behavior. If the machine is slow, the frame buffer will fill up with approximately 10 seconds' worth of frames, at which point it will start dropping frames piecemeal.
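The fill-then-drop behaviour described above can be sketched with a bounded queue; this is an illustration with assumed names, not the actual pipeline code (in the real pipeline the equivalent is handled by the queue/appsink buffer limits).

```python
# Sketch (assumed names) of the buffer behaviour: hold up to ~10 seconds
# of frames, then drop the oldest as new ones arrive.

from collections import deque

class FrameBuffer:
    def __init__(self, fps=25, seconds=10):
        # deque with maxlen silently discards the oldest entry when full
        self.frames = deque(maxlen=int(fps*seconds))
        self.dropped = 0

    def push(self, frame):
        if len(self.frames) == self.frames.maxlen:
            self.dropped += 1   # the append below evicts the oldest frame
        self.frames.append(frame)

# 10.4 s of frames at 25 fps into a 10 s buffer drops the 10 oldest:
buf = FrameBuffer(fps=25, seconds=10)
for i in range(260):
    buf.push(i)
```

Dropping piecemeal from the oldest end keeps the buffer's latency bounded while losing as few frames as possible when the consumer briefly stalls.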

Cybis320 commented 4 months ago

I wanted to recap my notes in one place regarding the GStreamer pipeline. After quite a bit of testing, I think the following pipeline is reliable and efficient:

   rtspsrc protocols=tcp tcp-timeout=5000000 retry=5 
   location="rtsp://192.168.42.10:554/user=admin&password=&channel=1&stream=0.sdp/" ! 
   rtph264depay ! h264parse ! avdec_h264 ! 
   queue leaky=downstream max-size-buffers=100 max-size-bytes=0 max-size-time=0 !
   videoconvert ! video/x-raw,format=BGR ! 
   queue max-size-buffers=100 max-size-bytes=0 max-size-time=0 ! 
   appsink max-buffers=100 drop=true sync=0 name=appsink

Here are a couple of notes:

I'm curious if these observations hold true on other machine types.

Cybis320 commented 3 months ago

I’ve added a new smoothing function for the GStreamer timestamps, which should provide sub-millisecond precision.

The OpenCV method uses the time a frame is retrieved from the buffer, whereas GStreamer’s presentation timestamp (pts) is correlated with the time a frame arrives in the buffer. This distinction alone is a significant improvement. However, it remains susceptible to small variations in network delay.

The GStreamer buffer offers several modes intended to alleviate this issue, but unfortunately, they don't work well for our purposes. There are also built-in tools to remove jitter, but these are not very effective either. It's possible that GStreamer's performance is impacted by the unusual fps of some of our cameras (24.98).

The new smoothing function aims to eliminate variations in frame arrival times by precisely measuring the actual fps and selecting the least-delayed frames. Achieving a stable and accurate fps value takes about 10 minutes. I’ve tested the time stability of my imx291 camera by flashing an LED exactly in sync with this calculated fps, and the camera seems stable to within 1ms over 6 hours.

Latching on to the least-delayed frames helps achieve a consistent delay from run to run on a given machine, although each setup will have a different delay.

To ensure accuracy even during the first 10 minutes, the code uses the fps from the config file as a starting assumption. It regularly checks that the calculated fps converges with the config fps. If the calculated fps stops converging, the startup sequence exits, and the system transitions to the calculated fps. If both values are sufficiently close at the end of the startup sequence, the transition is smooth. If not, the transition is abrupt, and the config fps is flagged as inaccurate. In that case, the config fps is ignored in subsequent reconnects for that night's run, in case of any disconnection.

In all cases, the first three blocks of 256 frames can skew as needed to compensate for network delays. After that, skew is limited to 25 microseconds per block to track frame arrival times. In practice, this approach is effective. The timestamps match something like 95% of the LED flashes (ok, I didn’t really count) to within 1 ms over a night’s run. It’s challenging to determine whether the camera or the LED is the limiting factor.
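The least-delayed-frame idea can be illustrated as follows. This is a simplified sketch, not the RMS implementation, and it assumes network jitter only delays frames (never advances them), so the least-delayed frames lie on the true 1/fps grid.

```python
# Illustrative sketch of the least-delayed-frame fps estimate described
# above (not the RMS implementation). Assumption: jitter is delay-only.

def estimate_fps(arrival_times):
    """Estimate the true fps from the two least-delayed frames in a run."""
    n = len(arrival_times)
    # Rough period from the full span, used only to rank frames by delay:
    rough = (arrival_times[-1] - arrival_times[0])/(n - 1)
    delays = [t - arrival_times[0] - i*rough
              for i, t in enumerate(arrival_times)]
    # Pick the least-delayed frame in each half, as far apart as possible:
    first = min(range(n//2), key=lambda i: delays[i])
    last = min(range(n//2, n), key=lambda i: delays[i])
    period = (arrival_times[last] - arrival_times[first])/(last - first)
    return 1.0/period

# A 24.98 fps camera with delay-only network jitter: the estimate latches
# onto the unjittered frames and recovers ~24.98.
times = [i/24.98 + (0.0, 0.012, 0.004, 0.020)[i % 4] for i in range(200)]
fps_est = estimate_fps(times)
```

Anchoring on two least-delayed frames far apart is what makes the estimate converge: the per-frame error divides by the number of frame intervals between the anchors.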

Two new fields have been added to the config file: camera_latency and media_backend. camera_latency allows specifying a time offset in seconds, with the zero point matched to my camera setup. media_backend can be set to gst, cv2, or v4l2. If media_backend is not set or is set to gst, the code first tries to use the GStreamer backend directly (bypassing OpenCV). If this fails, it reverts to using OpenCV. The cv2 option forces the code to use OpenCV (which may in turn use ffmpeg or GStreamer as a backend), while the v4l2 option tells the code to try OpenCV with v4l2, falling back to plain vanilla cv2 if that fails. The smoothing function only works in the GStreamer standalone mode.
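The fallback chain just described might be sketched like this; the names are assumed for illustration, and the init_* callables stand in for the real device initializers.

```python
# Sketch (assumed names) of the media_backend fallback order:
# 'gst' (or unset) tries standalone GStreamer, then falls back to OpenCV;
# 'v4l2' tries OpenCV with the V4L2 backend, then plain OpenCV;
# 'cv2' uses OpenCV directly.

def init_device(media_backend, init_gst, init_cv2_v4l2, init_cv2):
    """Return (backend_name, device) following the configured fallback chain.
    Each init_* callable returns a device object or raises on failure."""
    if media_backend in (None, 'gst'):
        try:
            return 'gst', init_gst()
        except Exception:
            pass                      # fall back to plain OpenCV below
    elif media_backend == 'v4l2':
        try:
            return 'v4l2', init_cv2_v4l2()
        except Exception:
            pass                      # fall back to plain OpenCV below
    return 'cv2', init_cv2()

# Example: GStreamer unavailable, so the chain lands on OpenCV:
def no_gst():
    raise RuntimeError('GStreamer init failed')

backend, dev = init_device('gst', no_gst, no_gst, lambda: 'cv2-device')
```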

Several fallback behaviors are reported in the log, including the aforementioned gst>cv2 and v4l2>cv2 fallbacks, as well as the config fps > calculated fps fallback. Additionally, there's a fallback for excessive dropped frames. When dropped frames are detected, the smoothing algorithm parameters are reset, causing a temporary drop in accuracy. If this occurs too often, the code disables smoothing for the remainder of the night and falls back to the raw fps, which are slightly less accurate (perhaps ~100ms).

The smoothing function and GStreamer don’t seem to take up too much resource. I’ve throttled down my RPi4 from 1.5 GHz to 600 MHz and the code is keeping up fine with 720p at 25 fps.

To fully benefit from the smoothing function, timestamps should be recorded with microsecond precision – this will be addressed in a separate PR.

For Raspberry Pi machines, I’m developing a utility to calibrate the camera latency by flashing the Pi’s power LED at the camera. This should achieve millisecond level latency relative to the Pi, although I’m not sure how to estimate the led latency yet. If the Pi is synchronized to a GPS module, it should then be possible to achieve absolute millisecond accuracy... maybe.

I would appreciate it if you could test this code while keeping an eye on the logs for fallback behaviors. Ideally, you should see these log entries:

The intervals plot should show a flat line with no variations at the current scale (except for ms rounding errors until microseconds are implemented).

Overall, this is more complex than I would have liked. I’m sure I haven’t considered all possible edge cases, so I welcome your feedback and observations. Is BufferedCapture.py becoming too bloated? Initially, I hadn’t anticipated the need for this complexity; otherwise, I might have separated all the media backend functionality into its own module. It’s still a feasible option, but I’m getting pressed for time.

Here's a microsecond interval plot sample - the residuals scale is 0.1 frame. [US9999_20240313_061207_625958_ff_intervals]

dvida commented 3 months ago

I updated a few little things and the style. One thing that I changed is BufferedCapture, so that there is a global flag set when gi is imported (or not), and that is used everywhere later on instead of relying on the media_backend option. One final change I suggest is to remove all hardcoded values such as system_latency and device_buffer. These values work for these IMX IP cameras, but people are also using MIPI cameras for which this probably doesn't apply. I suggest putting all of this in the config reader module, but just keeping it at default values. Also, add them to the config file, but commented out, so people don't mess with them if they don't have to and the default values are used if they are not set.

Also, is there anything preventing cameras from running at a different FPS? Say 10 or 50 FPS? Would that mess up the smoothing algorithm or any other assumptions?

Cybis320 commented 3 months ago

Thanks Denis, good suggestions. One day I'll learn to style things right (is there a style guide to follow?). The smoothing algorithm should work fine at any fps.

The startup sequence is a little janky. My goal was to find a lightweight method to detect whether the calculated fps converges reasonably well with the config fps. It goes through seven gates at which it checks for reasonableness. However, it's easily thrown off by noise. In the worst case, it falls back to the calculated fps early, which causes more variation in the first 10 minutes. It can all be tweaked as we gain experience.

I'll move the hardcoded variable out.

Cybis320 commented 3 months ago

I moved the hardcoded latency settings to the config file and adjusted the initVideoDevice function to return True and exit on success. Otherwise, it was always running the new catch-all 'else' clause. (initVideoDevice has an if > if > elif structure, not if > elif > elif, to allow fallback to cv2.)

dvida commented 3 months ago

Thanks! I'll take a look and merge ASAP. Here is the style guide I take inspiration from: https://developer.lsst.io/python/style.html

dvida commented 3 months ago

I was able to get the old Pi3s running well with the old CV2 pipeline, so they are not affected. But when I switch to a Pi with Python 3, one which should support the GStreamer pipeline, I'm seeing this:

    2024/03/20 19:26:14-INFO-QueuedPool-line:204 - Using 1 cores
    Set low priority for the LiveViewer thread!
    2024/03/20 19:26:14-INFO-StartCapture-line:97 - Press Ctrl+C to stop capturing...
    Trying to ping the IP camera...
    PING 10.10.0.121 (10.10.0.121) 56(84) bytes of data.
    64 bytes from 10.10.0.121: icmp_seq=1 ttl=64 time=0.640 ms

    --- 10.10.0.121 ping statistics ---
    1 packets transmitted, 1 received, 0% packet loss, time 0ms
    rtt min/avg/max/mdev = 0.640/0.640/0.640/0.000 ms
    2024/03/20 19:26:15-INFO-BufferedCapture-line:560 - Camera IP ping successful!
    2024/03/20 19:26:15-INFO-BufferedCapture-line:575 - Initializing the video device...
    2024/03/20 19:26:15-INFO-BufferedCapture-line:576 - Device: rtsp://10.10.0.121:554/profile1
    2024/03/20 19:26:15-INFO-BufferedCapture-line:585 - Initialize GStreamer Standalone Device.
    0:00:00.112400310  1447  0x1d2ed60 ERROR                rtspsrc gstrtspsrc.c:8667:gst_rtspsrc_uri_set_uri:<rtspsrc0> Not a valid RTSP url 'None' (-2)
    0:00:00.343160308  1447  0x1dae980 WARN                 rtspsrc gstrtspsrc.c:7456:gst_rtspsrc_retrieve_sdp:<rtspsrc0> error: No valid RTSP URL was provided
    0:00:00.343373602  1447  0x1dae980 WARN                 rtspsrc gstrtspsrc.c:7544:gst_rtspsrc_open:<rtspsrc0> can't get sdp
    0:00:00.343414916  1447  0x1dae980 WARN                 rtspsrc gstrtspsrc.c:5624:gst_rtspsrc_loop:<rtspsrc0> we are not connected
    0:00:00.343490804  1447  0x1dae980 WARN                 rtspsrc gstrtspsrc.c:8082:gst_rtspsrc_play:<rtspsrc0> failed to open stream
    0:00:00.343526508  1447  0x1dae980 WARN                 rtspsrc gstrtspsrc.c:5624:gst_rtspsrc_loop:<rtspsrc0> we are not connected
    2024/03/20 19:26:15-INFO-BufferedCapture-line:516 - Start time is 2024-03-20 19:26:15.305610

And then it just hangs. Have you seen something like this? It looks like it doesn't like the Gst.init(None) call.

Cybis320 commented 3 months ago

Can you also post the config file camera settings? In any case, it should have reverted to cv2.

Cybis320 commented 3 months ago

I haven't seen that. But we made some changes to the device init function yesterday. Maybe something got sideways there.

dvida commented 3 months ago

So this one should be all set to run gstreamer if I understand correctly and it should not fall back to opencv. It has PyGObject installed.

The issue seems to be in extractRtspUrl, I'm diagnosing now...

dvida commented 3 months ago

Fixed, capture working! I'll report back tomorrow.