Plot, score, and log timestamp intervals for analysis

Cybis320 commented 5 months ago

This PR introduces PlotTimeIntervals.py, a utility that generates a plot displaying the intervals between FF file timestamps. It also assigns a score ranging from 0 to 1000, with higher scores indicating less variability in the timestamps.

During reprocessing, this utility is invoked on the night_dir, logging a score and incorporating it into the plot's filename. If insufficient intervals are present to calculate a meaningful score, none is returned. The plot and score aim to function as diagnostic tools for identifying timestamping and dropped frame issues.

The utility uses pandas to find outliers, and pandas has been added to requirements.txt. If this is a pain point, the utility can be redesigned without this feature.

The scoring is somewhat arbitrary and designed to offer what feels like the right sensitivity. What are your thoughts?

Edit: Denis, thank you for suggesting to use tar.bz2 - I've incorporated that suggestion.

Cybis320 commented 5 months ago

Sample: US9999_20240107_144910_590515_intervals_score_978

Cybis320 commented 5 months ago

And it can be used as a cmd line utility on existing archived folders: python -m Utils.PlotTimeIntervals "input_path" "fps"

Cybis320 commented 5 months ago

Updated the PlotTimeIntervals script to recursively search through all subdirectories of a given path for FS*.tar.bz2 files. For each matched file, the script now generates a plot.

You can invoke the updated script like this:

python -m Utils.PlotTimeIntervals /home/pi/RMS_data/ArchivedFiles 25

This command will process all suitable files in /home/pi/RMS_data/ArchivedFiles and its subdirectories, assuming a frame rate of 25 fps for the analysis.
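For readers following along, the recursive search described above could look roughly like the sketch below. It is not the code from the PR: the function name `findFieldsumArchives` is hypothetical, and only the `FS*.tar.bz2` pattern comes from the description.

```python
import fnmatch
import os


def findFieldsumArchives(root_dir, pattern="FS*.tar.bz2"):
    """Return the sorted paths of all files under root_dir whose name matches pattern."""
    matches = []
    # os.walk visits root_dir and every subdirectory below it
    for dir_path, _, file_names in os.walk(root_dir):
        for file_name in fnmatch.filter(file_names, pattern):
            matches.append(os.path.join(dir_path, file_name))
    return sorted(matches)
```

Each returned path can then be handed to the plotting routine, one plot per archive.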

dvida commented 5 months ago

Thanks! Could you also strip the f strings and use argparse (see the bottom of e.g. Utils.StackFFs) in the main section?

Cybis320 commented 5 months ago

Will do. Re pandas / outliers, yay or nay?

dvida commented 5 months ago

Oh, I hadn't noticed pandas there. I'm very reluctant to add new libraries; remote installation can break setups in many different ways on the many machines people operate it on. I'd rather drop it if that's OK.

Cybis320 commented 5 months ago

Easy for this one. The GStreamer based bufferedCapture requires a new dependency though...

dvida commented 5 months ago

That's right, that's why we need to make the old behaviour the default one (perhaps try to fix it with the cv2 version as best as we can), and the gstreamer will only be rolled out in the new versions of the SD card image. We also need to be careful that all the new imports are conditional and don't break under existing installations.

Cybis320 commented 5 months ago

Updated the PlotTimeIntervals script with the following changes:

Removed usage of f-strings and Pandas.
Integrated argparse.
The script now attempts to read the .config file in the same directory as the .tar.bz2 file to determine the fps value. If .config is not found or doesn't contain fps, the script falls back to using the fps value provided as a command-line argument or 25 if no argument is provided.

You can run the updated script with the following command:

python -m Utils.PlotTimeIntervals /home/pi/RMS_data/ArchivedFiles --fps 25

This command processes all subdirectories in /home/pi/RMS_data/ArchivedFiles, using the fps value from each subdirectory's .config file when available or defaulting to 25 fps if not.
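The fallback order described above (.config value first, then the --fps argument, then 25) can be sketched as follows. This is an illustration only: `buildParser` and `resolveFps` are hypothetical names, the reading of the .config file is left out, and `config_fps` stands for whatever value was found there (or None). In keeping with the review request, it avoids f-strings.

```python
import argparse

DEFAULT_FPS = 25.0


def buildParser():
    """Command-line interface: a directory plus an optional fallback frame rate."""
    parser = argparse.ArgumentParser(description="Plot intervals between FF file timestamps.")
    parser.add_argument("dir_path", help="Directory searched recursively for FS*.tar.bz2 files.")
    parser.add_argument("--fps", type=float, default=None,
                        help="Frame rate used when the .config file does not provide one.")
    return parser


def resolveFps(config_fps, cli_fps, default_fps=DEFAULT_FPS):
    """Pick the fps: .config value first, then the command-line value, then the default."""
    if config_fps is not None:
        return float(config_fps)
    if cli_fps is not None:
        return float(cli_fps)
    return default_fps
```

For example, `resolveFps(None, buildParser().parse_args(["some_dir"]).fps)` falls through to the default of 25.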

During reprocessing, this utility is invoked automatically on the night_dir, logging a score and incorporating it into the plot's filename.

dvida commented 4 months ago

Amazing work! And maybe just one final nitpick - could we make the score a physically meaningful number? E.g. average late time in ms. Or the percentage of FF files which are more than 1 frame early or more than 1 frame late relative to the nominal time. It would also be good to mark those lines on the plot. The idea behind this is that it's immediately obvious to operators what the number means so they don't have to consult additional documentation or act on a wrong assumption. Adding a short explanation directly in the plot is encouraged.

Cybis320 commented 4 months ago

Thanks, Denis, for all the great suggestions, please keep them coming! I experimented with the percentage of files within ±1/fps of the target, where 100 is perfect. See attached example, let me know what you think.
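A minimal sketch of that percentage, for illustration only. It assumes each FF file spans 256 frames, so the expected interval is 256/fps; the function name `jitterQuality` and the `frames_per_file` parameter are hypothetical and not taken from the PR.

```python
def jitterQuality(intervals, fps, frames_per_file=256):
    """Percentage of intervals within +/- one frame time of the expected interval.

    Returns None when there are no intervals to score.
    """
    if len(intervals) == 0:
        return None

    expected = frames_per_file / float(fps)  # assumed nominal spacing of FF files, in seconds
    tolerance = 1.0 / fps                    # one frame time, in seconds

    within = sum(1 for dt in intervals if abs(dt - expected) <= tolerance)
    return 100.0 * within / len(intervals)
```

With fps = 25 the expected interval is 10.24 s and the tolerance is 0.04 s, so `jitterQuality([10.24, 10.25, 10.30, 10.21], 25)` counts three of the four intervals and gives 75.0.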

The expected and average lines are already in the legend. Did you mean to label the lines themselves? US9999_20231221_011659_875432_detected_intervals_score_59

dvida commented 4 months ago

That looks great, thanks! I meant that the +/-1 frame lines should be plotted around the expected value, but I guess they would be too close together to see in the plot. One thing that I noticed is that it looks like each outlier has a pair on the other side. Is this real or just a coincidence? If it's real it would mean that RMS compensates with the next frame to bring the timing back into some order. What if you applied a 3-frame moving average over the numbers? That might give us a better idea of systematic offsets vs short-term jitter? I'd leave individual points and overplot the moving average.

Cybis320 commented 4 months ago

Here's a bad SD card example: US9999_20231205_005351_882274_intervals_score_36

Cybis320 commented 4 months ago

Here's a GStreamer example: US9999_20240201_064946_820370_detected_intervals_score_100

Cybis320 commented 4 months ago

Upper and lower interval lines added. Are the labels ok? Regarding the rebound in interval, yes, I think there is a return-to-average effect. But it does mean that those timestamps are less accurate. In the bad-SD-card example, there is no return to average. I'll play with the moving average next.

US9999_20231221_011659_875432_detected_intervals_score_59

Cybis320 commented 4 months ago

Maybe it needs two scores to differentiate jitter from dropped frames...

Cybis320 commented 4 months ago

Ok, a moving average with a long window (~10) is promising. I still need to work on it, but here are some preliminary results:

Here's CV2 with a good SD card: US9999_20231217_072712_688922_detected_intervals_score_49

And CV2 with a bad SD card: US9999_20231205_005351_882274_intervals_score_36

Cybis320 commented 4 months ago

Directly plotting the moving average doesn't work well. How about color coding the scatter points based on whether or not they fall in a time period where the moving average exceeds a certain threshold? The above graphs would then look like this: US9999_20231217_072712_688922_detected_intervals_score_49 US9999_20231205_005351_882274_intervals_score_36

Cybis320 commented 4 months ago

Latest iteration:

The score has been renamed to Jitter Score.
Added a Dropped Frames score, which estimates the percentage of FF files likely to contain dropped frames. This is achieved by calculating a moving average over 50 FF files; any FF file with a timestamp during a time when the moving average exceeds 257/fps is tagged. The score is the fraction of untagged intervals over the total intervals. A score of 100% indicates that dropped frames are unlikely, while 0% means that all FF files contain at least one dropped frame. Intervals that are tagged are colored red. This method is not 100% accurate, as there is no perfectly accurate way to detect dropped frames with CV2.
Tweaked the layout, legend, labels, etc.
Files are now saved with both scores.
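The moving-average tagging described in the list above can be sketched like this. It is a simplified illustration, not the PR code: the name `droppedFramesScore` is hypothetical, the threshold is written as (256 + 1)/fps on the assumption that an FF file spans 256 frames, and the edge padding is one possible way to get one average per interval.

```python
import numpy as np


def droppedFramesScore(intervals, fps, window=50, frames_per_file=256):
    """Return (score, tagged): the percentage of untagged intervals and the tag mask.

    An interval is tagged when the moving average centred on it exceeds
    (frames_per_file + 1)/fps. Returns (None, None) if there is too little data.
    """
    intervals = np.asarray(intervals, dtype=float)
    if intervals.size < window:
        return None, None

    threshold = (frames_per_file + 1) / float(fps)

    # Repeat the edge values so the 'valid' convolution yields one average per interval
    left = window // 2
    right = window - 1 - left
    padded = np.pad(intervals, (left, right), mode="edge")
    moving_avg = np.convolve(padded, np.ones(window), "valid") / window

    tagged = moving_avg > threshold
    score = 100.0 * np.count_nonzero(~tagged) / tagged.size
    return score, tagged
```

The `tagged` mask has the same length as the input, so it can be used directly to colour the tagged points red in a scatter plot.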

I need to clean up the code before pushing it, but here are some sample plots:

US9999_20231217_072712_688922_detected_intervals_scores_49-100 US9999_20231205_005351_882274_intervals_scores_36-37

dvida commented 4 months ago

This is amazing, great work! I think we're 99% there. Here are a few suggestions:

Can you plot the median instead of the average interval? This will shift the focus from the outliers to the typical performance.
Make the +/- 1 frame lines a different colour for better visibility.
The scoring system is excellent. For clarity, I suggest the following label changes to indicate that higher is better for the jitter score. Also, I suggest changing the dropped frame score to 100% - score so anything to do with dropped frames being around 0 is good. I also suggest concisely quantifying the idea behind each score:
- Jitter Quality (intervals within +/-1 frame): 100%
- Dropped Frame Score (intervals >2 frames late within 50 FF files): 0%
Can the image size go down? The plots are over half a meg now, I'd like to see it go < 100 kb, ideally < 50 kb.

Cybis320 commented 4 months ago

All great observations and suggestions as usual. I agree with the idea to make the dropped frame number a lower-is-better value. I tend to associate 'score' with higher-is-better... Dropped Frame Rate? Lower the size?? But, I was going to frame this artwork! Hehe. I'll play with figure size, dpi and jpg compression to see how low I can go without losing readability.

dvida commented 4 months ago

Dropped Frame Rate

Sold!

Cybis320 commented 4 months ago

Mean, Median, Mode - make your choice! :) US9999_20231221_011659_875432_detected_intervals_scores_59-0

dvida commented 4 months ago

Thanks! Could you double-check the way the median is computed? It's very strange that it's so far off from the other two values.

Cybis320 commented 4 months ago

100 dpi > 107 KB, 150 dpi > 190 KB, 200 dpi > 289 KB, 250 dpi > 409 KB. They look the same in the browser, but look incrementally better when opened to full size.

100 dpi > 107 KB: US9999_20231221_011659_875432_detected_intervals_scores_59-0_100dpi

150 dpi > 190 KB: US9999_20231221_011659_875432_detected_intervals_scores_59-0_150dpi

200 dpi > 289 KB: US9999_20231221_011659_875432_detected_intervals_scores_59-0-200dpi

250 dpi > 409 KB: US9999_20231221_011659_875432_detected_intervals_scores_59-0_250dpi

Cybis320 commented 4 months ago

Good catch on the median! It helps to sort the list first... US9999_20231221_011659_875432_detected_intervals_scores_59-0

Cybis320 commented 4 months ago

US9999_20231205_005351_882274_intervals_scores_36-63

dvida commented 4 months ago

np.median is the fastest way btw, if you're doing anything else :) np.mean for the mean, etc. Let's just keep the median to make the plot less busy.

You can try setting a small figsize; perhaps that will help. I find it unusual that the PNGs are that big while other PNGs that RMS produces are quite small even though they're also quite busy. 150 dpi is probably the sweet spot.

dvida commented 4 months ago

Oh! And how about showing the median and the standard deviation of the scatter? E.g. 10.248 +/- 0.05 s / 1.1 frame. That way the data is fully characterized.
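One way such a label could be produced, as a rough sketch: the median and standard deviation in seconds, with the standard deviation also converted to frames by multiplying by fps. The function name `summarizeIntervals` and the exact label layout are assumptions for illustration.

```python
import numpy as np


def summarizeIntervals(intervals, fps):
    """Format the median and scatter of the intervals, e.g. for a plot label."""
    intervals = np.asarray(intervals, dtype=float)

    median = np.median(intervals)
    std_seconds = np.std(intervals)
    std_frames = std_seconds * fps  # the same scatter expressed in frame times

    return "{:.3f} +/- {:.2f} s / {:.1f} frames".format(median, std_seconds, std_frames)
```

Calling it on the night's intervals gives a single string that can be placed in the legend or title.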

Cybis320 commented 4 months ago

Like so? US9999_20231221_011659_875432_detected_intervals_scores_59-0

dvida commented 4 months ago

<chef's kiss> Do you have anything else planned or can I test and deploy?

Cybis320 commented 4 months ago

I think it's ready to test. Fingers crossed.

dvida commented 4 months ago

NL000M_20231215_161435_236036_ff_intervals

I made a few little tweaks I'll push now, but it works great! This is what I see for a normal camera. Most look like this or have a handful of longer dropouts in the night. But the bulk of the points are well within the 1-frame boundary.

UK00BA_20231227_164748_034400_ff_intervals

Cybis320 commented 4 months ago

Is there any alerting system that could send you and/or the operator messages if things go haywire?

g7gpr commented 4 months ago

Great work, can deploy on a few stations tonight, including some that I suspect have problems.

Cybis320 commented 4 months ago

And to analyze past performance, one can also run the utility from the command line on the ArchivedFiles folder, and it will generate plots for all subfolders.

python -m Utils.PlotTimeIntervals /home/pi/RMS_data/ArchivedFiles

g7gpr commented 4 months ago

Debian platform, multiple cameras, one username per camera.


(vRMS) au0028@pemberton:~/source/RMS$ python -m Utils.PlotTimeIntervals /home/au0028/RMS_data/ArchivedFiles
Loading config file: /home/au0028/RMS_data/ArchivedFiles/AU0028_20240204_114635_244720/.config
Processing /home/au0028/RMS_data/ArchivedFiles/AU0028_20240204_114635_244720/FS_AU0028_20240204_114635_244720_fieldsums.tar.bz2
Loading config file: /home/au0028/RMS_data/ArchivedFiles/AU0028_20240205_114542_005307/.config
Processing /home/au0028/RMS_data/ArchivedFiles/AU0028_20240205_114542_005307/FS_AU0028_20240205_114542_005307_fieldsums.tar.bz2
Loading config file: /home/au0028/RMS_data/ArchivedFiles/AU0028_20240206_114445_852284/.config
Processing /home/au0028/RMS_data/ArchivedFiles/AU0028_20240206_114445_852284/FS_AU0028_20240206_114445_852284_fieldsums.tar.bz2
Loading config file: /home/au0028/RMS_data/ArchivedFiles/AU0028_20240207_115627_393724/.config
Processing /home/au0028/RMS_data/ArchivedFiles/AU0028_20240207_115627_393724/FS_AU0028_20240207_115627_393724_fieldsums.tar.bz2
Loading config file: /home/au0028/RMS_data/ArchivedFiles/AU0028_20240207_114349_406279/.config
Processing /home/au0028/RMS_data/ArchivedFiles/AU0028_20240207_114349_406279/FS_AU0028_20240207_114349_406279_fieldsums.tar.bz2
Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/au0028/source/RMS/Utils/PlotTimeIntervals.py", line 267, in <module>
    plotFFTimeIntervals(root, fps)
  File "/home/au0028/source/RMS/Utils/PlotTimeIntervals.py", line 137, in plotFFTimeIntervals
    combined_condition = above_threshold & above_expected_interval
ValueError: operands could not be broadcast together with shapes (83,) (17,)

g7gpr commented 4 months ago

The capture directory which caused the error was terminated shortly after capture started, this is why there were two captures on the same night. Would you like a copy of that directory?

g7gpr commented 4 months ago

AU0028_20240204_114635_244720_ff_intervals AU0028_20240205_114542_005307_ff_intervals AU0028_20240206_114445_852284_ff_intervals AU0028_20240207_115627_393724_ff_intervals

g7gpr commented 4 months ago

au000u@pioneer:~/source/RMS$ source ~/vRMS/bin/activate
(vRMS) au000u@pioneer:~/source/RMS$ python -m Utils.PlotTimeIntervals ~/RMS_data/ArhivedFiles/
(vRMS) au000u@pioneer:~/source/RMS$ python -m Utils.PlotTimeIntervals ~/RMS_data/ArchivedFiles/
Loading config file: /home/au000u/RMS_data/ArchivedFiles/AU000U_20240123_114744_582540/.config
Processing /home/au000u/RMS_data/ArchivedFiles/AU000U_20240123_114744_582540/FS_AU000U_20240123_114744_582540_fieldsums.tar.bz2
Loading config file: /home/au000u/RMS_data/ArchivedFiles/AU000U_20240125_114644_582696/.config
Processing /home/au000u/RMS_data/ArchivedFiles/AU000U_20240125_114644_582696/FS_AU000U_20240125_114644_582696_fieldsums.tar.bz2
Loading config file: /home/au000u/RMS_data/ArchivedFiles/AU000U_20240204_114011_242936/.config
Processing /home/au000u/RMS_data/ArchivedFiles/AU000U_20240204_114011_242936/FS_AU000U_20240204_114011_242936_fieldsums.tar.bz2
Loading config file: /home/au000u/RMS_data/ArchivedFiles/AU000U_20240201_114224_817694/.config
Processing /home/au000u/RMS_data/ArchivedFiles/AU000U_20240201_114224_817694/FS_AU000U_20240201_114224_817694_fieldsums.tar.bz2
Loading config file: /home/au000u/RMS_data/ArchivedFiles/AU000U_20240124_114714_747828/.config
Processing /home/au000u/RMS_data/ArchivedFiles/AU000U_20240124_114714_747828/FS_AU000U_20240124_114714_747828_fieldsums.tar.bz2
Loading config file: /home/au000u/RMS_data/ArchivedFiles/AU000U_20240131_114307_048976/.config
Processing /home/au000u/RMS_data/ArchivedFiles/AU000U_20240131_114307_048976/FS_AU000U_20240131_114307_048976_fieldsums.tar.bz2
Loading config file: /home/au000u/RMS_data/ArchivedFiles/AU000U_20240203_063937_481003/.config
Processing /home/au000u/RMS_data/ArchivedFiles/AU000U_20240203_063937_481003/FS_AU000U_20240203_063937_481003_fieldsums.tar.bz2
/home/au000u/vRMS/lib/python3.9/site-packages/numpy/core/fromnumeric.py:3432: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
/home/au000u/vRMS/lib/python3.9/site-packages/numpy/core/_methods.py:190: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
/home/au000u/vRMS/lib/python3.9/site-packages/numpy/core/_methods.py:265: RuntimeWarning: Degrees of freedom <= 0 for slice
  ret = _var(a, axis=axis, dtype=dtype, out=out, ddof=ddof,
/home/au000u/vRMS/lib/python3.9/site-packages/numpy/core/_methods.py:223: RuntimeWarning: invalid value encountered in divide
  arrmean = um.true_divide(arrmean, div, out=arrmean, casting='unsafe',
/home/au000u/vRMS/lib/python3.9/site-packages/numpy/core/_methods.py:257: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/au000u/source/RMS/Utils/PlotTimeIntervals.py", line 267, in <module>
    plotFFTimeIntervals(root, fps)
  File "/home/au000u/source/RMS/Utils/PlotTimeIntervals.py", line 102, in plotFFTimeIntervals
    moving_avg = np.convolve(intervals_np, np.ones(ma_window_size), 'valid')/ma_window_size
  File "<__array_function__ internals>", line 180, in convolve
  File "/home/au000u/vRMS/lib/python3.9/site-packages/numpy/core/numeric.py", line 849, in convolve
    raise ValueError('v cannot be empty')
ValueError: v cannot be empty

AU000U_20240123_114744_582540_ff_intervals AU000U_20240124_114714_747828_ff_intervals AU000U_20240125_114644_582696_ff_intervals AU000U_20240131_114307_048976_ff_intervals AU000U_20240201_114224_817694_ff_intervals AU000U_20240204_114011_242936_ff_intervals

Interesting how the dropout occurs at about the same time each night!

g7gpr commented 4 months ago

I wonder if this is the camera automatic reboot?

python -m Utils.CameraControl SetAutoReboot Everyday,15

By the way, this is absolutely brilliant!

markmac99 commented 4 months ago

@dvida @Cybis320 looks good. Ran it on two of my production systems with no issues. It would be good to put this on the weblog, once it's available.

Might be worth fixing the Y axis scale to a range of 0-20 so that you can easily compare between days. Some points could then be off the top of the range, but they could be artificially forced in-range for display purposes. I thought UK002F was much worse than UK0006 on 07/dec, but it turned out that the scales were different (8-12 vs 0-20 s!). 2F was actually better, but because of the scale change the scatter was more evident, and at a casual glance I mistook it for worse data.

Do we have a view on what a "bad" plot would look like? Would be useful to add a comment if we think the plot requires investigation. More than 5% possible dropped frames? Jitter worse than 80%?

UK0006 produced one day out of five where jitter quality dropped below 90% and poss dropped frames rose above 4%. UK002F also produced one day (different to 6) where JQ dropped below 90% but dropped frames was zero on that day. Neither camera seems to be generating bad data.

markmac99 commented 4 months ago

Graphs in case of interest UK0006_20240203_173148_513170_ff_intervals UK0006_20240204_173331_412120_ff_intervals UK0006_20240205_173514_827221_ff_intervals UK0006_20240206_173658_910097_ff_intervals UK0006_20240207_173842_246313_ff_intervals UK002F_20240203_173148_041874_ff_intervals UK002F_20240204_173330_896873_ff_intervals UK002F_20240205_173514_634923_ff_intervals UK002F_20240206_173658_558385_ff_intervals UK002F_20240207_173843_059439_ff_intervals

markmac99 commented 4 months ago

@g7gpr David - yes, that sets the camera to auto-reboot at 1500 according to its internal timestamp. The camera takes a few milliseconds to reboot so there should be little data loss (maybe one or two frames). The next line sets the camera's internal clock to the current time on the computer from which you run the script.

python -m Utils.CameraControl CameraTime set

My original instructions on the wiki told users to run this script on their PC, which ensured that the camera picked up their local time so that 1500 was well outside potential data capture times. Alternatively you can set it to any arbitrary time:

python -m Utils.CameraControl CameraTime set 20240207_012334

would set it to 01:23:34 on the 7th of Feb 2024
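To make the timestamp layout explicit, the string in the command above reads as YYYYMMDD_HHMMSS. The snippet below only illustrates that format with the standard library; it makes no claim about how Utils.CameraControl parses the argument internally.

```python
from datetime import datetime

# Same string as in the command above, read as YYYYMMDD_HHMMSS
camera_time = datetime.strptime("20240207_012334", "%Y%m%d_%H%M%S")

print(camera_time.isoformat())  # 2024-02-07T01:23:34
```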

Cybis320 commented 4 months ago

The capture directory which caused the error was terminated shortly after capture started, this is why there were two captures on the same night. Would you like a copy of that directory?

Thank you for the report. The code was attempting to compute the moving average when there was not enough data to do so. I just pushed a fix for that.
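For context on the two tracebacks: np.convolve raises a ValueError when either input is empty, and in 'valid' mode it returns max(M, N) - min(M, N) + 1 values, so a night with fewer intervals than the window produces an array whose length no longer matches the interval array. A guard along the following lines avoids both cases. This is a sketch of the idea, not the fix that was pushed, and `movingAverage` is a hypothetical name.

```python
import numpy as np


def movingAverage(intervals, window):
    """Moving average in 'valid' mode, or None when there is not enough data."""
    intervals = np.asarray(intervals, dtype=float)

    # Fewer points than the window (including an empty array): nothing meaningful to compute
    if window < 1 or intervals.size < window:
        return None

    # The result has intervals.size - window + 1 elements
    return np.convolve(intervals, np.ones(window), "valid") / window
```

Callers then check for None and skip the dropped-frame estimate for very short capture sessions.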

dvida commented 4 months ago

I agree with Mark that some fixed limits should be used. I added a plot with residuals from the expected below the main plot with fixed limits to +/- 2 frames. This way even if there is one large outlier, the rest of the points will be visible well. UK00B8_20231225_164617_621911_ff_intervals UK00BA_20231227_164748_034400_ff_intervals AU0044_20240118_115030_630120_ff_intervals
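As an illustration of the residual idea, the sketch below converts each interval into a residual in frames relative to the expected interval, and also returns a copy clipped to +/- 2 frames so that outliers are drawn at the edge of a fixed-limit panel, along the lines of what Mark suggested. The names are hypothetical, the 256 frames per FF file is an assumption, and whether the actual plot clips points or simply fixes the axis limits is not stated in the thread.

```python
import numpy as np


def residualsInFrames(intervals, fps, frames_per_file=256, limit_frames=2.0):
    """Return (residuals, clipped): deviations from the expected interval, in frames."""
    intervals = np.asarray(intervals, dtype=float)

    expected = frames_per_file / float(fps)    # assumed nominal FF spacing in seconds
    residuals = (intervals - expected) * fps   # seconds converted to frames

    # Clipped copy for display: outliers sit on the +/- limit instead of stretching the axis
    clipped = np.clip(residuals, -limit_frames, limit_frames)
    return residuals, clipped
```

Keeping the unclipped residuals alongside the clipped ones means the statistics are still computed on the real values.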

Cybis320 commented 4 months ago

That's awesome!

Cybis320 commented 4 months ago

Showing off the GStreamer method 😄 US9999_20240208_013640_169265_ff_intervals

Cybis320 commented 4 months ago

Would you be opposed to adding back some indication in the plot filename that something may be amiss? For example, appending a flag word when the dropped frame rate is not zero, or is above a fairly low threshold. Right now, one would have to open and view each plot to discover an issue. It would be convenient to just search for the keyword to find potential issues. However, I understand if we prefer not to cause concern with a new feature. If we do add a flag, here are some options: _df (I think that wouldn't be very search friendly), _fault, _dropped_frames, _flagged, etc.

dvida commented 4 months ago

The reason why I removed it is because all file names should be consistent and identical. The meteor stack isn't but that was a mistake made early on. However, I would fully support the addition of a report file with the numbers and a list of all intervals. If that is implemented, I suggest having both the read and write functions available.

The idea is then that the weblog code can go in and read in a bunch of info and generate a report and show a camera status summary.
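A report with paired write and read functions might look like the sketch below. Everything about it is an assumption made for illustration: the JSON format, the field names, and the function names are not from the PR, and the project may prefer a different layout.

```python
import json


def writeIntervalsReport(file_path, jitter_quality, dropped_frame_rate, intervals):
    """Save the two numbers and the full list of intervals (in seconds) as JSON."""
    report = {
        "jitter_quality": jitter_quality,
        "dropped_frame_rate": dropped_frame_rate,
        "intervals": [float(dt) for dt in intervals],
    }
    with open(file_path, "w") as f:
        json.dump(report, f, indent=2)


def readIntervalsReport(file_path):
    """Load a report written by writeIntervalsReport and return it as a dict."""
    with open(file_path) as f:
        return json.load(f)
```

Something like the weblog code could then call `readIntervalsReport` for each night and build a camera status summary without opening the plots.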

CroatianMeteorNetwork / RMS

Plot, score, and log timestamp intervals for analysis #260