Open Dygear opened 3 years ago
(copied from comment on the PR)
How about having the information passed on as an extra parameter (or parameters?) to the uploadScript? This way, you could do whatever you want with the data, from appending it to a log file to loading it into a database. As a plus, you could make sure the data is written once audio encoding is complete, and avoid cases of entries being consumed before the file is ready (the Web page won't link to it if it's not there at the moment).
As I said in my original comment, if I were creating the project, I'd have all data writing be external, and the default uploadScript be something like (<encode $1> && <write $2 ($3, $4, etc?) to daily log or JSON> && <curl -header="$2(,$3,$4,etc)" $1 uploadServer>) || <write to error log>
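A minimal sketch of that chain (encode, then write metadata, then upload, with an error log as the fallback), written in Python rather than shell. The function name `handle_call`, the injected `encode` and `upload` callables, and the log file names are all hypothetical and not part of Trunk-Recorder; the only point is the ordering, so metadata is written only after encoding has succeeded.

```python
import json
import time
from pathlib import Path


def handle_call(audio_path, metadata, encode, upload, log_dir):
    """Run encode -> write metadata -> upload; log the failure if any step raises.

    `encode` and `upload` are hypothetical callables supplied by the caller,
    standing in for the <encode> and <curl> steps of the pseudo-script.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    try:
        # Step 1: encode the audio; an exception here skips everything below.
        encoded = encode(audio_path)

        # Step 2: append the metadata to a daily log, only after encoding worked.
        day = time.strftime("%Y-%m-%d")
        with open(log_dir / f"{day}.jsonl", "a") as log:
            log.write(json.dumps({"file": str(encoded), **metadata}) + "\n")

        # Step 3: upload the encoded file together with its metadata.
        upload(encoded, metadata)
        return True
    except Exception as exc:
        # Equivalent of the "|| <write to error log>" branch.
        with open(log_dir / "error.log", "a") as err:
            err.write(f"{audio_path}: {exc!r}\n")
        return False
```

Because a log entry is written only after `encode` returns, a consumer reading the daily log should never see an entry for a file that is not there yet.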
This is great @Dygear - there is so much interesting data you can pull from being able to do this higher level analysis. I am fully on board with making it easier to direct all this metadata into good data stores.
I moved over MimoSDR to use an SQLite database. I made a little script to convert the JSON files into a single database for me. It should be noted that I ended up writing about 1TB of data to my drive because of the page write size on my computer, even though the SQLite database it produced was only 800MB. I wrote to the same file about 8 million times with this script because it needs the index from the audio table to fill out the rest of the information in the other tables.
MimoSDR.db (800MB SQLite Database) January 1st 2020 - June 3rd 2021 ~ 4 Million Audio Records
I had a good idea from @tannewt of using the BTRFS file system for the flat json files, as it implements compression. As my accidental production server for this right now, a Raspberry Pi 4 8GB, is getting co-lo'ed at End Office in Boston MA, I think the best way to handle this is to add 2x 512GB SanDisk Ultra Fit drives into the USB3 ports, combine them with Linux software RAID into a 1TB drive, and then make that the BTRFS partition mounted as /mnt/audio. That way I don't have to worry about trashing the Samsung Pro Endurance 128GB SD Card that's running the Web Server. That should maintain viability of the system even on very cheap hardware.
Mark,
On Monday, June 28, 2021, Mark Tomlin @.***> wrote:
I moved over MimoSDR to use an SQLite database. I made a little script to convert the JSON files into a single database for me. https://gist.github.com/Dygear/2bc9fb54855f45b61a782b56cec5fbb2 It should be noted that I ended up writing about 1TB of data to my drive because of page write size on my computer, even tho the SQLite database it produced was only 800MB, I wrote to the same file about 8 million times with this script because it needs the index from the audio table to fill out the rest of the information in the other tables.
Waaaaay cool!
I had a good idea from @tannewt https://github.com/tannewt of using the BTRFS file system for the flat json files as it implements compression. As my accidental production server for this right now as a Raspberry Pi 4 8GB is getting co-lo'ed at End Office in Boston MA, I think the best way to handle this is to add 2x 512GB SanDisk Ulta Fit into the USB3 ports and Linux Software RAID that into a 1TB drive then make that the BTRFS partition as /mnt/audio. That way I don't have to worry about trashing the Samsung Pro Endurance 128GB SD Card that's running the Web Server. That should maintain viability of the system even on very cheap hardware.
Keep up the great work!
This could be handled as a plugin. You could load the talkgroups from SQLite using the setup_system or setup_systems methods, and store the data using the signal, call_* or unit_* methods. The Talkgroups class should be expanded to have an add method that takes all the parameters (or a Talkgroup class instance).
Sorry -- Apparently I fell off the end of the world around November.
Setting up Trunk-Recorder by using the SQLite database would be super cool. It would be even better if I could tell it which site it is, so that it configures itself to only listen to / track audio that is assigned to that site. I am going to make an SQLite plugin for Trunk-Recorder. I think it's generally a good thing to have. The stats have been invaluable to me.
@robotastic Can you please assign this issue to me?
I started this in https://github.com/robotastic/trunk-recorder/pull/482 where I had mentioned that I'm saving data for each recorded call into an SQLite database. Currently, I'm using the upload script to build a running version of this file as calls are captured by the system. I think it is actually incredibly useful to have an SQLite database attached, if for no other reason than running statistical analysis of the system over a much longer period of time. I was thinking that adding a config item for saving data to an SQLite database would be useful.
Something like a SQLite database that sits along side this might be a good idea. I'm currently saving all of the file information into an SQLite database to keep track of everything. It generally makes the query time for a page load much, much smaller.
I save each audio file into the audio table, and if there are sources and frequencies inside of the json file I save these out as well, with each item in that array getting a row inside the database. You just have to be sure to save the audio table entry first, because you need that insert ID (audioId from the audio table) for the audio_sources and the audio_frequencies tables, as they are foreign keys there. With all of this information in hand, you can track units (or radio IDs) across calls and return when a unit is talking. This is super useful for me when I'm dispatching, as I can quickly go back and see what a unit has said when / if I miss it.
But if you wanted to get all of the calls so far in the day for a page load, you could simply do ...
SELECT * FROM audio WHERE timeStart > strftime('%s', date('now') || ' ' || time('now', 'start of day'));
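A sketch of the insert order described above, using Python's `sqlite3` module. Only `audioId`, `timeStart` and `freq` are named in this thread; the other columns (`talkgroup`, `src`) and the JSON keys (`start_time`, `talkgroup`, `srcList`, `freqList`) are assumptions about the layout, so adjust them to the real schema and call files.

```python
import sqlite3

# Assumed minimal schema: audioId, timeStart and freq come from the thread,
# every other column name is a placeholder.
SCHEMA = """
CREATE TABLE audio (
    audioId   INTEGER PRIMARY KEY,
    timeStart INTEGER NOT NULL,
    talkgroup INTEGER
);
CREATE TABLE audio_sources (
    audioId INTEGER NOT NULL REFERENCES audio(audioId),
    src     INTEGER NOT NULL
);
CREATE TABLE audio_frequencies (
    audioId INTEGER NOT NULL REFERENCES audio(audioId),
    freq    INTEGER NOT NULL
);
"""


def open_db(path=":memory:"):
    """Open a database and create the assumed tables."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def save_call(conn, call):
    """Insert the audio row first, then the child rows that need its insert ID."""
    with conn:  # one transaction: commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO audio (timeStart, talkgroup) VALUES (?, ?)",
            (call["start_time"], call.get("talkgroup")),
        )
        audio_id = cur.lastrowid  # the audioId the child tables refer to
        conn.executemany(
            "INSERT INTO audio_sources (audioId, src) VALUES (?, ?)",
            [(audio_id, s["src"]) for s in call.get("srcList", [])],
        )
        conn.executemany(
            "INSERT INTO audio_frequencies (audioId, freq) VALUES (?, ?)",
            [(audio_id, f["freq"]) for f in call.get("freqList", [])],
        )
    return audio_id


def calls_today(conn):
    """All calls since the start of the current day (UTC, like date('now'))."""
    return conn.execute(
        "SELECT * FROM audio "
        "WHERE timeStart >= CAST(strftime('%s', 'now', 'start of day') AS INTEGER)"
    ).fetchall()
```

The `strftime('%s', 'now', 'start of day')` form is a shorter way to get the same midnight timestamp as the query above; both work in UTC unless a `'localtime'` modifier is added.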
You can also count those rows for page setup and then keep a running tally on the page with a WebSocket event that adds 1 to the count each time a new audio clip comes in.
SELECT COUNT(freq) AS COUNT, freq FROM audio_frequencies GROUP BY freq;
This is good to see where you should be focusing your antenna band, or putting your best software defined radio. There is a fairly big spike in the number of calls handled by a small number of frequencies, as you can see by this table.
852850000, 852925000 & 853125000. The data was collected from January 1st 2020 to July 1st 2020 and only really captures the talkgroups that I'm interested in. We could see with a more detailed look over the data that there might be an affinity for a talkgroup to always capture near the same frequencies. This whole thing allows for a lot of interesting questions to be asked and hopefully answered. This raises some interesting questions.
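As a sketch of how the per-frequency tally and the talkgroup/frequency affinity question could be explored, the two queries below run against the same audio and audio_frequencies tables. The `talkgroup` column and the rows in `demo_db` are made up for illustration and are not data from this system.

```python
import sqlite3


def demo_db():
    """Build a tiny in-memory database with made-up rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE audio (audioId INTEGER PRIMARY KEY, talkgroup INTEGER);
        CREATE TABLE audio_frequencies (audioId INTEGER NOT NULL, freq INTEGER NOT NULL);
        INSERT INTO audio (audioId, talkgroup) VALUES (1, 100), (2, 100), (3, 200);
        INSERT INTO audio_frequencies (audioId, freq) VALUES
            (1, 852850000), (2, 852850000), (3, 852925000), (3, 852850000);
    """)
    return conn


def calls_per_frequency(conn):
    """Busiest frequencies first: one row per frequency with its call count."""
    return conn.execute(
        "SELECT freq, COUNT(*) AS calls FROM audio_frequencies "
        "GROUP BY freq ORDER BY calls DESC, freq"
    ).fetchall()


def talkgroup_frequency_affinity(conn):
    """For each talkgroup, how often it landed on each frequency."""
    return conn.execute(
        "SELECT a.talkgroup, f.freq, COUNT(*) AS calls "
        "FROM audio AS a JOIN audio_frequencies AS f ON f.audioId = a.audioId "
        "GROUP BY a.talkgroup, f.freq "
        "ORDER BY a.talkgroup, calls DESC, f.freq"
    ).fetchall()
```

A talkgroup whose calls cluster on one or two frequencies would show up in the second query as a few rows with high counts and a long tail of small ones.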
So far my dataset is around 700,000 calls over that 6-month period. There was a small bug in the parsing script that I made to take in all of this data: about 24 hours into its run, it crashed, I think because a json file was missing and I didn't handle that case in the code. I'm going to restart it once I get to work so it can continue crunching on the data, but I'll probably do it in day batches with a commit to the database after each batch, because there is simply so much data and I'm flogging my NVMe drive right now.
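A rough sketch of a batched import that tolerates a missing or unreadable json file. This is not the actual conversion script: the function name `import_calls`, the `start_time` key and the single-column insert are assumptions, and the batch here is a fixed row count rather than a calendar day. It expects an audio table with a timeStart column to exist already.

```python
import json
import sqlite3
from pathlib import Path


def import_calls(conn, json_paths, batch_size=1000):
    """Insert one audio row per JSON file, committing once per batch.

    Returns (number of rows imported, list of paths that were skipped).
    """
    imported = 0
    pending = 0
    skipped = []
    for path in json_paths:
        try:
            call = json.loads(Path(path).read_text())
        except (FileNotFoundError, json.JSONDecodeError):
            # A missing or corrupt file is recorded and skipped instead of
            # crashing a run that may already be hours in.
            skipped.append(str(path))
            continue
        conn.execute(
            "INSERT INTO audio (timeStart) VALUES (?)", (call["start_time"],)
        )
        imported += 1
        pending += 1
        if pending >= batch_size:
            conn.commit()  # one commit for the whole batch, not one per row
            pending = 0
    conn.commit()  # flush the final partial batch
    return imported, skipped
```

Grouping many inserts into one transaction means far fewer commits than committing after every row, which is the usual way to cut down on repeated writes to the same database file.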
It would probably also be useful to have the recorder device saved as well, so we make a table for all of the recorders we've seen.
This allows us to profile an SDR over time and correlate the information with temperature and frequency. Are we getting more spikes with higher temps outside? What about when it rains? What if this recorder has had an increasingly higher rolling 24-hour average for spikes? Maybe it's about to die and we need to replace it soon. All of these questions could be answered. All we actually need to do is save the sdrId and make it a foreign key inside of the audio_frequencies table. Maybe also the audio table as well for conventional transmissions, but it would require a more manual look at the audio file. But at least it's possible now!
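One possible shape for that, as a sketch only: a recorder table keyed by sdrId, referenced from audio_frequencies, plus a query for the average spike count per recorder over the last 24 hours. The table name `sdr` and the `label`, `time` and `spikeCount` columns are assumptions; only `sdrId` and `audio_frequencies` come from the discussion above.

```python
import sqlite3

# Assumed layout: one row per recorder, and a per-frequency row that records
# which recorder captured it and how many spikes were seen.
SCHEMA = """
CREATE TABLE sdr (
    sdrId INTEGER PRIMARY KEY,
    label TEXT
);
CREATE TABLE audio_frequencies (
    audioId    INTEGER NOT NULL,
    sdrId      INTEGER REFERENCES sdr(sdrId),
    freq       INTEGER NOT NULL,
    time       INTEGER NOT NULL,
    spikeCount INTEGER NOT NULL DEFAULT 0
);
"""


def average_spikes_last_24h(conn, now):
    """Average spike count per recorder in the 24 hours up to `now` (Unix time)."""
    return conn.execute(
        "SELECT sdrId, AVG(spikeCount) FROM audio_frequencies "
        "WHERE time > ? - 86400 AND time <= ? "
        "GROUP BY sdrId ORDER BY sdrId",
        (now, now),
    ).fetchall()
```

Running this on a schedule and storing the result would give the rolling trend per recorder; joining against an outside temperature or weather log would be a separate step.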