Closed: banjtheman closed this issue 3 years ago
Have created a script (https://github.com/CharlotteJackson/DC_Crash_Bot/blob/audio_transcirbe/scripts/transcribe_audio.py) that does the following...
Not sure where the talk group is mapped in openmhz.
- In the "talk group" key of the API response. We're interested in talk group 101 (dispatch) and 728/729 (EMS 5 and 6).
Need to get an estimate of how many calls we will transcribe; there are 60 free minutes a month on cloud services.
- 500 car crash calls a month, give or take; say each dispatch call is 30 seconds.
- Perhaps we can use AWS and Google to get 120 minutes of free transcription.
Is it possible to geotag calls?
- Hopefully we can map this data to the Pulsepoint API, which has the geotag, using call time and unit numbers.
Lots of calls are short; is there value in transcribing these calls?
- Probably not. Dispatch is going to be most important.
How often do we want to check for calls?
- Scrape, say, once an hour?
How long are calls stored in openmhz?
- For the past 30 days.
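As a rough check on the estimate above, the quoted figures (about 500 calls a month, roughly 30 seconds each, 60 free minutes per cloud service) can be plugged into a few lines of Python. This is only a sketch: the function name is made up for illustration, and it assumes every call lasts exactly 30 seconds.

```python
def monthly_transcription_minutes(calls_per_month: int, seconds_per_call: float) -> float:
    """Total minutes of audio to transcribe in one month."""
    return calls_per_month * seconds_per_call / 60


# Figures taken from the discussion above; both are rough guesses.
needed = monthly_transcription_minutes(calls_per_month=500, seconds_per_call=30)
free_one_provider = 60    # free minutes a month on one cloud service
free_two_providers = 120  # AWS and Google combined, as suggested above

print(f"needed: {needed:.0f} min")
print(f"over one free tier by: {needed - free_one_provider:.0f} min")
print(f"over two free tiers by: {needed - free_two_providers:.0f} min")
```

With these numbers the workload comes to about 250 minutes a month, so even two free tiers would not cover it unless short or irrelevant calls are filtered out before transcription.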
Made update, can run the following workflow
The next steps will be
Here is an example output
{
  "id": "609c91e7c565b14d6ccb05f3",
  "source": 101,
  "audio_url": "https://s3.us-east-2.wasabisys.com/openmhz/media/dcfd-101-1620873678.m4a",
  "timestamp": "2021-05-13T02:41:18.000Z",
  "call_length": 19,
  "transcribed_audio": "Medical Local 26 respond to L. S. Person down 14th Rhode Island Avenue Northeast offered on channel 0 11. Medical. Local 26 respond to L. S. A. Person down 14 to Rhode Island Avenue Northeast station will be in a black escalade 7 11 parking lot operate on channel 0 11. At 22 41."
}
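A record shaped like the example output could be filtered before it is stored or matched against other data. The sketch below is not taken from the script: it assumes that "source" holds the talk group number (the value 101 and the "dcfd-101" file name suggest this, but it is not confirmed), and the names `is_worth_keeping`, `call_time`, and the 10-second cutoff `MIN_CALL_SECONDS` are hypothetical.

```python
from datetime import datetime, timezone

# Talk groups named in the discussion: 101 (dispatch), 728/729 (EMS 5 and 6).
TALK_GROUPS_OF_INTEREST = {101, 728, 729}
MIN_CALL_SECONDS = 10  # hypothetical cutoff for calls too short to be useful


def is_worth_keeping(record: dict) -> bool:
    """Keep records from the talk groups of interest that are not too short."""
    return (
        record.get("source") in TALK_GROUPS_OF_INTEREST  # assumes source == talk group
        and record.get("call_length", 0) >= MIN_CALL_SECONDS
    )


def call_time(record: dict) -> datetime:
    """Parse the timestamp (ISO 8601 with a trailing 'Z') into an aware UTC datetime."""
    parsed = datetime.strptime(record["timestamp"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


# Subset of the example record shown above.
example = {
    "id": "609c91e7c565b14d6ccb05f3",
    "source": 101,
    "timestamp": "2021-05-13T02:41:18.000Z",
    "call_length": 19,
}

print(is_worth_keeping(example))
print(call_time(example).isoformat())
```

Parsing the timestamp into a timezone-aware value is a design choice that would make it easier to compare call times with another source, such as the Pulsepoint data mentioned earlier.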
whooooo hooo we got it running! :)
What is the Task?
We want to be able to transcribe audio files from openmhz.
Why do we want to do this?
To capture radio data.
How can I get started?
TODO
Definition of Done
Transcribed audio data is stored in the database