SlangLab-NU / VoiceCollector

Apache License 2.0

Disable speech recognition #18

Closed by aanchan 11 months ago

aanchan commented 11 months ago

Edit (Aug 11, 2023): this code now depends on Minio.

To get Minio installed, refer to this comment and the associated PR.

TODO: README needs updating.

I have disabled speech recognition by deleting some code that does transcription, and then providing some quality metrics on the audio files. This also required modifying the schema and tables. Evidence from a successful test run is attached below.

These are the logs from my backend code:

flask run --debug
 * Serving Flask app '__init__.py'
 * Debug mode: on
Level: INFO
Time: 2023-08-10 10:17:19,191
Logger: werkzeug
Path: _internal:224
Function :_log
Message: WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:5000

Level: INFO
Time: 2023-08-10 10:17:19,191
Logger: werkzeug
Path: _internal:224
Function :_log
Message: Press CTRL+C to quit

Level: INFO
Time: 2023-08-10 10:17:19,192
Logger: werkzeug
Path: _internal:224
Function :_log
Message:  * Restarting with stat

Level: WARNING
Time: 2023-08-10 10:17:19,431
Logger: werkzeug
Path: _internal:224
Function :_log
Message:  * Debugger is active!

Level: INFO
Time: 2023-08-10 10:17:19,438
Logger: werkzeug
Path: _internal:224
Function :_log
Message:  * Debugger PIN: 140-917-433

Level: INFO
Time: 2023-08-10 10:18:33,018
Logger: werkzeug
Path: _internal:224
Function :_log
Message: 127.0.0.1 - - [10/Aug/2023 10:18:33] "GET /api/v1/speak/get_reference HTTP/1.1" 200 -

Level: INFO
Time: 2023-08-10 10:18:33,021
Logger: werkzeug
Path: _internal:224
Function :_log
Message: 127.0.0.1 - - [10/Aug/2023 10:18:33] "GET /api/v1/speak/get_reference HTTP/1.1" 200 -

Level: INFO
Time: 2023-08-10 10:18:58,321
Logger: speak
Path: length_check:9
Function :get_audio_length
Message: duration (s): 3.54

0000000000000000000000000000000000000001111111111+(1.1700000000000008)11111111111111111111111111111111111111111111111111111111111111111110-(3.509999999999992)
Level: INFO
Time: 2023-08-10 10:18:58,324
Logger: speak
Path: silence_check:165
Function :get_silence_ratio
Message: Voiced frames count: 77

Level: INFO
Time: 2023-08-10 10:18:58,324
Logger: speak
Path: silence_check:166
Function :get_silence_ratio
Message: Total frames count: 117

Level: INFO
Time: 2023-08-10 10:18:58,324
Logger: speak
Path: silence_check:168
Function :get_silence_ratio
Message: Silence ratio: 0.34

Level: INFO
Time: 2023-08-10 10:18:59,449
Logger: werkzeug
Path: _internal:224
Function :_log
Message: 127.0.0.1 - - [10/Aug/2023 10:18:59] "POST /api/v1/speak/submit/test10.874510967740258.webm HTTP/1.1" 200 -
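
The silence-check numbers in the log above can be cross-checked by hand. The sketch below is not the project's get_silence_ratio; it assumes the ratio is 1 - voiced/total, which reproduces the logged 0.34 for 77 voiced frames out of 117. The 30 ms frame length is also an assumption, inferred from the fact that 117 frames of 0.03 s end at 3.51 s, matching the end marker printed just before the frame counts.

```python
# Hypothetical helper: the real get_silence_ratio is not shown in this issue.
def silence_ratio(voiced_frames: int, total_frames: int) -> float:
    """Fraction of frames that were not classified as voiced."""
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    return 1.0 - voiced_frames / total_frames


# Assumption: 30 ms frames, inferred from 117 frames ending at ~3.51 s.
FRAME_SECONDS = 0.03

# Counts taken from the log records above.
ratio = silence_ratio(voiced_frames=77, total_frames=117)
framed_duration = 117 * FRAME_SECONDS

print(f"Silence ratio: {ratio:.2f}")
print(f"Framed duration (s): {framed_duration:.2f}")
```

The framed duration comes out slightly shorter than the logged 3.54 s, which would be expected if a trailing partial frame is dropped.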

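For anyone reproducing these logs, the multi-line record layout (Level, Time, Logger, Path, Function, Message) can be approximated with the standard logging module. This is a minimal sketch: the format string is reconstructed from the output above, not copied from the repository, and make_logger and get_audio_length_demo are hypothetical names.

```python
import io
import logging

# Assumed format string, reconstructed from the layout of the records above.
LOG_FORMAT = (
    "Level: %(levelname)s\n"
    "Time: %(asctime)s\n"
    "Logger: %(name)s\n"
    "Path: %(module)s:%(lineno)d\n"
    "Function :%(funcName)s\n"
    "Message: %(message)s\n"
)


def make_logger(name, stream):
    """Return a logger that writes multi-line records to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger = logging.getLogger(name)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the record out of the root logger
    return logger


def get_audio_length_demo(logger, duration):
    """Hypothetical stand-in that logs a duration like the record above."""
    logger.info("duration (s): %.2f", duration)


buffer = io.StringIO()
get_audio_length_demo(make_logger("speak", buffer), 3.54)
print(buffer.getvalue())
```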
This is a screenshot from the front end:

Screen Shot 2023-08-10 at 11 20 59 AM

This is the file metadata being recorded in the sqlite database.

Screen Shot 2023-08-10 at 11 22 08 AM

These are the sqlite tables:

Screen Shot 2023-08-10 at 11 23 28 AM

Audio file uploaded to S3:

Screen Shot 2023-08-10 at 11 27 35 AM