This PR introduces a series of scripts to pre-process audio data from Praat annotations. The main functionalities include:
audio/fix_audio_header.py: Repairs audio files by filling in missing information in their headers.
audio/extract_vocalic_features.py: Uses openSMILE to extract vocalic features from audio files and saves them in .csv files.
audio/standardize_silence_periods.py: Merges utterances separated by less than a specified number of seconds in Praat annotation files. The merged utterances are annotated in a new tier in the generated Praat annotation files.
audio/transcribe_utterances.py: Transcribes audio files with Whisper, using Praat annotation files as a guide to locate the audio segments that contain sound. The transcriptions are annotated as a new tier in the generated Praat annotation files.
audio/label_transcriptions.py: Generates labels by calling the Dialog Agent's API, using the Praat annotation files with transcriptions (produced by audio/transcribe_utterances.py) as a guide. The labels are annotated as a new tier in the generated Praat annotation files.
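
For reviewers, the merging step described for audio/standardize_silence_periods.py can be pictured with the minimal sketch below. It is not the code in this PR: the function name `merge_close_utterances`, the `(start, end, label)` tuple representation, and the choice to join labels with a space are all assumptions made for illustration, and the sketch does not read or write Praat files.

```python
from typing import List, Tuple

# Hypothetical in-memory representation of one annotated interval:
# (start_sec, end_sec, label). An empty label is treated as silence.
Interval = Tuple[float, float, str]


def merge_close_utterances(intervals: List[Interval], max_gap_sec: float) -> List[Interval]:
    """Merge utterances whose separating silence is shorter than max_gap_sec."""
    merged: List[Interval] = []
    for start, end, label in sorted(intervals):
        if not label.strip():
            continue  # skip silence intervals; only utterances are merged
        if merged and start - merged[-1][1] < max_gap_sec:
            # Gap to the previous utterance is short: extend the previous one.
            prev_start, prev_end, prev_label = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), f"{prev_label} {label}")
        else:
            merged.append((start, end, label))
    return merged


utterances = [(0.0, 1.2, "a"), (1.2, 1.5, ""), (1.5, 2.0, "b"), (4.0, 5.0, "c")]
result = merge_close_utterances(utterances, max_gap_sec=0.5)
print(result)
```

With a 0.5 second threshold, the first two utterances are 0.3 seconds apart and are merged into one interval, while the third stays separate because it starts 2 seconds after the merged one ends.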