kode-git / nemova

Virtual Assistant based on NeMo Framework for generic task executions with vocal commands.
https://doi.org/10.6084/m9.figshare.20152850.v1
BSD 3-Clause "New" or "Revised" License

Improvements for ASR module #7

Closed kode-git closed 2 years ago

kode-git commented 2 years ago

Improvement Methods

We can use a 3-gram language model with nemo_toolkit[asr] to reduce mismatches between word meanings. It should help avoid incomprehensible words in the transcription of an audio file during the ASR step and improve understanding by Rasa NLU.
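To illustrate why a language model helps, here is a toy sketch of rescoring ASR hypotheses by combining an acoustic score with a language model score. The function name `rescore`, the weights `alpha` and `beta`, and all numbers are illustrative assumptions, not values or APIs from this project or from NeMo.

```python
from typing import List, Tuple

def rescore(
    hypotheses: List[Tuple[str, float, float]],
    alpha: float = 1.0,
    beta: float = 0.5,
) -> List[Tuple[float, str]]:
    """Rank (sentence, acoustic_logprob, lm_logprob) triples by a combined score.

    Combined score = acoustic + alpha * lm + beta * word_count (assumed form).
    Returns (score, sentence) tuples, best first.
    """
    scored = []
    for sentence, acoustic_logprob, lm_logprob in hypotheses:
        word_count = len(sentence.split())
        total = acoustic_logprob + alpha * lm_logprob + beta * word_count
        scored.append((total, sentence))
    # Higher combined score means a more plausible transcription.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored

# Hypothetical hypotheses: the acoustic model slightly prefers the misspelling,
# but the language model strongly prefers the real word.
hyps = [
    ("turn on the lite", -10.0, -9.0),
    ("turn on the light", -10.5, -4.0),
]
ranked = rescore(hyps)
print(ranked[0][1])
```

In this sketch the language model score outweighs the small acoustic preference, so the well-formed sentence ranks first.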

Steps

    import gzip
    import os
    import shutil

    import wget  # third-party package: pip install wget

    # 1. Download the pruned 3-gram language model if it is not cached locally.
    lm_gzip_path = '3-gram.pruned.1e-7.arpa.gz'
    if not os.path.exists(lm_gzip_path):
        print('Downloading pruned 3-gram model.')
        lm_url = 'http://www.openslr.org/resources/11/3-gram.pruned.1e-7.arpa.gz'
        lm_gzip_path = wget.download(lm_url)
        print('Downloaded the 3-gram language model.')
    else:
        print('Pruned .arpa.gz already exists.')

    # 2. Unzip the archive into a plain .arpa file (its vocabulary is uppercase).
    uppercase_lm_path = '3-gram.pruned.1e-7.arpa'
    if not os.path.exists(uppercase_lm_path):
        with gzip.open(lm_gzip_path, 'rb') as f_zipped:
            with open(uppercase_lm_path, 'wb') as f_unzipped:
                shutil.copyfileobj(f_zipped, f_unzipped)
        print('Unzipped the 3-gram language model.')
    else:
        print('Unzipped .arpa already exists.')

    # 3. Write a lowercase copy of the language model file.
    lm_path = 'lowercase_3-gram.pruned.1e-7.arpa'
    if not os.path.exists(lm_path):
        with open(uppercase_lm_path, 'r') as f_upper:
            with open(lm_path, 'w') as f_lower:
                for line in f_upper:
                    f_lower.write(line.lower())
        print('Converted language model file to lowercase.')

The output is an ordered array of tuples in which the first element is the confidence score of the "correct words" and the second is the corrected sentence.
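As a minimal sketch of consuming that output, assume it is a Python list of `(score, sentence)` tuples where a higher score is better. The helper name `best_transcript` and the sample values are hypothetical and are not taken from the NeMo tutorial.

```python
from typing import List, Tuple

def best_transcript(candidates: List[Tuple[float, str]]) -> str:
    """Return the sentence with the highest score, or '' if there are no candidates."""
    if not candidates:
        return ""
    # Do not rely on the list being pre-sorted; pick the maximum score explicitly.
    _, sentence = max(candidates, key=lambda pair: pair[0])
    return sentence

# Hypothetical decoder output: (score, sentence) pairs.
example = [(-15.9, "turn on the lite"), (-12.4, "turn on the light")]
print(best_transcript(example))
```

The selected sentence is what would then be passed on to Rasa NLU.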

Source: https://colab.research.google.com/github/NVIDIA/NeMo/blob/main/tutorials/asr/Offline_ASR.ipynb

kode-git commented 2 years ago

Done