SamagraX-Labs / poc-tracker


Evaluate VOSK to run offline language assessments on mobile #9

Open GautamR-Samagra opened 9 months ago

GautamR-Samagra commented 9 months ago

To Do: https://docs.google.com/document/d/1fTSatDtD1sGI_YPChHw3iO7rsTDboBH8R_OHLC_AsvQ/edit

Phase 1:

Phase 2:

Phase 3:

Phase 4:

karntrehan commented 9 months ago

Nitin: Extensive on-ground testing is required, on different paragraphs. For example, take the word "Motorcycle": what happens if the student says only "motor"?

100 words from each of 30 students would be a good sample set. Also check with a noisy background, and check the case where the student is not reading the paragraph at all. We can also create noisy sample audio by adding noise to the samples we collect.
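One way to create noisy variants of the collected samples is to mix a noise recording into the speech at a chosen signal-to-noise ratio. The sketch below is only an illustration: it works on plain lists of float samples, uses synthetic data in place of real recordings, and the function names `rms` and `mix_noise` are hypothetical, not part of any existing tooling for this issue.

```python
import math
import random

def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def mix_noise(speech, noise, snr_db):
    """Return speech plus noise, with the noise scaled to the requested SNR in dB."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    speech_rms, noise_rms = rms(speech), rms(noise)
    if speech_rms == 0 or noise_rms == 0:
        raise ValueError("speech and noise must be non-silent")
    # SNR(dB) = 20 * log10(speech_rms / scaled_noise_rms)
    target_noise_rms = speech_rms / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [s + gain * n for s, n in zip(speech, noise)]

# Synthetic stand-ins for a real recording and a real background-noise clip.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(16000)]
noisy = mix_noise(speech, noise, snr_db=10)
```

Generating the same sample at several SNR values (for example 20, 10 and 5 dB) would show how quickly recognition degrades as the background gets louder.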

Simple words, such as 2-letter words (Haan, Ha, Aam, and ek, do, teen, etc.), would be more difficult. Let's check on that.

Also check numbers, i.e. whether the model is able to recognize them.

karntrehan commented 7 months ago

Feedback from Rahul: Allow the assessor to mark a word as wrong or right manually.
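A minimal sketch of how such a manual correction could sit on top of the automatic result, assuming the automatic scoring produces one right/wrong mark per expected word. The data shapes and the name `apply_overrides` are hypothetical.

```python
def apply_overrides(auto_marks, overrides):
    """Combine automatic per-word marks with manual corrections.

    auto_marks: list of (word, is_correct) pairs from the automatic comparison.
    overrides:  dict mapping word position -> bool, set manually by the assessor.
    """
    final = []
    for position, (word, is_correct) in enumerate(auto_marks):
        # A manual mark, when present, always wins over the automatic one.
        final.append((word, overrides.get(position, is_correct)))
    return final

auto_marks = [("ek", True), ("do", False), ("teen", True)]
overrides = {1: True}  # the assessor heard "do" read correctly
final = apply_overrides(auto_marks, overrides)
```

Keeping the overrides separate from the automatic marks means the original model output stays available for measuring accuracy later.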

karntrehan commented 6 months ago

https://docs.google.com/spreadsheets/d/12KYbJSaZ6e1HDJvh3DRGq7E0f4tu8e42CP8CmwRxrsI/edit?usp=sharing to be used.

GautamR-Samagra commented 6 months ago

Next steps:

GautamR-Samagra commented 6 months ago

@karntrehan @rohitsamagra
Can we just have the PT put all the audio recordings into one folder on Google Drive? They can use Excel to record the transcript for each file name in that folder. They can keep adding new wav files to the folder as they get them and keep updating the Excel sheet.

@charanpreet-s I have created a Colab notebook for converting all formats to wav and base64 and saving them in that format.
I am also transcribing all the audio files using Conformer (Bhashini) to get better-quality transcripts. The provided 'answers' do not match the audio directly, as the student often repeats words multiple times to get them right.

New sample transcripts look like this - base64_and_transcripts.xlsx
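For reference, the base64 step can be sketched with the Python standard library alone. This is not the notebook's code: the helper names are hypothetical, a short silent WAV built in memory stands in for a real recording, and conversion from other audio formats to wav is not shown.

```python
import base64
import io
import wave

def wav_bytes_to_base64(wav_bytes):
    """Encode raw WAV file bytes as an ASCII base64 string."""
    return base64.b64encode(wav_bytes).decode("ascii")

def base64_to_wav_bytes(text):
    """Decode a base64 string back into raw WAV file bytes."""
    return base64.b64decode(text)

def make_silent_wav(n_frames=1600, rate=16000):
    """Build a mono 16-bit PCM WAV of silence in memory (stand-in for a recording)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()

wav_data = make_silent_wav()
encoded = wav_bytes_to_base64(wav_data)   # text that can be stored in a spreadsheet cell
restored = base64_to_wav_bytes(encoded)   # identical bytes to the original file
```

For a file on disk, `open(path, "rb").read()` would supply the bytes in place of `make_silent_wav()`.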

GautamR-Samagra commented 6 months ago

@charanpreet-s @rohitsamagra I have created a folder here where they can upload the files.

I have already uploaded all the provided files there.

I have created another folder here which has the wav files in the required format for Vosk (this is updated by my code). You can use these files for easy testing if required.
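As a quick sanity check before testing, a file can be inspected with Python's `wave` module. The Vosk Python examples expect mono 16-bit PCM WAV input. The 16 kHz sample rate used below is an assumption (a common choice for speech models), and `check_wav_format` is a hypothetical helper, not part of Vosk.

```python
import io
import wave

def check_wav_format(source, expected_rate=16000):
    """Return a list of problems; an empty list means mono 16-bit PCM at expected_rate.

    source may be a file path or a binary file-like object.
    """
    problems = []
    with wave.open(source, "rb") as wf:
        if wf.getnchannels() != 1:
            problems.append(f"expected mono, got {wf.getnchannels()} channels")
        if wf.getsampwidth() != 2:
            problems.append(f"expected 16-bit samples, got {8 * wf.getsampwidth()}-bit")
        if wf.getcomptype() != "NONE":
            problems.append(f"expected uncompressed PCM, got {wf.getcomptype()}")
        if wf.getframerate() != expected_rate:
            problems.append(f"expected {expected_rate} Hz, got {wf.getframerate()} Hz")
    return problems

def make_wav(channels, rate, n_frames=100):
    """Build a small silent 16-bit WAV in memory for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * channels * n_frames)
    return buf.getvalue()

good = check_wav_format(io.BytesIO(make_wav(channels=1, rate=16000)))
bad = check_wav_format(io.BytesIO(make_wav(channels=2, rate=44100)))
```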

For the current files, I have updated the accuracy in this sheet here.

The Colab to run Vosk on wav files and get accuracy is here.
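The accuracy calculation itself is not shown in this thread. One possible approach, sketched below, is to align the expected words with the recognized words in order (a longest-common-subsequence alignment), so that repeated words in the recording do not count against the student. The function names are hypothetical, the transcript is assumed to be a plain string, and words are compared by exact match, so "motor" does not count as "motorcycle".

```python
def score_reading(expected, spoken):
    """Mark each expected word as read (True) or missed (False) via in-order alignment."""
    exp = expected.lower().split()
    hyp = spoken.lower().split()
    # lcs[i][j] = length of the longest common subsequence of exp[i:] and hyp[j:]
    lcs = [[0] * (len(hyp) + 1) for _ in range(len(exp) + 1)]
    for i in range(len(exp) - 1, -1, -1):
        for j in range(len(hyp) - 1, -1, -1):
            if exp[i] == hyp[j]:
                lcs[i][j] = 1 + lcs[i + 1][j + 1]
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])
    marks, i, j = [], 0, 0
    while i < len(exp):
        if j < len(hyp) and exp[i] == hyp[j]:
            marks.append((exp[i], True))
            i += 1
            j += 1
        elif j < len(hyp) and lcs[i][j + 1] >= lcs[i + 1][j]:
            j += 1  # extra or repeated spoken word: skip it
        else:
            marks.append((exp[i], False))  # expected word was never matched
            i += 1
    return marks

def accuracy(marks):
    """Fraction of expected words that were matched."""
    return sum(ok for _, ok in marks) / len(marks) if marks else 0.0

marks = score_reading("motorcycle ek do teen", "motor ek ek do do teen")
score = accuracy(marks)
```

In this example the repeated "ek" and "do" are ignored, and only the partial word "motor" is marked as missed.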