Documenting how to access Bhashini Speech to text - Githubissues

Samagra-Development / ai-tools

AI Tooling to bootstrap applications fast

Documenting how to access Bhashini Speech to text #282

Closed Gautam-Rajeev closed 10 months ago

Gautam-Rajeev commented 10 months ago

Documentation for using Bhashini models is provided here

You need to do the following:

Create an account on Bhashini ULCA
Sign in and create an API key
Convert your audio file into a base64 string
Send the base64 string to the API and use the output
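
As a rough illustration of the base64 conversion step, here is a minimal sketch using only the Python standard library. The function name `audio_file_to_base64` is hypothetical and is not taken from the Colab; the actual request format expected by the Bhashini API is not shown here and should be taken from the Bhashini documentation.

```python
import base64
from pathlib import Path


def audio_file_to_base64(path: str) -> str:
    """Read an audio file and return its raw bytes as an ASCII base64 string.

    Hypothetical helper for illustration; not part of any Bhashini SDK.
    """
    raw = Path(path).read_bytes()
    # b64encode returns bytes; decode so the result can be placed in a JSON body.
    return base64.b64encode(raw).decode("ascii")
```

The returned string is what would be sent as the audio content in the API request.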

I have also created a Colab notebook here with an example of the same. You need to provide your own API key in the notebook.

Sign up here
Fill out the registration form.
Complete email authentication to enable login functionality.
Log in using your authenticated email.
Open the “My Profile” section.

Create an API key using the “Generate” button under the “My Profile” section. Ensure that your app name uses lowercase words and underscores.
Use the API provided in my Colab notebook to convert a WAV file to base64 and run it.
In the notebook, I have combined the pipeline APIs, which are used to get the authorization and the 'model to hit', with the ASR model so that ASR can be run quickly. I have also added batching so that it can handle larger WAV files.
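
The batching idea can be sketched roughly as follows. This is not the code from the Colab: it is one possible approach that splits a WAV file into fixed-length chunks with the standard `wave` module and base64-encodes each chunk separately. The function name `split_wav_to_base64` and the 20-second default chunk length are assumptions made for illustration.

```python
import base64
import io
import wave


def split_wav_to_base64(path: str, chunk_seconds: float = 20.0) -> list[str]:
    """Split a WAV file into chunks and return each chunk as a base64 string.

    Hypothetical helper; the chunk length is an assumed value, not a
    documented Bhashini limit.
    """
    chunks = []
    with wave.open(path, "rb") as src:
        frames_per_chunk = max(1, int(src.getframerate() * chunk_seconds))
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            # Write each chunk as a complete WAV (with header) into memory.
            buf = io.BytesIO()
            with wave.open(buf, "wb") as dst:
                dst.setnchannels(src.getnchannels())
                dst.setsampwidth(src.getsampwidth())
                dst.setframerate(src.getframerate())
                dst.writeframes(frames)
            chunks.append(base64.b64encode(buf.getvalue()).decode("ascii"))
    return chunks
```

Each element of the returned list could then be sent to the ASR API in turn, and the transcripts joined in order.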