Voice-to-Text is a Python application that uses AI4BHARAT models, accessed through API calls, to convert speech to text. It transcribes spoken words from audio recordings into written text, which is useful for transcription services, voice assistants, and similar applications.
To get started with Voice-to-Text, follow these installation steps:
Clone the repository to your local machine:
git clone https://github.com/your-username/voice-to-text.git
cd voice-to-text
Install the required Python packages using pip:
pip install -r requirements.txt
To run the Voice-to-Text application and make a POST request via the terminal, follow these steps:
Open your terminal or command prompt.
Navigate to the project directory:
cd path/to/voice-to-text
Start the Flask application by running the Python script:
python python_script.py
The app will start running with the host set to '127.0.0.1' and port set to 5000.
Open another terminal window or tab.
Use curl to make a POST request to your Flask API endpoint with the specified parameters, including the audio file:
curl -X POST -F "service_id=your_service_id" -F "src_lang_code=your_language_code" -F "audio_content=@path/to/your/audio_file.wav" http://localhost:5000/transcribe
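Replace the following placeholders:
your_service_id: your service ID.
your_language_code: the language code of the source audio.
path/to/your/audio_file.wav: the path to the audio file you want to transcribe.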
Press Enter to make the POST request.
This will send a POST request to your Flask API via the terminal, including the specified parameters and audio file. Your Flask server should process the request and return the transcript.
If you encounter any issues, please ensure that your Flask server is running on http://localhost:5000, and that you've followed the steps correctly.
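The curl command above can also be reproduced from Python. The following is a minimal sketch using only the standard library; the helper name build_transcribe_request is hypothetical and not part of this project, and it assumes the endpoint accepts the same three multipart form fields that the curl call sends.

```python
import uuid
from urllib import request


def build_transcribe_request(url, service_id, src_lang_code, audio_bytes,
                             filename="audio_file.wav"):
    """Build (but do not send) a multipart/form-data POST mirroring the curl call.

    Hypothetical helper for illustration; the field names come from the curl example.
    """
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text fields, equivalent to -F "service_id=..." and -F "src_lang_code=...".
    for name, value in (("service_id", service_id), ("src_lang_code", src_lang_code)):
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode("utf-8")
        )
    # File field, equivalent to -F "audio_content=@path/to/your/audio_file.wav".
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="audio_content"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        + audio_bytes
        + b"\r\n"
    )
    # Closing boundary marks the end of the form.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return request.Request(url, data=b"".join(parts), headers=headers, method="POST")
```

To send it, read the audio file in binary mode, pass its bytes to the helper together with http://localhost:5000/transcribe, and hand the returned object to urllib.request.urlopen. The response format depends on the Flask script, so inspect the raw body before parsing it.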
Before using Voice-to-Text, you need to configure the API credentials. Follow these steps:
Sign up or log in to your AI4BHARAT account.
Generate API credentials, such as an API key or token, from your AI4BHARAT dashboard.
Open the config.json file located in the project directory.
Replace the placeholder values in config.json with your API credentials:
{
"api_key": "your_api_key_here",
"api_url": "https://api.ai4bharat.org/asr/v0.2/recognize"
}
Replace "your_api_key_here" with your actual API key/token.
Save the config.json file.
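As a quick check that config.json was filled in correctly, the file can be loaded and validated before any request is made. This is a minimal sketch, not the project's own loading code; the function name load_config and the validation rules are assumptions, and only the two keys shown above are checked.

```python
import json
from pathlib import Path

# Keys shown in the sample config.json above.
REQUIRED_KEYS = ("api_key", "api_url")
# Placeholder value from the sample config; it must be replaced before use.
PLACEHOLDER = "your_api_key_here"


def load_config(path="config.json"):
    """Read the config file and fail early if credentials are missing.

    Hypothetical helper for illustration only.
    """
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"config is missing: {', '.join(missing)}")
    if config["api_key"] == PLACEHOLDER:
        raise ValueError("api_key still holds the placeholder value")
    return config
```

Calling load_config() from the project directory returns the parsed dictionary, or raises ValueError with a message naming the problem.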
Download the Samaaja app
bench get-app https://github.com/fossunited/Samaaja
Install the app on your site
bench --site <your-site-name-here> install-app samaaja
To locate your Samaaja app, follow these steps:
Navigate to the frappe folder and open frappe-bench.
Navigate to apps/samaaja.
Open this folder in a code editor such as VS Code.
To run the Voice-to-Text application inside Samaaja and make a POST request via the terminal, follow these steps:
Inside the "Samaaja" folder, navigate to the "samaaja/api" folder.
Open the "voice_to_text" folder in your code editor.
Locate the "frappe_script.py" file inside the "voice_to_text" folder.
Copy the "frappe_script.py" file.
Go back to the "samaaja/api" folder and paste the copied "frappe_script.py" file there.
Open the "frappe_script.py" file that you've just pasted into the "api" folder.
Inside the "frappe_script.py" file, locate the following lines:
API_KEY = "Your_API_KEY_here"
INFERENCE_URL = "Your_INFERENCE_URL_here"
Replace "Your_API_KEY_here" with your actual API key.
Replace "Your_INFERENCE_URL_here" with your actual inference URL.
Save the changes to the "frappe_script.py" file.
Open your terminal or command prompt.
Navigate to the project directory:
cd path/to/frappe-bench
Run frappe-bench:
bench start
Open another terminal window or tab.
Use curl to make a POST request to your Frappe API endpoint with the specified parameters, including the audio file URL:
curl -X POST -F "service_id=your_service_id" -F "src_lang_code=your_language_code" -F "audio_url=http://example.com/path/to/your/audio_file.wav" http://127.0.0.1:8000/api/method/samaaja.api.frappe_script.transcribe_audio
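Replace the following placeholders:
your_service_id: your service ID.
your_language_code: the language code of the source audio.
http://example.com/path/to/your/audio_file.wav: the URL of the audio file you want to transcribe.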
Press Enter to make the POST request.
This will send a POST request to your Frappe API via the terminal, including the specified parameters and the audio file URL. Your Frappe server should process the request and return the transcript.
If you encounter any issues, please ensure that your Frappe server is running on http://127.0.0.1:8000, and that you've followed the steps correctly.
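The same Frappe request can be prepared from Python. This is a minimal sketch using only the standard library; the helper name build_frappe_request is hypothetical. Note that it sends the fields URL-encoded rather than as the multipart form that curl -F produces, on the assumption that the transcribe_audio method reads them as ordinary form parameters. If that assumption does not hold for your script, use the curl command instead.

```python
from urllib import parse, request

# Endpoint taken from the curl example above.
FRAPPE_METHOD_URL = (
    "http://127.0.0.1:8000/api/method/samaaja.api.frappe_script.transcribe_audio"
)


def build_frappe_request(service_id, src_lang_code, audio_url, url=FRAPPE_METHOD_URL):
    """Build (but do not send) a POST carrying the three form fields.

    Hypothetical helper; assumes the endpoint accepts URL-encoded form data.
    """
    fields = {
        "service_id": service_id,
        "src_lang_code": src_lang_code,
        "audio_url": audio_url,
    }
    # urlencode percent-escapes the audio URL so it survives as a single field.
    body = parse.urlencode(fields).encode("ascii")
    return request.Request(url, data=body, method="POST")
```

Pass the returned object to urllib.request.urlopen while bench start is running in the other terminal to send the request.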
We welcome contributions to improve Voice-to-Text. To contribute, follow these steps:
Fork the repository.
Create a new branch for your feature or bug fix:
git checkout -b feature/your-feature
Make your changes, test thoroughly, and ensure proper documentation.
Commit your changes with clear and concise messages.
Push your changes to your fork:
git push origin feature/your-feature
Create a pull request to the main repository's master branch, describing your changes and their purpose.
This project is licensed under the GNU General Public License. See the LICENSE file for details.