Open ChakshuGautam opened 2 weeks ago
https://github.com/suyashgautam to guide Sarvesh.
cc @ChakshuGautam cc @GautamR-Samagra
Format for calling the Autotune API to force-align audio into segments of a specific length
First, create a workflow in Autotune and extract the workflow ID from the response:
curl --location 'https://autotune.dev.bhasai.samagra.io/v2/workflow/create/' \
--header 'User-Id: 0297f861-cc97-4b27-b464-ed826dbda7eb' \
--header 'role: user' \
--header 'Content-Type: application/json' \
--data '{
"config":{
"config_name": "QnA",
"system_prompt": "You are a helpful data generation assistant working as a teacher. You are an expert in this field. Don'\''t Hallucinate.",
"user_prompt_template": "{{workflow.user_prompt}}",
"temperature":1,
"schema_example": {
"question": "4 + 5",
"answer":"9"
}
},
"workflow": {
"workflow_name": "Data Analysis Workflow",
"total_examples": 100,
"split": [
80,
10,
10
],
"user_prompt":"Generate questions to test addition and subtraction for grade 1 students. Your task is to generate 5 addition questions and 5 subtraction questions with single-digit numbers.",
"llm_model": "gpt-3.5-turbo-0125",
"tags": [
"data analysis",
"machine learning"
]
}
}'
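The response format of the create endpoint is not shown in this thread, so the following is only a minimal sketch of pulling the workflow ID out of the response body. The field name `workflow_id` and the sample response are assumptions, not the documented API; adjust `key` to whatever the endpoint actually returns. The UUID check reflects the shape of the ID used in the force-align call further down.

```python
import json
import uuid


def extract_workflow_id(response_body: str, key: str = "workflow_id") -> str:
    """Return the workflow ID found under `key` in a JSON response body.

    `key` defaults to "workflow_id", which is an assumed field name.
    """
    data = json.loads(response_body)
    value = data[key]
    # uuid.UUID raises ValueError if the value is not a well-formed UUID.
    return str(uuid.UUID(value))


# Hypothetical response body, for illustration only.
sample_response = '{"workflow_id": "b23fe059-e941-4045-ad6c-bf9330e88455"}'
workflow_id = extract_workflow_id(sample_response)
print(workflow_id)
```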
To call the force-alignment endpoint, the dataset to be aligned must be hosted on Hugging Face and laid out as follows:
- audio_1.wav
- audio_2.wav
- .
- .
- transcription.txt
transcription.txt must follow this format:
the audio name followed by its transcript, separated by a space
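As a rough illustration of that format, here is a small sketch that reads and writes such a file. It assumes one entry per line, that the audio name contains no spaces, and that the name includes the `.wav` extension; none of these details are confirmed above, and the helper names are made up for this example.

```python
from typing import Dict


def parse_transcription(text: str) -> Dict[str, str]:
    """Map each audio name to its transcript (assumes one entry per line)."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first space only: the rest of the line is the transcript.
        name, _, transcript = line.partition(" ")
        entries[name] = transcript.strip()
    return entries


def format_transcription(entries: Dict[str, str]) -> str:
    """Render entries back into the 'name transcript' line format."""
    return "".join(f"{name} {transcript}\n" for name, transcript in entries.items())


# Made-up sample content, for illustration only.
sample = "audio_1.wav hello world\naudio_2.wav good morning\n"
entries = parse_transcription(sample)
print(entries)
```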
After creating the workflow, you can force-align using:
curl -X POST -H "Content-Type: application/json" -d '{"dataset":"xorsuyash/asr_datasetp2","workflow_id":"b23fe059-e941-4045-ad6c-bf9330e88455","save_path":"SamagraDataGov/asr_dataset_test_p9","transcript_available":"true","time_duration":5.0}' https://autotune.dev.bhasai.samagra.io/v1/workflow/force-align
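The same request can be put together from Python with the standard library. This is a sketch that mirrors the curl call above field by field; the helper name is made up, and sending `"false"` as a string when no transcript is available is an assumption based on the curl example sending `"true"` as a string. The request is only built here, not sent.

```python
import json
import urllib.request

FORCE_ALIGN_URL = "https://autotune.dev.bhasai.samagra.io/v1/workflow/force-align"


def build_force_align_request(dataset, workflow_id, save_path, time_duration,
                              transcript_available=True):
    """Build (but do not send) the POST request for the force-align endpoint."""
    payload = {
        "dataset": dataset,
        "workflow_id": workflow_id,
        "save_path": save_path,
        # The curl example passes this flag as a string, so mirror that here.
        # The "false" value is an assumption.
        "transcript_available": "true" if transcript_available else "false",
        "time_duration": time_duration,
    }
    return urllib.request.Request(
        FORCE_ALIGN_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_force_align_request(
    dataset="xorsuyash/asr_datasetp2",
    workflow_id="b23fe059-e941-4045-ad6c-bf9330e88455",
    save_path="SamagraDataGov/asr_dataset_test_p9",
    time_duration=5.0,
)
# To actually send it: urllib.request.urlopen(request)
```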
Here
TODO
wav file to Hugging Face Dataset
Technology
Specifications
<sessionID>.<length>.<original/modified>.<wav/txt>
sessionID, startTime, endTime
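To make the file naming pattern `<sessionID>.<length>.<original/modified>.<wav/txt>` concrete, here is a small parsing sketch. It assumes the length field contains no dot (for example `5` rather than `5.0`) and keeps it as a string, since its unit and type are not specified above. The class and function names are made up for this example.

```python
from typing import NamedTuple


class SegmentFileName(NamedTuple):
    session_id: str
    length: str      # kept as a string: unit and type are not specified
    variant: str     # "original" or "modified"
    extension: str   # "wav" or "txt"


def parse_segment_file_name(file_name: str) -> SegmentFileName:
    """Split a name of the form <sessionID>.<length>.<original/modified>.<wav/txt>."""
    # Split from the right so a session ID that itself contains dots stays intact.
    parts = file_name.rsplit(".", 3)
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields: {file_name!r}")
    session_id, length, variant, extension = parts
    if variant not in ("original", "modified"):
        raise ValueError(f"unexpected variant: {variant!r}")
    if extension not in ("wav", "txt"):
        raise ValueError(f"unexpected extension: {extension!r}")
    return SegmentFileName(session_id, length, variant, extension)


# Made-up file name, for illustration only.
parsed = parse_segment_file_name("session42.5.original.wav")
print(parsed)
```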