It adds some enhanced features on top of SenseVoice:
Adds logprob to the recognition results to represent the confidence of the recognition, for use by upper-level applications.

First, clone this repository to your local machine:
git clone https://github.com/0x5446/api4sensevoice.git
cd api4sensevoice
Then, install the required dependencies using the following command:
conda create -n api4sensevoice python=3.10
conda activate api4sensevoice
conda install -c conda-forge ffmpeg
pip install -r requirements.txt
if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Run the FastAPI app with a specified port.")
    parser.add_argument('--port', type=int, default=7000, help='Port number to run the FastAPI app on.')
    parser.add_argument('--certfile', type=str, default='path_to_your_certfile', help='SSL certificate file')
    parser.add_argument('--keyfile', type=str, default='path_to_your_keyfile', help='SSL key file')
    args = parser.parse_args()
    uvicorn.run(app, host="0.0.0.0", port=args.port, ssl_certfile=args.certfile, ssl_keyfile=args.keyfile)
The above code is from the end of server.py. You can modify it to define the port, certfile, and keyfile, then directly run python server.py to start the API service.
You can also set these through command-line arguments, for example:
python server.py --port 8888 --certfile path_to_your_certfile --keyfile path_to_your_key
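The startup code passes the certificate and key paths straight to uvicorn, so the placeholder defaults must be replaced before the server can start. As a hedged variation (this is not how server.py is written), the sketch below makes both options default to None and only passes the SSL settings when both are supplied. The helper names parse_server_args and uvicorn_kwargs are hypothetical.

```python
import argparse


def parse_server_args(argv=None):
    """Parse the same options as server.py, but with no placeholder SSL paths."""
    parser = argparse.ArgumentParser(description="Run the FastAPI app with a specified port.")
    parser.add_argument("--port", type=int, default=7000, help="Port number to run the FastAPI app on.")
    parser.add_argument("--certfile", type=str, default=None, help="SSL certificate file")
    parser.add_argument("--keyfile", type=str, default=None, help="SSL key file")
    return parser.parse_args(argv)


def uvicorn_kwargs(args):
    """Build keyword arguments for uvicorn.run; add SSL only if both files are given."""
    kwargs = {"host": "0.0.0.0", "port": args.port}
    if args.certfile and args.keyfile:
        kwargs["ssl_certfile"] = args.certfile
        kwargs["ssl_keyfile"] = args.keyfile
    return kwargs


# Intended use at the end of the server script (app is the FastAPI instance):
#     uvicorn.run(app, **uvicorn_kwargs(parse_server_args()))
```

With this variation, running the script without --certfile and --keyfile would serve plain HTTP, which may be convenient for local testing.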
Path: /transcribe
Method: POST
Summary: Transcribe audio
Request Body:
multipart/form-data
file (required): The audio file to transcribe
Response:
application/json
code (integer): Status code
info (string): Meta information
data (object): Response object
Request Example:
curl -X 'POST' \
  'http://yourapiaddress/transcribe' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@path_to_your_audio_file'
Response Example:
{
  "code": 0,
  "msg": "Success",
  "data": {
    // Transcription result
  }
}
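The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names (build_multipart, transcribe, extract_data) are hypothetical, and it assumes that code 0 means success, as in the response example. Because the field list names the message field info while the example shows msg, the sketch accepts either.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field, filename, payload):
    """Encode one file as a multipart/form-data body; returns (content_type, body)."""
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", head + payload + tail


def transcribe(api_url, audio_path):
    """POST the audio file to the /transcribe endpoint and return the decoded JSON."""
    path = Path(audio_path)
    content_type, body = build_multipart("file", path.name, path.read_bytes())
    request = urllib.request.Request(
        api_url,
        data=body,
        method="POST",
        headers={"Content-Type": content_type, "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_data(result):
    """Return the data field, or raise if the status code is not 0 (assumed success)."""
    if result.get("code") != 0:
        message = result.get("msg") or result.get("info")
        raise RuntimeError(f"Transcription failed: code={result.get('code')}, message={message}")
    return result.get("data")
```

A call would then look like extract_data(transcribe("http://yourapiaddress/transcribe", "path_to_your_audio_file")).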
if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Run the FastAPI app with a specified port.")
    parser.add_argument('--port', type=int, default=27000, help='Port number to run the FastAPI app on.')
    parser.add_argument('--certfile', type=str, default='path_to_your_certfile', help='SSL certificate file')
    parser.add_argument('--keyfile', type=str, default='path_to_your_keyfile', help='SSL key file')
    args = parser.parse_args()
    uvicorn.run(app, host="0.0.0.0", port=args.port, ssl_certfile=args.certfile, ssl_keyfile=args.keyfile)
The above code is from the end of server_wss.py. You can modify it to define the port, certfile, and keyfile, then directly run python server_wss.py to start the WebSocket service.
You can also set these through command-line arguments, for example:
python server_wss.py --port 8888 --certfile path_to_your_certfile --keyfile path_to_your_key
If you want to enable speaker verification:
reg_spks_files = [
    "speaker/speaker1_a_cn_16k.wav"
]
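Before registering a speaker file, it can help to check its audio format. The sketch below uses the standard wave module. The expected format is an assumption: the 16 kHz rate is inferred from the _16k suffix of the example file name, and mono 16-bit samples are a guess rather than a documented requirement. The function name check_speaker_wav is hypothetical.

```python
import wave


def check_speaker_wav(source, expected_rate=16000):
    """Return a list of format problems for a WAV file (empty list means it looks fine).

    source is a path string or a binary file object.
    The expected format (16 kHz, mono, 16-bit) is an assumption, not a documented rule.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getframerate() != expected_rate:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected {expected_rate} Hz")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels, expected mono")
        if wav.getsampwidth() != 2:
            problems.append(f"sample width is {wav.getsampwidth()} bytes, expected 2 (16-bit)")
    return problems
```

Looping over reg_spks_files and printing any problems gives a quick sanity check before starting the server.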
code (integer): Status code
info (string): Meta information
data (object): Response object

client_wss.html: change wsUrl to your own WebSocket server address to test
ws = new WebSocket(`wss://your_wss_server_address/ws/transcribe${sv ? '?sv=1' : ''}`);
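For clients written in Python, the same address can be assembled as follows. This is a minimal sketch: build_ws_url and parse_message are hypothetical helpers, and parse_message assumes each server message is a JSON text with the code, info, and data fields listed above.

```python
import json
from urllib.parse import urlencode


def build_ws_url(server_address, sv=False):
    """Build the transcription WebSocket URL; sv=True appends ?sv=1 for speaker verification."""
    url = f"wss://{server_address}/ws/transcribe"
    if sv:
        url += "?" + urlencode({"sv": 1})
    return url


def parse_message(text):
    """Decode one server message into (code, info, data); assumes a JSON object."""
    message = json.loads(text)
    return message.get("code"), message.get("info"), message.get("data")
```

The resulting URL can be passed to any WebSocket client library.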
All forms of contributions are welcome, including but not limited to:
This project is licensed under the MIT License. See the LICENSE file for details.