PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
https://paddlespeech.readthedocs.io
Apache License 2.0

Format not recognised #3644

Open Phalanx-01 opened 10 months ago

Phalanx-01 commented 10 months ago

While trying to upload a WAV file via a web request, I'm getting the following error:

{"error":"Error opening '/tmp/Goodevening.wav': Format not recognised."}

My api_service.py looks like this:

    from flask import Flask, request, jsonify
    import os
    import logging
    from paddlespeech.cli.st.infer import STExecutor
    import tempfile
    import shutil

    app = Flask(__name__)
    logging.basicConfig(level=logging.INFO)

    # Initialize the Speech Translation model
    st_executor = STExecutor()

    @app.route('/translate', methods=['POST'])
    def translate_audio():
        if 'file' not in request.files:
            return jsonify({'error': 'No file part'}), 400

        file = request.files['file']

        if file.filename == '':
            return jsonify({'error': 'No selected file'}), 400

        if file and allowed_file(file.filename):
            # Save the file to a temporary location
            temp_dir = tempfile.mkdtemp()
            filename = os.path.join(temp_dir, file.filename)
            file.save(filename)

            try:
                # Translate the audio
                result = st_executor(audio_file=filename)
                return jsonify({'translation': result})
            except Exception as e:
                logging.error(f"Error during processing: {e}")
                return jsonify({'error': str(e)}), 500
            finally:
                # Clean up temporary files
                shutil.rmtree(temp_dir)
        else:
            return jsonify({'error': 'Invalid file format'}), 400

    def allowed_file(filename):
        return '.' in filename and filename.rsplit('.', 1)[1].lower() in {'wav'}

    if __name__ == '__main__':
        app.run(host='0.0.0.0', port=5000)

Any ideas on how to overcome this issue?

zxcd commented 9 months ago

Check the format and sample rate of this audio.
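
One way to run that check is sketched below, using only the Python standard library. The wording "Format not recognised" looks like the message libsndfile produces when the bytes on disk are not a container it can parse, which can happen when a file merely carries a `.wav` extension (for example a renamed MP3, or an empty upload). The helper name `inspect_wav` is hypothetical, and the 16000 Hz default is an assumption about what the speech translation model expects, so confirm it against the PaddleSpeech documentation.

```python
import wave


def inspect_wav(path, expected_rate=16000):
    """Describe a WAV file, or report why it could not be read.

    expected_rate=16000 is an assumption about the model's input,
    not a value taken from this thread.
    """
    with open(path, "rb") as f:
        header = f.read(12)

    # A RIFF/WAVE file starts with b"RIFF", a 4-byte size, then b"WAVE".
    if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
        return {"ok": False, "reason": "not a RIFF/WAVE file", "magic": header[:4]}

    try:
        with wave.open(path, "rb") as w:
            info = {
                "ok": True,
                "channels": w.getnchannels(),
                "sample_rate": w.getframerate(),
                "sample_width_bytes": w.getsampwidth(),
                "frames": w.getnframes(),
            }
    except (wave.Error, EOFError) as e:
        # The stdlib reader handles uncompressed PCM only; other encodings
        # or truncated files end up here.
        return {"ok": False, "reason": f"unreadable WAV data: {e}"}

    info["rate_matches"] = info["sample_rate"] == expected_rate
    return info
```

Calling `inspect_wav(filename)` right after `file.save(filename)` in the handler above, and logging the result, would show whether the upload arrived as a real WAV file and at which sample rate, before the path is passed to `st_executor`.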