speechmatics / speechmatics-js-sdk

JavaScript and TypeScript SDK for Speechmatics
MIT License

Request taking far too long for a small file #31

Closed Sawyeraltman closed 8 months ago

Sawyeraltman commented 9 months ago

Describe the bug

The request is taking ~10s for a small audio file.

To Reproduce

Steps to reproduce the behavior:

I am running a Firebase function in the emulator. Note that I am writing a wav in base64 to file, but when I run a locally stored one (the same file, and also a totally separate sample wav file), I get the same deal. Yet the same files through the web client finish in 1s.

// Imports inferred from the identifiers used below (not shown in the original post)
const fs = require("fs");
const path = require("path");
const { onCall } = require("firebase-functions/v2/https");
const { Speechmatics } = require("speechmatics");

const sttSpeechmatics = onCall(async (request) => {
    // request.data carries the WAV as a base64 string; strip any line breaks first
    const audioUrl = request.data;
    const cleanedBase64String = audioUrl.replace(/(\r\n|\n|\r)/gm, "");

    const binaryData = Buffer.from(cleanedBase64String, 'base64');

    // Define a path for the output file
    // For example, saving in a 'temp' directory within your current working directory
    const tempDir = path.join(__dirname, 'temp');
    const outputPath = path.join(tempDir, 'audioFile.wav');

    // Ensure the 'temp' directory exists
    if (!fs.existsSync(tempDir)) {
        fs.mkdirSync(tempDir, { recursive: true });
    }

    // Write the binary data to a file
    fs.writeFileSync(outputPath, binaryData);

    // Read the file back and wrap it in a Blob for the SDK
    const inputFile = new Blob([
        fs.readFileSync(outputPath),
    ]);

    const config = {
        language: "en",
        // enable_entities: true,
        operating_point: "standard",
        additional_vocab: [
            {
                content: "w/4",
                sounds_like: ["w over 4"],
            },
            {
                content: "-5t",
                sounds_like: ["negative 5 t"],
            },
            {
                content: "<",
                sounds_like: ["less than"],
            },
            {
                content: ">",
                sounds_like: ["greater than"],
            },
            {
                content: "|x + 4|",
                sounds_like: ["the absolute value of x + 4"],
            },
            {
                content: "=",
                sounds_like: ["equals"],
            },
            {
                content: "y=mx+b",
                sounds_like: ["y equals m x plus b"],
            },
        ],
    };

    // API_KEY is defined elsewhere (not shown in the original post)
    const sm = new Speechmatics(API_KEY);
    try {
        const transcriptText = await sm.batch.transcribe({
            input: inputFile,
            transcription_config: config,
            format: "text",
        });

        console.log({ transcriptText });
        console.log('Transcript is done');
        return transcriptText;
    } catch (error) {
        console.error(error);
    }
});

Expected behavior

The request should come back as fast as it does in the web client.

nickgerig commented 9 months ago

Hi @Sawyeraltman

It would be good to reduce this to the simplest possible test case: no base64 write to disk and no additional_vocab, which does add an incremental overhead to the transcription time.
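A stripped-down test case along those lines might look like the sketch below. It reads a local WAV directly, drops additional_vocab, and times only the transcribe call. The helper names `timed` and `transcribeMinimal`, and the file path passed in, are hypothetical; `sm` is assumed to be a client created as in the original post (`new Speechmatics(API_KEY)`), and the `sm.batch.transcribe` call shape is copied from that post. `Blob` as a global assumes Node 18 or later.

```javascript
const fs = require("fs");
const { performance } = require("perf_hooks");

// Minimal config: no additional_vocab, so its overhead is excluded from the timing.
const minimalConfig = { language: "en", operating_point: "standard" };

// Runs an async function, logs the elapsed time, and returns both the
// result and the elapsed milliseconds.
async function timed(label, fn) {
  const start = performance.now();
  const result = await fn();
  const ms = performance.now() - start;
  console.log(`${label}: ${ms.toFixed(0)} ms`);
  return { result, ms };
}

// `sm` is assumed to be a Speechmatics client built as in the original post.
// `filePath` is a hypothetical path to a local WAV file.
async function transcribeMinimal(sm, filePath) {
  // Read the file straight from disk: no base64 decode, no temp-file write.
  const input = new Blob([fs.readFileSync(filePath)]);
  return timed("batch.transcribe", () =>
    sm.batch.transcribe({
      input,
      transcription_config: minimalConfig,
      format: "text",
    })
  );
}
```

Running this once with the minimal config and once with the full additional_vocab list would show how much of the ~10s is attributable to the vocab rather than to the request itself.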

The 1s you see in the web client does not reflect the transcription time for a 10s file. It might reflect the file upload time, but the job itself then takes a further x seconds to process.
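To make that split visible, the phases can be timed separately. The sketch below is generic and does not use any specific SDK method: `submit`, `isDone`, and `fetchResult` are hypothetical callbacks you would wire to whatever submit, status, and result calls you use, and `pollMs` is an assumed polling interval.

```javascript
const { performance } = require("perf_hooks");

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Times the three phases of a batch job separately.
// submit, isDone and fetchResult are hypothetical caller-supplied callbacks:
//   submit()          -> resolves to a job id once the upload is accepted
//   isDone(jobId)     -> resolves to true when processing has finished
//   fetchResult(jobId) -> resolves to the transcript
async function measurePhases({ submit, isDone, fetchResult, pollMs = 500 }) {
  const t0 = performance.now();
  const jobId = await submit(); // upload and job creation
  const t1 = performance.now();

  // Wait for processing; the measured time includes polling granularity.
  while (!(await isDone(jobId))) {
    await sleep(pollMs);
  }
  const t2 = performance.now();

  const transcript = await fetchResult(jobId); // download the result
  const t3 = performance.now();

  return {
    transcript,
    uploadMs: t1 - t0,
    processingMs: t2 - t1,
    fetchMs: t3 - t2,
    totalMs: t3 - t0,
  };
}
```

If `uploadMs` comes out around 1s and `processingMs` accounts for most of the rest, that would match the explanation above; note that `processingMs` can overstate the real processing time by up to one polling interval.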

If you can provide a test file I can give you some additional metrics.

As an aside, the web client also uses the JS SDK under the hood, albeit the browser client.