Smart-Whisper is a native Node.js addon for efficient, streamlined interaction with whisper.cpp. It provides automatic model offloading and reloading, plus a model manager.
The standard installation supports Windows, macOS, and Linux out of the box. On macOS it also enables CPU acceleration automatically, along with GPU acceleration (Metal) on Apple Silicon.
npm i smart-whisper
Support Matrix:
OS and Arch | CPU | GPU |
---|---|---|
macOS Apple Silicon | ✅ (Acceleration) | ✅ (Metal) |
macOS Intel | ✅ (Acceleration) | BYOL |
Linux / Windows | ✅ | BYOL |
- ✅: Out of the box support with standard installation.
- BYOL: Bring Your Own Library, see Acceleration for more information.
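The support matrix can also be expressed as a small lookup, which may be handy when a script has to decide whether a custom library build is needed. This is only a sketch that mirrors the table above; the function name `gpuSupport` and the `Support` type are hypothetical and are not part of smart-whisper.

```typescript
// Hypothetical helper mirroring the support matrix; not part of smart-whisper.
type Support = "out-of-the-box" | "BYOL";

// `platform` and `arch` use the same values as Node's process.platform
// and process.arch (e.g. "darwin" + "arm64" for macOS on Apple Silicon).
function gpuSupport(platform: string, arch: string): Support {
  const isAppleSilicon = platform === "darwin" && arch === "arm64";
  return isAppleSilicon ? "out-of-the-box" : "BYOL";
}
```

In a Node.js script, this could be called as `gpuSupport(process.platform, process.arch)`.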
Because acceleration methods differ widely across devices, you need to compile libwhisper.a or libwhisper.so from whisper.cpp yourself, and then set the BYOL (Bring Your Own Library) environment variable to the path of the compiled library.
BYOL='/path/to/libwhisper.a' npm i smart-whisper
You may need to link additional libraries, for example:
BYOL='/path/to/libwhisper.a -lopenblas' npm i smart-whisper
On Linux and Windows without a GPU, OpenBLAS is likely the best acceleration method. After installing OpenBLAS, you can compile libwhisper.a with the following commands:
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
WHISPER_OPENBLAS=1 make -j
Check out the whisper.cpp repository for more information.
The documentation is available at https://jacoblincool.github.io/smart-whisper/.
See examples for more usage samples.
import { Whisper } from "smart-whisper";
import { decode } from "node-wav";
import fs from "node:fs";
const model = process.argv[2];
const wav = process.argv[3];
const whisper = new Whisper(model, { gpu: true });
const pcm = read_wav(wav);
const task = await whisper.transcribe(pcm, { language: "auto" });
console.log(await task.result);
await whisper.free();
console.log("Manually freed");
function read_wav(file: string): Float32Array {
    const { sampleRate, channelData } = decode(fs.readFileSync(file));
    if (sampleRate !== 16000) {
        throw new Error(`Invalid sample rate: ${sampleRate}`);
    }
    if (channelData.length !== 1) {
        throw new Error(`Invalid channel count: ${channelData.length}`);
    }
    return channelData[0];
}
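The read_wav helper above rejects anything that is not 16 kHz mono. If your recordings come in other formats, one option is to convert them before transcribing. The following is a minimal sketch, assuming the `channelData` and `sampleRate` values returned by `decode`; the helper name `toMono16k` is hypothetical and is not part of smart-whisper or node-wav. It uses naive linear interpolation with no low-pass filter, so downsampling may introduce aliasing; a dedicated resampler is a better choice when quality matters.

```typescript
// Hypothetical helper, not part of smart-whisper: downmixes to mono and
// resamples to the target rate using simple linear interpolation.
function toMono16k(
  channelData: Float32Array[],
  sampleRate: number,
  targetRate = 16000,
): Float32Array {
  if (channelData.length === 0) {
    throw new Error("No audio channels");
  }
  const length = channelData[0].length;

  // Downmix by averaging all channels sample by sample.
  const mono = new Float32Array(length);
  for (const channel of channelData) {
    for (let i = 0; i < length; i++) {
      mono[i] += channel[i] / channelData.length;
    }
  }
  if (sampleRate === targetRate) {
    return mono;
  }

  // Resample: each output sample is interpolated between the two
  // nearest input samples.
  const outLength = Math.round((length * targetRate) / sampleRate);
  const out = new Float32Array(outLength);
  const step = sampleRate / targetRate;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, length - 1);
    const frac = pos - left;
    out[i] = mono[left] * (1 - frac) + mono[right] * frac;
  }
  return out;
}
```

With this in place, read_wav could return `toMono16k(channelData, sampleRate)` instead of throwing on unsupported input.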
The transcribe method returns a task object. Await task.result to get the final transcription; the task also emits events as the transcription progresses.
const task = await whisper.transcribe(pcm, { language: "auto" });
task.on("transcribed", (result) => {
    console.log("Transcribed", result);
});
console.log(await task.result);
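The event-plus-promise pattern above can be wrapped in a small helper that reports progress while waiting for the final result. This is a sketch under stated assumptions: `TaskLike` describes only the two members used in the example above (`on("transcribed", ...)` and `result`) and is not smart-whisper's actual type, and `fakeTask` is a stand-in so the snippet runs without a model.

```typescript
// Assumed minimal shape of the task object; illustration only,
// not smart-whisper's actual type definitions.
interface TaskLike<S, R> {
  on(event: "transcribed", listener: (segment: S) => void): unknown;
  result: Promise<R>;
}

// Hypothetical helper: forwards each "transcribed" event to a callback
// along with a running index, then resolves with the final result.
async function transcribeWithProgress<S, R>(
  task: TaskLike<S, R>,
  onSegment: (segment: S, index: number) => void,
): Promise<R> {
  let count = 0;
  task.on("transcribed", (segment) => onSegment(segment, count++));
  return task.result;
}

// Stand-in task that emits each segment asynchronously and then resolves,
// so the helper can be tried without loading a model.
function fakeTask(segments: string[]): TaskLike<string, string[]> {
  const listeners: Array<(segment: string) => void> = [];
  const result = new Promise<string[]>((resolve) => {
    setTimeout(() => {
      for (const segment of segments) {
        listeners.forEach((listener) => listener(segment));
      }
      resolve(segments);
    }, 0);
  });
  return { on: (_event, listener) => listeners.push(listener), result };
}
```

With a real task, the call would look like `await transcribeWithProgress(task, (segment, i) => console.log(i, segment))`, provided the task object matches the assumed shape.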