ariym / whisper-node

Node.js bindings for OpenAI's Whisper. (C++ CPU version by ggerganov)
https://npmjs.com/whisper-node
MIT License

Is there a way to use this with a regular .js file rather than this 'import' typescript style #40

Open lando2319 opened 8 months ago

lando2319 commented 8 months ago

I'm on Node.js 20. I don't want to use TypeScript, but I do want to use whisper-node from plain JavaScript.

When I add the statement import whisper from 'whisper-node'; to my .js file, it fails with "Cannot use import statement outside a module".

Is there a way to simply declare this dependency at the top with a require statement, like I do for all my other modules? That way I'm not declaring every function.

What do you suggest? I'd love to start using this.

I'd love to be able to simply add the dependency statement at the top and do my thing: create functions and call the functions of the dependency. Is that possible?

I'm looking to do something like this:

const whisper = require('whisper-node');

(async () => {
    try {
        const transcription = await whisper.whisper("./audio/output.wav");

        console.log(transcription);
        process.exit(0);
    } catch (err) {
        console.log("ERROR", err);
        process.exit(1);
    }

})();
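
The log output later in this thread suggests that require('whisper-node') does load from a plain .js file and that the transcription function is reachable as a property of the returned object. For anyone unsure of the export shape, here is a minimal sketch of a loader that does not depend on it. pickWhisperFunction and loadWhisper are hypothetical helper names, and the idea that the function sits on whisper, on default, or is the module itself is an assumption, not something taken from the package documentation.

```javascript
// Pick the callable transcription function out of whatever require() returns.
// Assumption: it is exposed as `whisper`, as `default`, or as the module itself.
function pickWhisperFunction(mod) {
  const candidates = [mod && mod.whisper, mod && mod.default, mod];
  const fn = candidates.find((candidate) => typeof candidate === 'function');
  if (!fn) {
    throw new TypeError('No callable export found on the loaded module');
  }
  return fn;
}

// `load` is injectable so the helper can be exercised without the package installed.
function loadWhisper(load = require) {
  return pickWhisperFunction(load('whisper-node'));
}
```

With the package installed, const whisper = loadWhisper(); followed by await whisper("./audio/output.wav") should behave like the whisper.whisper(...) call in the snippet above.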

lando2319 commented 8 months ago

I ran the command npx whisper-node download.

Then, when running the code from the original post, I get this:

[whisper-node] Transcribing: ./audio/output.wav 

[whisper-node] No 'modelName' or 'modelPath' provided. Trying default model: base.en 

[whisper-node] Problem: whisper_init_from_file_with_params_no_state: loading model from './models/ggml-base.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head  = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 512
whisper_model_load: n_text_head   = 8
whisper_model_load: n_text_layer  = 6
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 2 (base)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: n_langs       = 99
whisper_backend_init: using Metal backend
ggml_metal_init: allocating
ggml_metal_init: found device: AMD Radeon Pro 570
ggml_metal_init: picking default device: AMD Radeon Pro 570
ggml_metal_init: default.metallib not found, loading from source

undefined

This is based on code in my original post.

Any suggestions would be much appreciated. I'm trying to run this from a normal Node.js file with the usual node index.js command. I have my WAV file and I ran the download command, but I'm not sure where to go from here.

I'm on an old Intel iMac (3.4 GHz quad-core Intel Core i5), which might be why.
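
A later comment in this thread traces a similar failure to the input file not being in the format the README asks for, fixed by resampling with -ar 16000. One thing that may be worth ruling out is the sample rate of the WAV file. Here is a minimal sketch of that check, assuming a canonical RIFF/WAVE header with the "fmt " chunk first. wavSampleRate and fileSampleRate are hypothetical helper names, and files with extra chunks before "fmt " will be rejected rather than parsed.

```javascript
const fs = require('node:fs');

// Read the sample rate (in Hz) from the first bytes of a WAV file.
// Assumption: canonical RIFF/WAVE layout with the "fmt " chunk first,
// which puts the 32-bit little-endian sample rate at byte offset 24.
function wavSampleRate(header) {
  if (header.length < 28) {
    throw new RangeError('Need at least 28 bytes of header');
  }
  const tag = (start, end) => header.toString('ascii', start, end);
  if (tag(0, 4) !== 'RIFF' || tag(8, 12) !== 'WAVE' || tag(12, 16) !== 'fmt ') {
    throw new TypeError('Not a canonical RIFF/WAVE header');
  }
  return header.readUInt32LE(24);
}

// Hypothetical convenience wrapper: read only the header bytes from disk.
function fileSampleRate(path) {
  const fd = fs.openSync(path, 'r');
  try {
    const header = Buffer.alloc(28);
    const bytesRead = fs.readSync(fd, header, 0, 28, 0);
    return wavSampleRate(header.subarray(0, bytesRead));
  } finally {
    fs.closeSync(fd);
  }
}
```

Calling fileSampleRate("./audio/output.wav") and comparing the result with 16000 before transcribing would show whether the file needs to be resampled first.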

chris-cornwall commented 4 months ago

I'm trying to use this package in the same way and I'm facing a similar issue:

 [whisper-node] No 'modelName' or 'modelPath' provided. Trying default model: base.en 
 [whisper-node] Problem: TypeError: Cannot read properties of null (reading 'shift')
    at parseTranscript (.../node_modules/whisper-node/dist/tsToArray.js:7:11)
    at .../node_modules/whisper-node/dist/index.js:36:57
    at Generator.next (<anonymous>)
    at fulfilled (.../node_modules/whisper-node/dist/index.js:5:58)

Have you had any luck with this @lando2319?

chris-cornwall commented 4 months ago

My issue was caused by not following the instructions in the README. After converting my audio file to a 16 kHz WAV using ffmpeg -i input.mp3 -ar 16000 output.wav, it worked perfectly.
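
The conversion step can also be run from the same Node script before transcribing. Here is a minimal sketch, assuming ffmpeg is installed and on the PATH. ffmpegArgs and convertTo16kWav are hypothetical helper names, and the flags are exactly the ones from the command above.

```javascript
const { execFile } = require('node:child_process');

// Build the argument list for: ffmpeg -i <input> -ar 16000 <output>
function ffmpegArgs(input, output) {
  return ['-i', input, '-ar', '16000', output];
}

// Run ffmpeg and resolve with the output path once it exits successfully.
// Assumption: an `ffmpeg` binary is available on the PATH.
function convertTo16kWav(input, output) {
  return new Promise((resolve, reject) => {
    execFile('ffmpeg', ffmpegArgs(input, output), (err) => {
      if (err) {
        reject(err);
      } else {
        resolve(output);
      }
    });
  });
}
```

Inside an async function, const wav = await convertTo16kWav("input.mp3", "output.wav"); would then give a path that can be passed on to the transcription call. Using execFile with an argument array rather than a shell string avoids quoting problems with file names that contain spaces.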