ChetanXpro / nodejs-whisper

NodeJS Bindings for Whisper - the CPU version of OpenAI's Whisper, as initially crafted in C++ by ggerganov.
https://npmjs.com/nodejs-whisper
MIT License

[Bug] Error when fulfilled #77

Closed YozoraWolf closed 6 months ago

YozoraWolf commented 6 months ago

I'm trying to transcribe a 16000 Hz .wav file, but I get:

[Nodejs-whisper]  Executing command: ./main   -l auto -m ./models/ggml-small.bin  -f /home/wolf/develop/nodejs/okuuai/src/voice/test/1.wav  

[Nodejs-whisper] Transcribing Done!
/home/wolf/develop/nodejs/okuuai/node_modules/nodejs-whisper/dist/index.js:5
        function fulfilled(value) { try { step(generator.next(value)); } catch (e) { reject(e); } }
                                                         ^
Error: Something went wrong while executing the command.
    at /home/wolf/develop/nodejs/okuuai/node_modules/nodejs-whisper/src/index.ts:41:9
    at Generator.next (<anonymous>)
    at fulfilled (/home/wolf/develop/nodejs/okuuai/node_modules/nodejs-whisper/dist/index.js:5:58)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)

I'm using it inside a TS project (running Linux Mint)

Is there anything I could do about this?

Thanks

ChetanXpro commented 6 months ago

Do you have FFmpeg installed? Also, could you show me the code where you are using it? I just want to check all the arguments.

YozoraWolf commented 6 months ago

Yes, I'm running the latest version:

ffmpeg version 6.0 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 11 (Ubuntu 11.4.0-1ubuntu1~22.04)
  configuration: --enable-nonfree --enable-cuda-nvcc --enable-libnpp --extra-cflags=-I/usr/local/cuda-12.4/include --extra-ldflags=-L/usr/local/cuda-12.4/lib64 --extra-ldflags=-L/usr/lib/x86_64-linux-gnu --disable-static --enable-shared --enable-cuda-llvm --enable-libnpp --enable-ffnvcodec --enable-gpl --enable-gnutls --enable-libaom --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-nonfree --enable-nvenc
  libavutil      58.  2.100 / 58.  2.100
  libavcodec     60.  3.100 / 60.  3.100
  libavformat    60.  3.100 / 60.  3.100
  libavdevice    60.  1.100 / 60.  1.100
  libavfilter     9.  3.100 /  9.  3.100
  libswscale      7.  1.100 /  7.  1.100
  libswresample   4. 10.100 /  4. 10.100
  libpostproc    57.  1.100 / 57.  1.100
Hyper fast Audio and Video encoder
usage: ffmpeg [options] [[infile options] -i infile]... {[outfile options] outfile}...

Use -h to get full help or, even better, run 'man ffmpeg'

Also the code I'm using is... Really nothing special, just:

import { Logger } from '@src/logger';
import path from 'path';
import { nodewhisper } from 'nodejs-whisper'

const voicesPath = path.resolve(__dirname, 'test');

export const initVoice = async (): Promise<void> => {
    Logger.DEBUG(`Voices path: ${voicesPath}`);
    const transcript = await nodewhisper(`${voicesPath}/1.wav`, {
        modelName: "small",
        withCuda: true,
        verbose: true
    });

    Logger.INFO(`Transcript: ${transcript}`); // output: [ {start,end,speech} ]

};

I do hope this helps though :)

ChetanXpro commented 6 months ago

Thank you for providing these details. I was able to reproduce the issue, which was related to the handling of WAV files. The issue was from files with a .wav extension containing non-WAV data (such as FLAC), which led to errors due to incorrect format assumptions.

This PR addresses the problem: https://github.com/ChetanXpro/nodejs-whisper/pull/79
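
For readers hitting the same error, one way to spot a mislabeled file before transcription is to inspect its first bytes. The sketch below is not the code from the PR; it only assumes the standard RIFF/WAVE layout ("RIFF" at byte 0, "WAVE" at byte 8), and the helper names `looksLikeWav` and `fileLooksLikeWav` are hypothetical.

```typescript
import { closeSync, openSync, readSync } from "node:fs";

// Compares bytes starting at `offset` against an ASCII string.
function matchesAscii(bytes: Uint8Array, offset: number, text: string): boolean {
    for (let i = 0; i < text.length; i++) {
        if (bytes[offset + i] !== text.charCodeAt(i)) return false;
    }
    return true;
}

// Hypothetical helper: true when the header starts with "RIFF"
// and carries the "WAVE" form type at offset 8.
function looksLikeWav(header: Uint8Array): boolean {
    if (header.length < 12) return false;
    return matchesAscii(header, 0, "RIFF") && matchesAscii(header, 8, "WAVE");
}

// Hypothetical helper: reads only the first 12 bytes of a file and checks them.
function fileLooksLikeWav(filePath: string): boolean {
    const header = Buffer.alloc(12);
    const fd = openSync(filePath, "r");
    try {
        const bytesRead = readSync(fd, header, 0, 12, 0);
        return looksLikeWav(header.subarray(0, bytesRead));
    } finally {
        closeSync(fd);
    }
}
```

A FLAC stream saved as `1.wav` begins with `fLaC` rather than `RIFF`, so `fileLooksLikeWav` would return `false` for it and the caller could convert the file first instead of passing it straight to the transcriber.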

YozoraWolf commented 6 months ago

Thank you for your support!

Will check shortly! :)

YozoraWolf commented 6 months ago

Works great. I did have to downsample my .wav to 16 kHz before it worked. I'm currently using the medium model but want to switch to small; however, when I rerun npx nodejs-whisper download, nothing happens. (This might be a separate bug, though.)
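
The downsampling step mentioned above can be scripted with FFmpeg. This is only a sketch: the function names are hypothetical, and the choice of mono 16-bit PCM output is an assumption about what whisper-style tools typically expect, not something stated in this thread. It relies on the standard FFmpeg options `-ar` (sample rate), `-ac` (channel count), and `-c:a` (audio codec).

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical helper: builds the FFmpeg arguments for a 16 kHz conversion.
// "-y" overwrites the output file if it already exists.
function buildResampleArgs(inputPath: string, outputPath: string): string[] {
    return [
        "-y",
        "-i", inputPath,
        "-ar", "16000",       // target sample rate: 16 kHz
        "-ac", "1",           // assumption: downmix to mono
        "-c:a", "pcm_s16le",  // assumption: 16-bit little-endian PCM
        outputPath,
    ];
}

// Hypothetical helper: runs FFmpeg and rejects if the process exits with an error.
async function resampleTo16kHz(inputPath: string, outputPath: string): Promise<void> {
    await execFileAsync("ffmpeg", buildResampleArgs(inputPath, outputPath));
}
```

Calling `await resampleTo16kHz("input.wav", "input-16k.wav")` requires `ffmpeg` to be on the `PATH`; using `execFile` with an argument array avoids shell quoting problems with paths that contain spaces.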

ChetanXpro commented 6 months ago

Yeah, I am currently working on that issue. For a temporary fix, you can reinstall the package and then run the download command again.

Also, you don't need to run npx nodejs-whisper download. Instead, you can pass the model name you want to download in autoDownloadModelName. This will automatically download the model if it does not exist.
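
As a rough illustration of that suggestion, the options object could be built like this. The option names `modelName` and `autoDownloadModelName` come from this thread; setting both to the same value is an assumption, and `optionsForModel` is a hypothetical helper, not part of the package.

```typescript
// Shape of the two options discussed in this thread (a subset, not the full API).
interface WhisperModelOptions {
    modelName: string;
    autoDownloadModelName: string;
}

// Hypothetical helper: uses one model name both for transcription and for
// the automatic download, so the model is fetched if it is missing.
function optionsForModel(modelName: string): WhisperModelOptions {
    return { modelName, autoDownloadModelName: modelName };
}

// Usage sketch (not executed here; requires nodejs-whisper to be installed):
//   import { nodewhisper } from "nodejs-whisper";
//   const transcript = await nodewhisper("/path/to/audio.wav", optionsForModel("small"));
```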

YozoraWolf commented 6 months ago

Got it, I think that'll be it for this bug. Thanks for the support :+1:

ChetanXpro commented 6 months ago

You're welcome! If there's anything else you need help with in the future, feel free to reach out.