rishikanthc / Scriberr

Self-hosted AI audio transcription
MIT License

Auth Error #3

Closed VilterPD closed 1 day ago

VilterPD commented 1 day ago

Hi, awesome project.

I'm having trouble getting the actual transcription to work:

Auth state after loading cookie: false

Auth token: 

admin email:  admin@example.com

admin password:  password

Auth refresh failed trying to login

Auth state after loading cookie: false
Auth token: 

Auth refresh failed trying to login

admin email:  admin@example.com

admin password:  password

Auth state after loading cookie: true

Auth token: *redacted*

Repeating on loop

I tried setting

  - POCKETBASE_ADMIN_EMAIL=admin@example.com
  - POCKETBASE_ADMIN_PASSWORD=password

to different values as well, no luck.

My compose is:

services:
  scriberr:
    image: ghcr.io/rishikanthc/scriberr:beta
    container_name: scriberr
    ports:

volumes:
  db:
  files:

VilterPD commented 1 day ago

It seems to have created a job, but it's not running:

j6AhNvHfiIKuezs

from upload file File {

size: 6145431,

type: 'audio/mpeg',

name: 'Testdatei.mp3',

lastModified: 1728136306304

}

Created job: 1

Screenshot 2024-10-05 at 15-56-28

dunecokc commented 1 day ago

Similar here. I set this up in Unraid Docker; I upload the audio and it just sits there, showing "No active jobs".

der-robert commented 1 day ago

Same here. Can't even play the audio file.

Uncaught (in promise) DOMException: The media resource indicated by the src attribute or assigned media provider object was not suitable.

rishikanthc commented 1 day ago

Can you navigate to http://localhost:9243/admin/queues? That will open the job queue dashboard, which has access to the logs. If you can show me the logs, that would be a great help in understanding what's going wrong.

rishikanthc commented 1 day ago

That's strange. Did you use port 9243?

VilterPD commented 1 day ago

Sorry, I changed the ports, that was stupid. Here you go:

Screenshot 2024-10-05 at 21-26-29 Bull Dashboard Screenshot 2024-10-05 at 21-26-45 Bull Dashboard

rishikanthc commented 1 day ago

Awesome. Could you post the messages from the logs and error tabs?

VilterPD commented 1 day ago

Sure, thanks for looking at it. I love the idea of this project.

Log:


1. Starting job 1 for record jb6h7wayyuuzl55

2. Fetched record for jb6h7wayyuuzl55

Nothing else

Error:

38 milliseconds
Failed at
21:26:50

processAudio attempt #2

Error: ENOENT: no such file or directory, open '/scriberr/audio/jb6h7wayyuuzl55.mp3'
    at Object.openSync (node:fs:561:18)
    at Object.writeFileSync (node:fs:2358:35)
    at Worker.connection.host [as processFn] (file:///app/build/server/chunks/queue-C2m1Jwu2.js:43894:8)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async Worker.processJob (/app/node_modules/bullmq/dist/cjs/classes/worker.js:455:28)
    at async Worker.retryIfFailed (/app/node_modules/bullmq/dist/cjs/classes/worker.js:640:24)
Error: ENOENT: no such file or directory, open '/scriberr/audio/jb6h7wayyuuzl55.mp3'
    at Object.openSync (node:fs:561:18)
    at Object.writeFileSync (node:fs:2358:35)
    at Worker.connection.host [as processFn] (file:///app/build/server/chunks/queue-C2m1Jwu2.js:43894:8)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async Worker.processJob (/app/node_modules/bullmq/dist/cjs/classes/worker.js:455:28)
    at async Worker.retryIfFailed (/app/node_modules/bullmq/dist/cjs/classes/worker.js:640:24)

I just checked: I have persistent data saved in a volume. It is, indeed, empty. I usually use volumes because I have them in my backup strategy. Should I try setting it up in a mounted directory?

Here's my compose:

services:
  scriberr:
    image: ghcr.io/rishikanthc/scriberr:beta
    container_name: scriberr
    ports:
      - "3002:3000"
      - "3003:8080" #Optionally expose DB UI
      - "3004:9243" #Optionally expose JobQueue UI
    environment:
      - OPENAI_API_KEY=redacted
      - POCKETBASE_ADMIN_EMAIL=admin@example.com
      - POCKETBASE_ADMIN_PASSWORD=password
      - REDIS_HOST=127.0.0.1
      - REDIS_PORT=6379
      - SCRIBO_FILES=/scriberr
    volumes:
      -  db:/app/db
      -  files:/scriberr

volumes:
  db:
  files:
rishikanthc commented 1 day ago

In the folder you are mapping to /scriberr, can you create 2 sub-folders called audio and transcripts, then try again? For example, if you're mapping files:/scriberr, then create these sub-directories inside files. Let me know what you see after doing this.
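For anyone scripting this step, here is a minimal sketch of creating the two sub-folders. It is not part of Scriberr; the function name `ensure_working_dirs` and the example path are hypothetical, and `base_dir` is assumed to be whatever host directory (or volume mount point) you map to /scriberr.

```python
from pathlib import Path


def ensure_working_dirs(base_dir):
    """Create the audio and transcripts sub-folders under base_dir.

    base_dir is assumed to be the host directory mapped to /scriberr.
    Existing folders are left untouched.
    """
    base = Path(base_dir)
    created = []
    for name in ("audio", "transcripts"):
        sub = base / name
        # parents=True also creates base_dir itself if it is missing.
        sub.mkdir(parents=True, exist_ok=True)
        created.append(sub)
    return created


# Example (hypothetical path):
# ensure_working_dirs("/srv/scriberr/files")
```

Running it twice is harmless, since `exist_ok=True` skips folders that already exist.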

VilterPD commented 1 day ago

After creating the directories, it seems to be transcribing: Screenshot 2024-10-05 at 21-44-51

But it's going into a strange loop, trying to overwrite the generated .wav:

`r: whisper_model_load: model size = 487.00 MB

stderr: whisper_init_state: kv self size = 56.62 MB

stderr: whisper_init_state: kv cross size = 56.62 MB

stderr: whisper_init_state: kv pad size = 4.72 MB

stderr: whisper_init_state: compute buffer (conv) = 22.41 MB

stderr: whisper_init_state: compute buffer (encode) = 280.07 MB

stderr: whisper_init_state: compute buffer (cross) = 6.18 MB

stderr: whisper_init_state: compute buffer (decode) = 97.27 MB

stderr: system_info: n_threads = 100 / 12 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 | CANN = 0

main: processing '/scriberr/audio/jb6h7wayyuuzl55-ffmpeg.wav' (4574891 samples, 285.9 sec), 10 threads, 10 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 1 ...

stderr: whisper_init_state: kv self size = 56.62 MB

stderr: whisper_init_state: kv cross size = 56.62 MB

stderr: whisper_init_state: kv pad size = 4.72 MB

stderr: whisper_init_state: compute buffer (conv) = 22.41 MB

stderr: whisper_init_state: compute buffer (encode) = 280.07 MB

stderr: whisper_init_state: compute buffer (cross) = 6.18 MB

stderr: whisper_init_state: compute buffer (decode) = 97.27 MB

stderr: whisper_init_state: kv self size = 56.62 MB

stderr: whisper_init_state: kv cross size = 56.62 MB

stderr: whisper_init_state: kv pad size = 4.72 MB

stderr: whisper_init_state: compute buffer (conv) = 22.41 MB

stderr: whisper_init_state: compute buffer (encode) = 280.07 MB

stderr: whisper_init_state: compute buffer (cross) = 6.18 MB

stderr: whisper_init_state: compute buffer (decode) = 97.27 MB

stderr: whisper_init_state: kv self size = 56.62 MB

stderr: whisper_init_state: kv cross size = 56.62 MB

stderr: whisper_init_state: kv pad size = 4.72 MB

stderr: whisper_init_state: compute buffer (conv) = 22.41 MB

stderr: whisper_init_state: compute buffer (encode) = 280.07 MB

stderr: whisper_init_state: compute buffer (cross) = 6.18 MB

stderr: whisper_init_state: compute buffer (decode) = 97.27 MB

stderr: whisper_init_state: kv self size = 56.62 MB

stderr: whisper_init_state: kv cross size = 56.62 MB

stderr: whisper_init_state: kv pad size = 4.72 MB

stderr: whisper_init_state: compute buffer (conv) = 22.41 MB

stderr: whisper_init_state: compute buffer (encode) = 280.07 MB

stderr: whisper_init_state: compute buffer (cross) = 6.18 MB

stderr: whisper_init_state: compute buffer (decode) = 97.27 MB

stderr: whisper_init_state: kv self size = 56.62 MB

stderr: whisper_init_state: kv cross size = 56.62 MB

stderr: whisper_init_state: kv pad size = 4.72 MB

stderr: whisper_init_state: compute buffer (conv) = 22.41 MB

stderr: whisper_init_state: compute buffer (encode) = 280.07 MB

stderr: whisper_init_state: compute buffer (cross) = 6.18 MB

stderr: whisper_init_state: compute buffer (decode) = 97.27 MB

stderr: whisper_init_state: kv self size = 56.62 MB

stderr: whisper_init_state: kv cross size = 56.62 MB

stderr: whisper_init_state: kv pad size = 4.72 MB

stderr: whisper_init_state: compute buffer (conv) = 22.41 MB

stderr: whisper_init_state: compute buffer (encode) = 280.07 MB

stderr: whisper_init_state: compute buffer (cross) = 6.18 MB

stderr: whisper_init_state: compute buffer (decode) = 97.27 MB

stderr: whisper_init_state: kv self size = 56.62 MB

stderr: whisper_init_state: kv cross size = 56.62 MB

stderr: whisper_init_state: kv pad size = 4.72 MB

stderr: whisper_init_state: compute buffer (conv) = 22.41 MB

stderr: whisper_init_state: compute buffer (encode) = 280.07 MB

stderr: whisper_init_state: compute buffer (cross) = 6.18 MB

Starting job 1 for record jb6h7wayyuuzl55

Fetched record for jb6h7wayyuuzl55

Downloaded and saved audio file for record jb6h7wayyuuzl55

stderr: Input file: /scriberr/audio/jb6h7wayyuuzl55.mp3

stderr: Format: Audio MPEG layer III stream Bit rate: 32000 kbit/s CRC: no Mode: joint (MS/intensity) stereo Emphasis: no Sample rate: 48000 Hz Encoding delay: 1105 Padding: 303 Generating waveform data... Samples per pixel: 256 Input channels: 2 Output channels: 1

Done: 0%

stderr: Done: 1%

stderr: Done: 2%

stderr: Done: 3%

stderr: Done: 4%

stderr: Done: 5%

stderr: Done: 6%

stderr: Done: 7%

stderr: Done: 8%

stderr: Done: 9%

stderr: Done: 10%

stderr: Done: 11%

stderr: Done: 12%

stderr: Done: 13%

stderr: Done: 14%

stderr: Done: 15%

stderr: Done: 16%

stderr: Done: 17%

stderr: Done: 18%

stderr: Done: 19%

stderr: Done: 20%

stderr: Done: 21%

stderr: Done: 22%

stderr: Done: 23%

stderr: Done: 24%

stderr: Done: 25%

stderr: Done: 26%

stderr: Done: 27%

stderr: Done: 28%

stderr: Done: 29%

stderr: Done: 30%

stderr: Done: 31%

stderr: Done: 32%

stderr: Done: 33%

stderr: Done: 34%

stderr: Done: 35%

stderr: Done: 36%

stderr: Done: 37%

stderr: Done: 38%

stderr: Done: 39%

stderr: Done: 40%

stderr: Done: 41%

stderr: Done: 42%

stderr: Done: 43%

stderr: Done: 44%

stderr: Done: 45%

stderr: Done: 46%

stderr: Done: 47%

stderr: Done: 48%

stderr: Done: 49%

stderr: Done: 50%

stderr: Done: 51%

stderr: Done: 52%

stderr: Done: 53%

stderr: Done: 54%

stderr: Done: 55%

stderr: Done: 56%

stderr: Done: 57%

stderr: Done: 58%

stderr: Done: 59%

stderr: Done: 60%

stderr: Done: 61%

stderr: Done: 62%

stderr: Done: 63%

stderr: Done: 64%

stderr: Done: 65%

stderr: Done: 66%

stderr: Done: 67%

stderr: Done: 68%

stderr: Done: 69%

stderr: Done: 70%

stderr: Done: 71%

stderr: Done: 72%

stderr: Done: 73%

stderr: Done: 74%

stderr: Done: 75%

stderr: Done: 76%

stderr: Done: 77%

stderr: Done: 78%

stderr: Done: 79%

stderr: Done: 80%

stderr: Done: 81%

stderr: Done: 82%

stderr: Done: 83%

stderr: Done: 84%

stderr: Done: 85%

stderr: Done: 86%

stderr: Done: 87%

stderr: Done: 88%

stderr: Done: 89%

stderr: Done: 90%

stderr: Done: 91%

stderr: Done: 92%

stderr: Done: 93%

stderr: Done: 94%

stderr: Done: 95%

stderr: Done: 96%

stderr: Done: 97%

stderr: Done: 98%

stderr: Done: 99%

stderr: Done: 100%

stderr: Frames decoded: 11915 (4:45.960) Generated 53614 points

stderr: Output file: /scriberr/audio/jb6h7wayyuuzl55.mp3.json

stderr: Done

Audiowaveform for jb6h7wayyuuzl55 generated

stderr: ffmpeg version 6.1.1 Copyright (c) 2000-2023 the FFmpeg developers built with gcc 13.2.1 (Alpine 13.2.1_git20240309) 20240309 configuration: --prefix=/usr --disable-librtmp --disable-lzma --disable-static --disable-stripping --enable-avfilter --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libdrm --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-libmp3lame --enable-libopenmpt --enable-libopus --enable-libplacebo --enable-libpulse --enable-librav1e --enable-librist --enable-libsoxr --enable-libsrt --enable-libssh --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-lto=auto --enable-lv2 --enable-openssl --enable-pic --enable-postproc --enable-pthreads --enable-shared --enable-vaapi --enable-vdpau --enable-version3 --enable-vulkan --optflags=-O3 --enable-libjxl --enable-libsvtav1 --enable-libvpl

stderr: libavutil 58. 29.100 / 58. 29.100 libavcodec 60. 31.102 / 60. 31.102 libavformat 60. 16.100 / 60. 16.100 libavdevice 60. 3.100 / 60. 3.100 libavfilter 9. 12.100 / 9. 12.100 libswscale 7. 5.100 / 7. 5.100 libswresample 4. 12.100 / 4. 12.100 libpostproc 57. 3.100 / 57. 3.100

stderr: Input #0, mp3, from '/scriberr/audio/jb6h7wayyuuzl55.mp3': Metadata: major_brand : M4A minor_version : 0 compatible_brands: M4A isommp42 voice-memo-uuid : AC5BB196-8CC1-4097-AE24-7CBA601F586A title : Neue Aufnahme encoder : Lavf58.76.100

stderr: Duration: 00:04:45.96, start: 0.023021, bitrate: 171 kb/s Stream #0:0: Audio: mp3, 48000 Hz, stereo, fltp, 171 kb/s Metadata: encoder : Lavc58.13

stderr: File '/scriberr/audio/jb6h7wayyuuzl55-ffmpeg.wav' already exists. Overwrite? [y/N] `

rishikanthc commented 1 day ago

Yeah, this is normal. There are 3 different commands running, and each command converts the audio to the format it needs. The SCRIBO_FILES location is really a working directory for the app. The files created during transcription are cleaned up and removed at the end, so SCRIBO_FILES will be empty for the most part except while transcribing. All data and files are stored in the database. Hope this helps. If this resolves it, can I close this issue?

VilterPD commented 1 day ago

Ah, ok, makes sense. But there's no transcript yet. It's still at 0%.

rishikanthc commented 1 day ago

How long is the audio file? Do you see progress in the job dashboard? It should be printing out sections of the transcript being generated in the logs. Let me know what you see. I'll keep this open till you are able to see a transcript. Unfortunately I'll need your help to debug this, as I need access to the logs xD

rishikanthc commented 1 day ago

When a job starts, 3 things happen. First, audiowaveform runs to extract peak data from the audio; this is used for the waveform visualizer. Next, ffmpeg runs to convert the audio to a 16-bit WAV file. The last step is running whisper.cpp to do the transcription. The progress % will stay at 0 until transcription starts.
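The three steps can be sketched as command lines. This is only an illustration, not Scriberr's actual code: the function name, the whisper.cpp binary path `./whisper.cpp/main`, and the default thread and processor counts are assumptions. The file names and the 16 kHz mono pcm_s16le output follow what the logs in this thread show. The ffmpeg `-y` flag is my own addition; it overwrites an existing output file instead of prompting.

```python
from pathlib import PurePosixPath


def build_pipeline_commands(audio_path, model_path, threads=2, processors=2):
    """Return the three command lines (as argument lists) for one job.

    Nothing is executed here; the lists could be passed to subprocess.run.
    """
    audio = PurePosixPath(audio_path)
    # Names mirror the logs: <id>.mp3.json and <id>-ffmpeg.wav
    peaks = audio.with_name(audio.name + ".json")
    wav = audio.with_name(audio.stem + "-ffmpeg.wav")

    # Step 1: extract peak data for the waveform visualizer.
    waveform_cmd = ["audiowaveform", "-i", str(audio), "-o", str(peaks)]

    # Step 2: convert to 16-bit, 16 kHz, mono WAV.
    ffmpeg_cmd = [
        "ffmpeg", "-y", "-i", str(audio),
        "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", str(wav),
    ]

    # Step 3: transcribe with whisper.cpp (binary path is an assumption).
    whisper_cmd = [
        "./whisper.cpp/main", "-m", str(model_path), "-f", str(wav),
        "-t", str(threads), "-p", str(processors),
    ]
    return [waveform_cmd, ffmpeg_cmd, whisper_cmd]
```

Progress stays at 0% through the first two commands because only the third one produces transcript output.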

VilterPD commented 1 day ago

It's definitely worth it.

This one is quite long, 20 mins. I also tried one that's only 20 sec.

The log seems stuck at stderr: File '/scriberr/audio/jb6h7wayyuuzl55-ffmpeg.wav' already exists. Overwrite? [y/N]

In a log I can't choose y/N though. I'll try a shorter one.

rishikanthc commented 1 day ago

Okay, can you try cleaning the SCRIBO_FILES location (the folder mapped to /scriberr), starting with empty directories, and trying again? Basically wipe it and start with a clean mount.

VilterPD commented 1 day ago

Ok, I cleaned it up, and it failed again, with a new file:

Log:

Starting job 1 for record 1xvg7y1gvgv2jpa

Fetched record for 1xvg7y1gvgv2jpa

Downloaded and saved audio file for record 1xvg7y1gvgv2jpa

stderr: Input file: /scriberr/audio/1xvg7y1gvgv2jpa.mp3

stderr: Format: Audio MPEG layer III stream
Bit rate: 96000 kbit/s
CRC: no
Mode: single channel
Emphasis: no
Sample rate: 22050 Hz
Encoding delay: 1105
Padding: 367
Generating waveform data...
Samples per pixel: 256
Input channels: 1
Output channels: 1

Done: 0%
Done: 40%

stderr: 
Done: 73%

stderr: 
Done: 100%

stderr: 
Frames decoded: 387 (0:10.109)
Generated 867 points

stderr: Output file: /scriberr/audio/1xvg7y1gvgv2jpa.mp3.json

stderr: Done

Audiowaveform for 1xvg7y1gvgv2jpa generated

stderr: ffmpeg version 6.1.1 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 13.2.1 (Alpine 13.2.1_git20240309) 20240309
  configuration: --prefix=/usr --disable-librtmp --disable-lzma --disable-static --disable-stripping --enable-avfilter --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libdrm --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-libmp3lame --enable-libopenmpt --enable-libopus --enable-libplacebo --enable-libpulse --enable-librav1e --enable-librist --enable-libsoxr --enable-libsrt --enable-libssh --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-lto=auto --enable-lv2 --enable-openssl --enable-pic --enable-postproc --enable-pthreads --enable-shared --enable-vaapi --enable-vdpau --enable-version3 --enable-vulkan --optflags=-O3 --enable-libjxl --enable-libsvtav1 --enable-libvpl
  libavutil      58. 29.100 / 58. 29.100

stderr:   libavcodec     60. 31.102 / 60. 31.102
  libavformat    60. 16.100 / 60. 16.100
  libavdevice    60.  3.100 / 60.  3.100
  libavfilter     9. 12.100 /  9. 12.100
  libswscale      7.  5.100 /  7.  5.100
  libswresample   4. 12.100 /  4. 12.100
  libpostproc    57.  3.100 / 57.  3.100

stderr: Input #0, mp3, from '/scriberr/audio/1xvg7y1gvgv2jpa.mp3':
  Metadata:
    encoder         : Lavf58.12.100
  Duration: 00:00:10.11, start: 0.050113, bitrate: 96 kb/s

stderr:   Stream #0:0: Audio: mp3, 22050 Hz, mono, fltp, 96 kb/s

stderr: Stream mapping:
  Stream #0:0 -> #0:0 (mp3 (mp3float) -> pcm_s16le (native))
Press [q] to stop, [?] for help

stderr: Output #0, wav, to '/scriberr/audio/1xvg7y1gvgv2jpa-ffmpeg.wav':
  Metadata:
    ISFT            : Lavf60.16.100
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 16000 Hz, mono, s16, 256 kb/s

stderr: 
    Metadata:
      encoder         : Lavc60.31.102 pcm_s16le
size=       0kB time=00:00:00.00 bitrate=N/A speed=   0x    

stderr: [out#0/wav @ 0x7f4a6a09f5c0] video:0kB audio:314kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.024271%
size=     314kB time=00:00:10.04 bitrate= 256.1kbits/s speed=1.1e+03x    

Audio file for 1xvg7y1gvgv2jpa converted successfully

stderr: whisper_init_from_file_with_params_no_state: loading model from './whisper.cpp/models/ggml-small.en.bin'

stderr: whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head  = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 768
whisper_model_load: n_text_head   = 12
whisper_model_load: n_text_layer  = 12
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 3 (small)

stderr: whisper_model_load: adding 1607 extra tokens

stderr: whisper_model_load: n_langs       = 99

stderr: whisper_model_load:      CPU total size =   487.00 MB

stderr: whisper_model_load: model size    =  487.00 MB

stderr: whisper_init_state: kv self size  =   56.62 MB

stderr: whisper_init_state: kv cross size =   56.62 MB

stderr: whisper_init_state: kv pad  size  =    4.72 MB

stderr: whisper_init_state: compute buffer (conv)   =   22.41 MB

stderr: whisper_init_state: compute buffer (encode) =  280.07 MB

stderr: whisper_init_state: compute buffer (cross)  =    6.18 MB

stderr: whisper_init_state: compute buffer (decode) =   97.27 MB

stderr: 
system_info: n_threads = 100 / 12 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0 | CANN = 0

stderr: main: processing '/scriberr/audio/1xvg7y1gvgv2jpa-ffmpeg.wav' (160683 samples, 10.0 sec), 10 threads, 10 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 1 ...

stderr: whisper_init_state: kv self size  =   56.62 MB

stderr: whisper_init_state: kv cross size =   56.62 MB

stderr: whisper_init_state: kv pad  size  =    4.72 MB

stderr: whisper_init_state: compute buffer (conv)   =   22.41 MB

stderr: whisper_init_state: compute buffer (encode) =  280.07 MB

stderr: whisper_init_state: compute buffer (cross)  =    6.18 MB

stderr: whisper_init_state: compute buffer (decode) =   97.27 MB

stderr: whisper_init_state: kv self size  =   56.62 MB

stderr: whisper_init_state: kv cross size =   56.62 MB

stderr: whisper_init_state: kv pad  size  =    4.72 MB

stderr: whisper_init_state: compute buffer (conv)   =   22.41 MB

stderr: whisper_init_state: compute buffer (encode) =  280.07 MB

stderr: whisper_init_state: compute buffer (cross)  =    6.18 MB

stderr: whisper_init_state: compute buffer (decode) =   97.27 MB

stderr: whisper_init_state: kv self size  =   56.62 MB

stderr: whisper_init_state: kv cross size =   56.62 MB

stderr: whisper_init_state: kv pad  size  =    4.72 MB

stderr: whisper_init_state: compute buffer (conv)   =   22.41 MB

stderr: whisper_init_state: compute buffer (encode) =  280.07 MB

stderr: whisper_init_state: compute buffer (cross)  =    6.18 MB

stderr: whisper_init_state: compute buffer (decode) =   97.27 MB

stderr: whisper_init_state: kv self size  =   56.62 MB

stderr: whisper_init_state: kv cross size =   56.62 MB

stderr: whisper_init_state: kv pad  size  =    4.72 MB

stderr: whisper_init_state: compute buffer (conv)   =   22.41 MB

stderr: whisper_init_state: compute buffer (encode) =  280.07 MB

stderr: whisper_init_state: compute buffer (cross)  =    6.18 MB

stderr: whisper_init_state: compute buffer (decode) =   97.27 MB

stderr: whisper_init_state: kv self size  =   56.62 MB

stderr: whisper_init_state: kv cross size =   56.62 MB

stderr: whisper_init_state: kv pad  size  =    4.72 MB

stderr: whisper_init_state: compute buffer (conv)   =   22.41 MB

stderr: whisper_init_state: compute buffer (encode) =  280.07 MB

stderr: whisper_init_state: compute buffer (cross)  =    6.18 MB

stderr: whisper_init_state: compute buffer (decode) =   97.27 MB

stderr: whisper_init_state: kv self size  =   56.62 MB

stderr: whisper_init_state: kv cross size =   56.62 MB

stderr: whisper_init_state: kv pad  size  =    4.72 MB

stderr: whisper_init_state: compute buffer (conv)   =   22.41 MB

stderr: whisper_init_state: compute buffer (encode) =  280.07 MB

stderr: whisper_init_state: compute buffer (cross)  =    6.18 MB

stderr: whisper_init_state: compute buffer (decode) =   97.27 MB

stderr: whisper_init_state: kv self size  =   56.62 MB

stderr: whisper_init_state: kv cross size =   56.62 MB

stderr: whisper_init_state: kv pad  size  =    4.72 MB

stderr: whisper_init_state: compute buffer (conv)   =   22.41 MB

stderr: whisper_init_state: compute buffer (encode) =  280.07 MB

stderr: whisper_init_state: compute buffer (cross)  =    6.18 MB

Error:

Error: Command failed with exit code null
    at ChildProcess.<anonymous> (file:///app/build/server/chunks/queue-C2m1Jwu2.js:43875:16)
    at ChildProcess.emit (node:events:531:35)
    at maybeClose (node:internal/child_process:1104:16)
    at ChildProcess._handle.onexit (node:internal/child_process:304:5)
rishikanthc commented 1 day ago

Okay, looks like the model started but it got killed...? Can you reduce the number of threads and processors to 1 and try again? So sorry for the back and forth, mate. I wish there was an easier way to debug. I'm just trying to identify what the issue is so I can fix it. I'm assuming you didn't add any resource restrictions on the container.

rishikanthc commented 1 day ago

Okay, yes. You have set the number of threads and processors to 10! Please make sure your system has the required resources to offer. For now, let's try setting them both to a lower value like 2.
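The "n_threads = 100 / 12" line in the log appears to report threads times processors (10 x 10) against 12 hardware threads, so the job was heavily oversubscribed. Below is a minimal sketch of one possible heuristic for picking safer values: keep threads x processors at or below the CPU count. The function name is hypothetical, and this is not how Scriberr chooses its settings. Memory may be a separate limit, since each processor seems to allocate its own set of buffers.

```python
import os


def pick_whisper_parallelism(requested_threads, requested_processors, cpu_count=None):
    """Clamp thread and processor counts so their product fits the CPU count.

    cpu_count defaults to os.cpu_count(); pass it explicitly if the
    container is limited to fewer CPUs than the host reports.
    """
    cpus = cpu_count or os.cpu_count() or 1
    # Never use more threads than CPUs, and always at least one.
    threads = max(1, min(requested_threads, cpus))
    # Give the remaining capacity to parallel processors.
    processors = max(1, min(requested_processors, cpus // threads))
    return threads, processors


# Example: the settings from this thread on a 12-thread machine.
# pick_whisper_parallelism(10, 10, cpu_count=12) returns (10, 1)
```

Starting low, as suggested above, and raising the values step by step is still the safer route; the clamp only rules out obviously oversized combinations.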

VilterPD commented 1 day ago

I get it, no problem. I work with software as well 👍

That seems to have fixed it. I get a transcript. I'll try tuning it up in small steps now.

Solution: Create the folders audio and transcripts in the directory mounted at /scriberr.

Fine-tune the processor and thread counts.

rishikanthc commented 1 day ago

Perfect! Glad we could figure this out. Let me know if you face any issues :) Thanks for trying the app and hope you like it.