thewh1teagle / vibe

Transcribe on your own!
https://thewh1teagle.github.io/vibe/
MIT License
1.11k stars 69 forks

[Bug]: CLI format/diarize issue #351

Open clem0338 opened 1 week ago

clem0338 commented 1 week ago

What happened?

I'm trying to transcribe and diarize audio from the CLI using this command:

vibe ^
 --model "C:\Users\Cedric\AppData\Local\github.com.thewh1teagle.vibe\ggml-medium.bin" ^
 --diarize ^
 --diarize-vad-model "C:\Users\Cedric\AppData\Local\github.com.thewh1teagle.vibe\segmentation-3.0.onnx" ^
 --diarize-speaker-id-model "C:\Users\Cedric\AppData\Local\github.com.thewh1teagle.vibe\wespeaker_en_voxceleb_CAM++.onnx" ^
 --format txt ^
 --write out.txt ^
 --file "F:\Temp\MP3 Conversion\Output\6cbcc955-b0e6-45ff-8515-57636a97609d.mp3"

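Two issues occur:

1. The txt format is not honored: the log reports "Invalid format specified. Defaulting to SRT format."
2. The output is not diarized: the segments printed in the log carry no speaker information.

The same file is transcribed and diarized as expected when started from the GUI.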

Steps to reproduce

Exactly as shown in the "What happened?" step.

What OS are you seeing the problem on?

Windows 11, Vibe v2.6.3 (reported as v0.0.6 when invoked with vibe --version)

Relevant log output

G:\Vibe>vibe ^
More?  --model "C:\Users\Cedric\AppData\Local\github.com.thewh1teagle.vibe\ggml-medium.bin" ^
More?  --diarize ^
More?  --diarize-vad-model "C:\Users\Cedric\AppData\Local\github.com.thewh1teagle.vibe\segmentation-3.0.onnx" ^
More?  --diarize-speaker-id-model "C:\Users\Cedric\AppData\Local\github.com.thewh1teagle.vibe\wespeaker_en_voxceleb_CAM++.onnx" ^
More?  --format txt ^
More?  --write out.txt ^
More?  --file "F:\Temp\MP3 Conversion\Output\6cbcc955-b0e6-45ff-8515-57636a97609d.mp3"

G:\Vibe>Transcribe... 🔄
ggml_vulkan: Found 1 Vulkan devices:
Vulkan0: NVIDIA GeForce RTX 3070 Ti (NVIDIA) | uma: 0 | fp16: 1 | warp size: 32
ggml_backend_sched_alloc_splits: failed to allocate graph, reserving (backend_ids_changed = 1)
ggml_gallocr_reserve_n: reallocating NVIDIA GeForce RTX 3070 Ti buffer from size 0.00 MiB to 25.73 MiB
ggml_gallocr_reserve_n: reallocating CPU buffer from size 0.00 MiB to 0.92 MiB
ggml_gallocr_needs_realloc: graph has different number of nodes
ggml_gallocr_alloc_graph: cannot reallocate multi buffer graph automatically, call reserve
ggml_backend_sched_alloc_splits: failed to allocate graph, reserving (backend_ids_changed = 0)
ggml_gallocr_reserve_n: reallocating NVIDIA GeForce RTX 3070 Ti buffer from size 0.00 MiB to 160.77 MiB
ggml_gallocr_reserve_n: reallocating CPU buffer from size 0.00 MiB to 0.00 MiB
ggml_gallocr_needs_realloc: graph has different number of nodes
ggml_gallocr_alloc_graph: cannot reallocate multi buffer graph automatically, call reserve
ggml_backend_sched_alloc_splits: failed to allocate graph, reserving (backend_ids_changed = 0)
ggml_gallocr_reserve_n: reallocating NVIDIA GeForce RTX 3070 Ti buffer from size 0.00 MiB to 5.86 MiB
ggml_gallocr_reserve_n: reallocating CPU buffer from size 0.00 MiB to 0.00 MiB
ggml_backend_sched_alloc_splits: failed to allocate graph, reserving (backend_ids_changed = 1)
ggml_gallocr_reserve_n: reallocating NVIDIA GeForce RTX 3070 Ti buffer from size 0.00 MiB to 92.14 MiB
ggml_gallocr_reserve_n: reallocating CPU buffer from size 0.00 MiB to 0.88 MiB
{
  "processing_time_sec": 0,
  "segments": [
    {
      "start": 0,
      "stop": 206,
      "text": " Ok so let's start the"
    },
    {
      "start": 206,
      "stop": 460,
      "text": " recording. Do you care"
    },
    {
      "start": 460,
      "stop": 700,
      "text": " about how long it is?"
    },
    {
      "start": 700,
      "stop": 1323,
      "text": " How long can... should"
    },
    {
      "start": 1323,
      "stop": 1805,
      "text": " it be? 10 seconds."
    },
    {
      "start": 1805,
      "stop": 1900,
      "text": " Please stop then."
    }
  ]
}
Invalid format specified. Defaulting to SRT format.
Transcription completed in 3.2s ⏱️
Done ✅

Miltondz commented 4 days ago

I have the same issue using the CLI.
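
A possible stopgap while --format txt is rejected: the CLI still prints the transcription as JSON (the "segments" array with "start", "stop" and "text" fields seen in the log above), and that JSON can be flattened to plain text by hand. The sketch below is only an illustration, assuming the JSON block has been copied out of the console output into a string. The function name segments_to_text is hypothetical and not part of Vibe, and this does not recover speaker labels, so it does nothing for the diarization problem.

```python
import json


def segments_to_text(payload: str) -> str:
    """Join the "text" field of every segment into one plain-text transcript.

    Hypothetical helper, not part of Vibe. Assumes the JSON shape shown in
    the log above: {"segments": [{"start": ..., "stop": ..., "text": ...}]}.
    """
    data = json.loads(payload)
    # Each segment's text starts with a space in the log, so strip and re-join.
    parts = [seg["text"].strip() for seg in data.get("segments", [])]
    return " ".join(p for p in parts if p)


# Two segments copied from the log output above.
sample = '''{"processing_time_sec": 0, "segments": [
  {"start": 0, "stop": 206, "text": " Ok so let's start the"},
  {"start": 206, "stop": 460, "text": " recording. Do you care"}
]}'''

print(segments_to_text(sample))
```

The timestamps are ignored here on purpose, since the log does not state which unit "start" and "stop" use.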