chidiwilliams / buzz

Buzz transcribes and translates audio offline on your personal computer. Powered by OpenAI's Whisper.
https://chidiwilliams.github.io/buzz
MIT License

How to get access to internal transcript database #984

Closed gd-three closed 6 days ago

gd-three commented 2 weeks ago

I want to do speech recognition through a program API. Is there a way to do this now? (anxious)

raivisdejus commented 2 weeks ago

@gd-three No, Buzz does not have this feature. If you need an API, see this discussion of some APIs available in the cloud and some that you can run on your own computer.

gd-three commented 2 weeks ago

Right now my program runs a command string (buzz add --task transcribe --language zh --model-type whisper --model-size small --txt file path) to perform audio recognition. This starts a new Buzz process each time, which is inefficient.
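For reference, the per-file invocation described here could be wrapped in Python roughly as follows. This is a minimal sketch: the flags are copied from the command in this comment, while the function names `build_buzz_command` and `transcribe`, the default values, and the assumption that a `buzz` executable is on PATH are illustrative only. It still launches one Buzz process per file, so it does not remove the overhead being discussed.

```python
import subprocess
from typing import List


def build_buzz_command(audio_path: str, language: str = "zh",
                       model_size: str = "small") -> List[str]:
    """Build the argument list for the `buzz add` command quoted above."""
    return [
        "buzz", "add",
        "--task", "transcribe",
        "--language", language,
        "--model-type", "whisper",
        "--model-size", model_size,
        "--txt",
        audio_path,
    ]


def transcribe(audio_path: str) -> int:
    """Run Buzz once for a single file and return its exit code.

    Assumes the `buzz` executable is on PATH (not verified here).
    """
    completed = subprocess.run(build_buzz_command(audio_path), check=False)
    return completed.returncode
```

Passing the arguments as a list rather than one shell string avoids quoting problems with file paths that contain spaces.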

raivisdejus commented 2 weeks ago

@gd-three Please explain what is inefficient and what you would like to be different. If you explain more about your process, I may be able to find a more efficient solution. Where do the audio files come from, and how do you use the transcription results? Is it something automated?

If the transcription speed is slow, you will need more powerful hardware, or you will need to send your transcriptions to some API.

This should transcribe using the OpenAI API; you will need an API token. Pricing follows their price list.

buzz add --task transcribe --language zh --model-type openaiapi --openai-token ABC123DEF456 --txt file path

gd-three commented 2 weeks ago

I want to get the results of real-time speech recognition in a program I wrote myself and then process them further. However, the recognition results are currently only displayed in the UI; there is no storage path. If Buzz could provide the path, that would be great and would solve my problem.

raivisdejus commented 2 weeks ago

@gd-three Buzz stores all transcription results in a SQLite database. The location of the database is printed in the logs. To find the logs, see this section: https://github.com/chidiwilliams/buzz/blob/main/CONTRIBUTING.md#troubleshooting

On Windows, you can paste %USERPROFILE%\AppData\Local\Buzz\Buzz into the address bar of your file manager to get to the location where Buzz stores its internal data.
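Since the results live in a SQLite file, another program can inspect them with Python's standard `sqlite3` module. The sketch below only lists the tables in the file, because the table and column layout is not described in this thread and should be checked before querying. The helper name `list_tables` is illustrative, and the database is opened read-only so a running Buzz instance is not affected.

```python
import sqlite3
from pathlib import Path
from typing import List


def list_tables(db_path: Path) -> List[str]:
    """Return the table names in a SQLite file, opened read-only."""
    # mode=ro prevents accidental writes to the application's database.
    uri = f"{db_path.resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

The startup log posted later in this thread shows the file as Buzz.sqlite inside the data directory, so on that machine the call would be `list_tables(Path(r"C:\Users\gd\AppData\Local\Buzz\Buzz\Buzz.sqlite"))`. The path on other systems should be taken from the log, not assumed.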

gd-three commented 2 weeks ago

OK, I already knew about this directory. I am using version 0.8, so I did not see what you described.

gd-three commented 2 weeks ago

Downloading version 1.1 is very slow here in China, so for now I only have version 1.0, which crashes during real-time recognition.

log===========================================
[2024-11-09 17:58:35,558] locale.:12 DEBUG -> UI locales ['zh-Hans-CN', 'zh-CN', 'zh', 'en-US', 'en-Latn-US', 'en']
[2024-11-09 17:58:35,561] model_loader.:40 DEBUG -> Model root directory: C:\Users\gd\AppData\Local\Buzz\Buzz\Cache\models
[2024-11-09 17:58:36,265] utils._find_ffmpeg_extension:114 DEBUG -> Loading FFmpeg6
[2024-11-09 17:58:36,266] utils._find_ffmpeg_extension:120 DEBUG -> Failed to load FFmpeg6 extension.
Traceback (most recent call last):
  File "torio_extension\utils.py", line 116, in _find_ffmpeg_extension
  File "torio_extension\utils.py", line 105, in _find_versionsed_ffmpeg_extension
  File "", line 95, in find_spec
ModuleNotFoundError: No module named 'torio.lib'
[2024-11-09 17:58:36,266] utils._find_ffmpeg_extension:114 DEBUG -> Loading FFmpeg5
[2024-11-09 17:58:36,266] utils._find_ffmpeg_extension:120 DEBUG -> Failed to load FFmpeg5 extension.
Traceback (most recent call last):
  File "torio_extension\utils.py", line 116, in _find_ffmpeg_extension
  File "torio_extension\utils.py", line 105, in _find_versionsed_ffmpeg_extension
  File "", line 95, in find_spec
ModuleNotFoundError: No module named 'torio.lib'
[2024-11-09 17:58:36,266] utils._find_ffmpeg_extension:114 DEBUG -> Loading FFmpeg4
[2024-11-09 17:58:36,266] utils._find_ffmpeg_extension:120 DEBUG -> Failed to load FFmpeg4 extension.
Traceback (most recent call last):
  File "torio_extension\utils.py", line 116, in _find_ffmpeg_extension
  File "torio_extension\utils.py", line 105, in _find_versionsed_ffmpeg_extension
  File "", line 95, in find_spec
ModuleNotFoundError: No module named 'torio.lib'
[2024-11-09 17:58:36,267] utils._find_ffmpeg_extension:114 DEBUG -> Loading FFmpeg
[2024-11-09 17:58:36,267] utils._find_ffmpeg_extension:120 DEBUG -> Failed to load FFmpeg extension.
Traceback (most recent call last):
  File "torio_extension\utils.py", line 116, in _find_ffmpeg_extension
  File "torio_extension\utils.py", line 105, in _find_versionsed_ffmpeg_extension
  File "", line 95, in find_spec
ModuleNotFoundError: No module named 'torio.lib'
[2024-11-09 17:58:36,335] settings.init:14 DEBUG -> Settings filename: \HKEY_CURRENT_USER\Software\Buzz\OrganizationDefaults
[2024-11-09 17:58:36,342] settings.init:14 DEBUG -> Settings filename: \HKEY_CURRENT_USER\Software\Buzz\OrganizationDefaults
[2024-11-09 17:58:36,342] buzz.main:60 DEBUG -> app_dir: D:\software\Buzz_internal
[2024-11-09 17:58:36,342] buzz.main:61 DEBUG -> log_dir: C:\Users\gd\AppData\Local\Buzz\Buzz\Logs
[2024-11-09 17:58:36,342] buzz.main:62 DEBUG -> cache_dir: C:\Users\gd\AppData\Local\Buzz\Buzz\Cache
[2024-11-09 17:58:36,342] buzz.main:63 DEBUG -> data_dir: C:\Users\gd\AppData\Local\Buzz\Buzz
[2024-11-09 17:58:36,353] settings.init:14 DEBUG -> Settings filename: \HKEY_CURRENT_USER\Software\Buzz\OrganizationDefaults
[2024-11-09 17:58:36,358] db._setup_db:38 DEBUG -> Database connection opened: C:\Users\gd\AppData\Local\Buzz\Buzz\Buzz.sqlite
[2024-11-09 17:58:36,364] settings.init:14 DEBUG -> Settings filename: \HKEY_CURRENT_USER\Software\Buzz\OrganizationDefaults
[2024-11-09 17:58:36,381] settings.init:14 DEBUG -> Settings filename: \HKEY_CURRENT_USER\Software\Buzz\OrganizationDefaults
[2024-11-09 17:58:36,425] file_transcriber_queue_worker.run:40 DEBUG -> Waiting for next transcription task
[2024-11-09 17:58:38,342] settings.init:14 DEBUG -> Settings filename: \HKEY_CURRENT_USER\Software\Buzz\OrganizationDefaults
[2024-11-09 17:58:38,369] backend._load_plugins:205 DEBUG -> Loading KWallet
[2024-11-09 17:58:38,369] backend._load_plugins:205 DEBUG -> Loading SecretService
[2024-11-09 17:58:38,369] backend._load_plugins:205 DEBUG -> Loading Windows
[2024-11-09 17:58:38,371] init.:11 DEBUG -> Loaded cffi backend
[2024-11-09 17:58:38,390] backend._load_plugins:205 DEBUG -> Loading chainer
[2024-11-09 17:58:38,391] backend._load_plugins:205 DEBUG -> Loading libsecret
[2024-11-09 17:58:38,391] backend._load_plugins:205 DEBUG -> Loading macOS
log==========================================

Above is the full log of one startup, up to the crash.

gd-three commented 2 weeks ago

Is there a download location in China where I can get it quickly? Or does version 1.1 still have this bug?

raivisdejus commented 2 weeks ago

@gd-three If you used a version older than 1.0 and then upgraded to a recent version, you may get a crash on startup. To fix this, either delete the old recording history (everything in the directory mentioned above) or use the latest development version, 1.2.0, from the latest builds here: https://github.com/chidiwilliams/buzz/actions/workflows/ci.yml?query=branch%3Amain

Select the latest build, scroll down to the artifacts section, and download the installation file. You need to be logged in to GitHub to download it. Unfortunately, this download will be slow.

gd-three commented 2 weeks ago

I deleted the old recording history and reinstalled, which solved the problem. Thank you for your patience and for your answer.

gd-three commented 2 weeks ago

One more question: Preferences -> Folder Watch. What is this feature? Does it periodically monitor a folder for files to transcribe?

gd-three commented 2 weeks ago

I found a bug. When there are multiple audio files in the input directory, Buzz creates multiple duplicate transcription tasks. After Buzz transcribes an audio file, it moves the file to the output directory, so the remaining duplicate tasks cannot find the file when they run, which leads to an error.
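The failure mode described here (the same file queued more than once, then moved away before the later tasks run) can be illustrated with a small sketch. This is not Buzz's implementation; `FolderWatchQueue` and its methods are hypothetical names. It shows one common way a folder watcher avoids the problem: remember which paths are already queued and skip them on later scans.

```python
from pathlib import Path
from typing import List, Set


class FolderWatchQueue:
    """Hypothetical task queue that ignores files it has already queued."""

    def __init__(self) -> None:
        self._seen: Set[Path] = set()
        self.tasks: List[Path] = []

    def scan(self, input_dir: Path) -> int:
        """Queue each file in input_dir once; return how many were added."""
        added = 0
        for path in sorted(input_dir.iterdir()):
            if path.is_file() and path not in self._seen:
                self._seen.add(path)
                self.tasks.append(path)
                added += 1
        return added
```

With this approach, scanning the same folder repeatedly adds each file only on the first scan, so no task is left pointing at a file that an earlier task has already moved.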

gd-three commented 2 weeks ago

please see above 👆

raivisdejus commented 1 week ago

@gd-three I am unable to replicate this on the latest development version, 1.2.0. There was such a bug, but it has been fixed.

Can you please make sure you are using 1.2.0 and provide detailed steps to replicate the problem?

Test 1:

  1. Have separate input and output folders
  2. Have 3 files in the input folder
  3. Enable folder watch in preferences

This worked; the files from the input folder were processed.

Test 2:

  1. Have separate input and output folders
  2. Turn off Buzz
  3. Add files to the input folder
  4. Start Buzz

This also worked; all files from the input folder were processed, with no duplicates and no errors.

gd-three commented 6 days ago

In that case, it's fine. I'm using version 1.0.1, not development version 1.2. Please ignore this problem.