Closed kllgjc closed 1 day ago
Use the model path instead of the model name
I'm stupid and don't actually know how any of this works; I can just make it work. Can you give me one example please? <3
Can you show me how you are using it?
I'm just running this:
import subprocess
from pathlib import Path


def process_files_in_directory():
    # Prompt user for the directory path
    folder = input("Enter the path to the directory containing audio files: ").strip()
    folder_path = Path(folder)
    if not folder_path.is_dir():
        print(f"Error: {folder} is not a valid directory.")
        return

    # Supported audio file extensions
    supported_extensions = [".mp3", ".wav", ".flac", ".m4a", ".mp4"]
    audio_files = [file for file in folder_path.iterdir() if file.suffix.lower() in supported_extensions]
    if not audio_files:
        print("No supported audio files found in the directory.")
        return

    # Create "transcriptions" subfolder
    output_folder = folder_path / "transcriptions"
    output_folder.mkdir(exist_ok=True)

    print(f"Processing {len(audio_files)} audio file(s)...")
    for i, audio_file in enumerate(audio_files, 1):
        print(f"\n[{i}/{len(audio_files)}] Processing file: {audio_file.name}")

        # Construct command for processing each file
        command = [
            "python", "diarize.py",
            "--audio", str(audio_file),
            "--whisper-model", "large-v3",
            "--batch-size", "32",
            "--language", "en",
            "--suppress_numerals",
            "--device", "cuda",
        ]

        # Execute the command
        subprocess.run(command)

        # Move output files (e.g., .txt and .srt) to the "transcriptions" folder
        output_files = list(folder_path.glob(f"{audio_file.stem}.*"))
        moved_files = []
        for output_file in output_files:
            if output_file.suffix in [".txt", ".srt"]:
                destination = output_folder / output_file.name
                output_file.rename(destination)
                moved_files.append(destination)

        # Intermediate update
        print(f"File processed: {audio_file.name}")
        print("Output files saved:")
        for file in moved_files:
            print(f"  - {file}")

    print(f"\nProcessing completed. Transcriptions saved in: {output_folder}")


if __name__ == "__main__":
    process_files_in_directory()
  # Construct command for processing each file
  command = [
      "python", "diarize.py",
      "--audio", str(audio_file),
-     "--whisper-model", "large-v3",
+     "--whisper-model", "local_model_path",
      "--batch-size", "32",
      "--language", "en",
      "--suppress_numerals",
      "--device", "cuda",
  ]
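For anyone wanting a fuller example of the change above, here is a minimal sketch that builds the same command from a local model directory and fails early if that directory does not exist. The function name `build_diarize_command` and the example path are hypothetical; the flags are the ones already used in this thread, and the sketch assumes `diarize.py` accepts a directory path for `--whisper-model`, as suggested above.

```python
import sys
from pathlib import Path


def build_diarize_command(audio_file: Path, model_dir: Path) -> list[str]:
    """Build the diarize.py command line using a local Whisper model directory.

    Hypothetical helper: only the flags come from this thread.
    """
    model_dir = model_dir.expanduser().resolve()
    if not model_dir.is_dir():
        # Fail before launching diarize.py, so a typo in the path is obvious
        raise FileNotFoundError(f"Local model directory not found: {model_dir}")

    return [
        sys.executable, "diarize.py",          # same interpreter as this script
        "--audio", str(audio_file),
        "--whisper-model", str(model_dir),     # path instead of "large-v3"
        "--batch-size", "32",
        "--language", "en",
        "--suppress_numerals",
        "--device", "cuda",
    ]


# Example (the model path below is a placeholder, not a real location):
# command = build_diarize_command(Path("talk.mp3"), Path("~/models/whisper-large-v3"))
# subprocess.run(command, check=True)
```

Using `sys.executable` rather than `"python"` is a small design choice so that the subprocess runs in the same virtual environment as the batch script.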
Thanks, I got it working earlier, but I also had to deal with where the NeMo models and other things were stored (thanks, ChatGPT), so I just hardcoded the locations into the diarize, nemo process, and helpers.py files. I don't know if this is the best way to do it, but that's how I did it! Is this what the .yaml files are for, though? I just typed "local" into the code search and those popped up. Again, I'm dumb but just trying to figure this out!
It worked really well, especially after processing the output with Claude. It would be better with the API so I don't have to keep saying "next", haha, and a small LLM could possibly be integrated to do this locally too! This is the prompt I used:
"I have attached an audio transcription output with diarization generated using OpenAI's Whisper model. I need your help refining the transcription to ensure it is as accurate and professional as possible. These transcriptions are from in-house presentations at my company, [Company Name], [Company Description]. The title of the transcription file corresponds to the presentation's title, so please use that context to infer and maintain relevance while editing.
Review and improve the transcription, focusing on:
It is critical that this work is done to a high standard, as I have taken responsibility for these transcriptions and must deliver polished, accurate results. If I do an excellent job, I could earn a significant bonus, and I'll share part of it with you for your effort!
Only provide the refined transcription output as requested, as I will be copying and pasting it into a Word document in its entirety. Be comprehensive, don't skip out on anything!
Again, only provide the transcription, and nothing else! Use as many tokens as you can in each output, I will reply with "next" and you can pick up where you left off."
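The manual "next" loop could in principle be replaced by splitting the transcript into chunks and sending each one through whatever model is available. The following is only a sketch: `split_transcript` and `refine_transcript` are hypothetical names, `max_chars` is an arbitrary budget, and `refine` stands in for any function that takes a chunk of text and returns the cleaned-up version (an API call or a local LLM); no specific client library is assumed.

```python
from typing import Callable


def split_transcript(text: str, max_chars: int = 8000) -> list[str]:
    """Split a transcript into chunks of whole lines, each at most max_chars
    long (a single line longer than max_chars becomes its own chunk)."""
    chunks, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        if current and size + len(line) > max_chars:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks


def refine_transcript(text: str, refine: Callable[[str], str], max_chars: int = 8000) -> str:
    """Run `refine` over each chunk in order and join the results.

    `refine` is a placeholder for the model call; it should keep the
    trailing newline of each chunk so that lines do not run together.
    """
    return "".join(refine(chunk) for chunk in split_transcript(text, max_chars))
```

Splitting on line boundaries keeps each speaker turn intact, assuming the diarized output puts one turn per line.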
Thanks again for your help and for creating this repo!
I am trying to process multiple files using a script, but unfortunately my internet security software keeps turning on and blocking Hugging Face and other downloads, even though the models are already downloaded. How do I point to these already downloaded models?
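One thing that may also help with blocked downloads is running the script with the Hugging Face offline switches set, so that cached files are used instead of network requests. A minimal sketch follows, with hypothetical helper names `offline_env` and `run_offline`. `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE` are environment variables read by the `huggingface_hub` and `transformers` libraries; whether every download in `diarize.py` (for example the NeMo models) respects them is an assumption and not confirmed in this thread.

```python
import os
import subprocess


def offline_env(base=None):
    """Return a copy of the environment with Hugging Face offline mode enabled."""
    env = dict(os.environ if base is None else base)
    env["HF_HUB_OFFLINE"] = "1"        # huggingface_hub: use the local cache only
    env["TRANSFORMERS_OFFLINE"] = "1"  # transformers: do not try to reach the Hub
    return env


def run_offline(command):
    """Run a command (e.g. the diarize.py call) with offline mode enabled."""
    return subprocess.run(command, env=offline_env(), check=True)
```

This only stops lookups against the Hub; the model still has to be present locally, either in the cache or at a path passed via `--whisper-model`.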