gitmylo / audio-webui

A webui for different audio related Neural Networks
MIT License
964 stars, 90 forks

[BUG REPORT] Whisper and the app is saving in the general #232

Open AcTePuKc opened 3 weeks ago

AcTePuKc commented 3 weeks ago

Describe the bug

When using the batch processing feature in the Whisper tab of audio-webui, the application processes the files but does not move the results out of the temporary folder. The batch output files remain in the temporary directories (e.g., F:\Voice\audio-webui\data\temp and its randomly named subfolders) instead of being moved to the designated output location in the app's subfolders.

To Reproduce

Steps to reproduce the behavior:

  1. Go to the Text to Speech tab, select Bark, and generate any text.
  2. Note that this generates files in %LOCALAPPDATA%\Temp.
  3. Go to the Whisper tab in the audio-webui.
  4. Use the 'Batch input' feature to upload multiple .wav files (e.g., 38 files).
  5. Start the batch processing.
  6. Observe that the files are processed and output files are generated but remain in the temporary folders instead of being moved to the designated output location.

Expected behavior

After processing, the output files should be moved from the temporary folders to the specified output directory within the app's subfolders, so that results are easy to find and the temporary directories do not fill up with clutter.
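The expected behavior could be sketched roughly as follows. This is only an illustration, not audio-webui's actual code: the function name `move_outputs`, the `SUBFOLDER_BY_SUFFIX` mapping, and the per-type subfolder layout are all assumptions made for the example.

```python
import shutil
from pathlib import Path

# Hypothetical mapping from file extension to an output subfolder name.
SUBFOLDER_BY_SUFFIX = {".wav": "wav", ".png": "png", ".mp4": "mp4", ".txt": "txt"}


def move_outputs(temp_dir, output_root):
    """Move recognised files found under temp_dir into output_root/<type>/.

    Returns the list of destination paths so the caller can report them.
    """
    temp_dir, output_root = Path(temp_dir), Path(output_root)
    moved = []
    # Build the full listing first, because files are moved while looping.
    for src in sorted(p for p in temp_dir.rglob("*") if p.is_file()):
        subfolder = SUBFOLDER_BY_SUFFIX.get(src.suffix.lower())
        if subfolder is None:
            continue  # leave unrecognised files where they are
        dest_dir = output_root / subfolder
        dest_dir.mkdir(parents=True, exist_ok=True)
        dest = dest_dir / src.name
        # Avoid overwriting an existing file that has the same name.
        counter = 1
        while dest.exists():
            dest = dest_dir / f"{src.stem}_{counter}{src.suffix}"
            counter += 1
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```

Calling `move_outputs(r"F:\Voice\audio-webui\data\temp", r"F:\Voice\audio-webui\data\outputs")` after a batch run would, under these assumptions, leave the .wav results in `data\outputs\wav` and return their new locations.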

Screenshots

[image]

Additional context

Directory Structure Example:

F:\Voice\audio-webui\data\temp\
├── 1f6idh9fl.txt
├── 2y976ccx9.txt
├── 3fu2c7ign.txt
...
├── tmp00em15ar
├── tmp07ivbjuh
├── tmp0ge478ud
├── tmp1l5t0gmj
...
+---0351ac6be3bc7041cfef87b319d52126ab5860b8
|       6.wav
+---04695579ce2d741c2c1d1c3fcb94bec74c6dbcc0
|       37.wav
...

The generated text files are stored correctly in the app's subfolders, but the associated image, wav, and mp4 files are saved in the user's temporary directory (%LOCALAPPDATA%\Temp). This discrepancy may be due to path handling in the batch processing code.

Suggestion: print the paths where those files are saved. This might clutter the console, but it would make it clear where the files end up.

import os

# Function to detect the main app folder dynamically
def get_main_app_folder():
    return os.path.dirname(os.path.abspath(__file__))

# Define base paths using the detected main app folder
main_app_folder = get_main_app_folder()
base_input_path = os.path.join(main_app_folder, "data", "inputs")
base_output_path = os.path.join(main_app_folder, "data", "outputs")
base_output_path_png = os.path.join(main_app_folder, "data", "outputs", "png")
base_output_path_wav = os.path.join(main_app_folder, "data", "outputs", "wav")
base_output_path_mp4 = os.path.join(main_app_folder, "data", "outputs", "mp4")
base_output_path_txt = os.path.join(main_app_folder, "data", "outputs", "txt")

# Ensure the directories exist
os.makedirs(base_input_path, exist_ok=True)
os.makedirs(base_output_path, exist_ok=True)
os.makedirs(base_output_path_png, exist_ok=True)
os.makedirs(base_output_path_wav, exist_ok=True)
os.makedirs(base_output_path_mp4, exist_ok=True)
os.makedirs(base_output_path_txt, exist_ok=True)

# Rest of the processing code goes here
# ...

print(f"Base input path: {base_input_path}")
print(f"Base output path: {base_output_path}")
print(f"Base output path for PNG: {base_output_path_png}")
print(f"Base output path for WAV: {base_output_path_wav}")
print(f"Base output path for MP4: {base_output_path_mp4}")
print(f"Base output path for TXT: {base_output_path_txt}")
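If console clutter is a concern, one option is to report the same paths through Python's standard `logging` module at DEBUG level, so they only appear when the user asks for them. The sketch below is an assumption-laden illustration: the logger name `audio_webui.paths` and the helper `describe_output_paths` are hypothetical and not part of audio-webui.

```python
import logging
import os

logger = logging.getLogger("audio_webui.paths")  # hypothetical logger name


def describe_output_paths(main_app_folder, kinds=("png", "wav", "mp4", "txt")):
    """Create the per-type output folders and return them keyed by type."""
    base = os.path.join(main_app_folder, "data", "outputs")
    paths = {kind: os.path.join(base, kind) for kind in kinds}
    for kind, path in paths.items():
        os.makedirs(path, exist_ok=True)
        # DEBUG keeps the console quiet unless the user opts in.
        logger.debug("Output path for %s: %s", kind.upper(), path)
    return paths
```

Running `logging.basicConfig(level=logging.DEBUG)` before calling `describe_output_paths(...)` would show the paths; at the default level they stay hidden.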