oseiskar / autosubsync

Automatically synchronize subtitles with audio using machine learning
MIT License

Can this work in a Windows 10 Python conda environment? #15

Closed. mdkberry closed this issue 1 year ago

mdkberry commented 1 year ago

I have tried adapting train_and_test.sh to a .py script but still get the numpy error. Maybe Windows can't run it.

import subprocess

# Execute the commands using the appropriate Python executable
python_executable = "python"
build_training_data_script = "training/build_training_data.py"
train_script = "training/train.py"
cross_validate_script = "training/cross_validate.py"

subprocess.run([python_executable, build_training_data_script], check=True)
subprocess.run([python_executable, train_script, "--compute_features"], check=True)
subprocess.run([python_executable, cross_validate_script], check=True)

the error I get:

(whisper) C:\Users\admin\Documents\Python\Whisper>train_and_test.py
Traceback (most recent call last):
  File "C:\Users\admin\Documents\Python\Whisper\training\build_training_data.py", line 11, in <module>
    from autosubsync.preprocessing import extract_sound
  File "C:\Users\admin\Documents\Python\Whisper\autosubsync\__init__.py", line 3, in <module>
    from .trained_logistic_regression import TrainedLogisticRegression
  File "C:\Users\admin\Documents\Python\Whisper\autosubsync\trained_logistic_regression.py", line 1, in <module>
    import numpy as np
ModuleNotFoundError: No module named 'numpy'
Traceback (most recent call last):
  File "C:\Users\admin\Documents\Python\Whisper\train_and_test.py", line 9, in <module>
    subprocess.run([python_executable, build_training_data_script], check=True)
  File "C:\Python311\Lib\subprocess.py", line 569, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['python', 'training/build_training_data.py']' returned non-zero exit status 1.

numpy is version 1.24.3 and Python is 3.9.16.
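
A note on the traceback above, offered as a likely explanation rather than a confirmed diagnosis: the outer frame comes from C:\Python311\Lib\subprocess.py, so the script appears to have been launched by a system-wide Python 3.11 (for example through the .py file association when typing train_and_test.py directly) rather than by the conda environment's Python 3.9.16 where numpy is installed. A minimal sketch of the same script that passes sys.executable to the child processes, so they run under whichever interpreter started the script, could look like this. The helper names build_commands and run_pipeline are made up for illustration and are not part of autosubsync.

```python
import subprocess
import sys

# Scripts from the original post, in the same order.
STEPS = [
    ["training/build_training_data.py"],
    ["training/train.py", "--compute_features"],
    ["training/cross_validate.py"],
]


def build_commands(python_executable=sys.executable):
    """Return one command line per step, prefixed with the given interpreter.

    Defaulting to sys.executable (the interpreter running this script) avoids
    relying on whichever "python" Windows happens to resolve first.
    """
    return [[python_executable, *step] for step in STEPS]


def run_pipeline():
    """Hypothetical helper: run every step and stop at the first failure."""
    print("Running with interpreter:", sys.executable)
    for command in build_commands():
        subprocess.run(command, check=True)
```

With the conda environment active, adding a final run_pipeline() call and starting the file from the repository root with python train_and_test.py should keep all three steps inside that environment. The printed interpreter path is a quick way to check which Python is actually in use.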

oseiskar commented 1 year ago

Hi. I don't see any reason why the software could not be adapted to work in that environment since FFmpeg and Python can both be configured to work on Windows. However, I unfortunately do not have the capacity to support this in the near future so you are practically on your own here.

mdkberry commented 1 year ago

I've managed to get it almost working. I think the issue I now have is more generic, and I will open a separate ticket for that. But for anyone coming to this wanting to run it on Windows in an Anaconda environment: the above Python script works fine inside a conda environment.

Running python train_and_test.py worked this morning (I might have run it incorrectly late last night).

I then needed to clean up the folder paths in the 'sources.csv' files, because it didn't like some characters.

I needed to install all the items listed in requirements.txt, and it needed "scikit-learn", not "sklearn".

It now starts the training process (but errors on the cross-validation part, which I will log separately).

oseiskar commented 1 year ago

> I needed to install all the items listed in requirements.txt, and it needed "scikit-learn", not "sklearn".

Thank you for reporting this. It was already fixed in setup.py in https://github.com/oseiskar/autosubsync/commit/94f60b98cd231ee1a4362cf133d344d9eddefaf4, but not in requirements.txt. It is now fixed in the latest commit on the master branch.
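
For readers hitting the same requirements issue: the package is installed under the distribution name "scikit-learn" but imported as "sklearn", which is why the two names are easy to mix up. The sketch below uses only the standard library (Python 3.8+) to report both facts for the active environment. The function name check_package is made up for illustration.

```python
from importlib import metadata, util


def check_package(distribution, module):
    """Return (installed_version_or_None, module_is_importable).

    distribution is the name used with pip/conda (e.g. "scikit-learn");
    module is the name used in import statements (e.g. "sklearn").
    """
    try:
        version = metadata.version(distribution)
    except metadata.PackageNotFoundError:
        version = None
    importable = util.find_spec(module) is not None
    return version, importable


if __name__ == "__main__":
    version, importable = check_package("scikit-learn", "sklearn")
    print("scikit-learn version:", version)
    print("sklearn importable:", importable)
```

If the version prints as None, installing with pip install scikit-learn (or the conda equivalent) inside the active environment is the usual fix.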