Open Zahrun opened 3 months ago
I suggest following the freedesktop XDG Base Directory specification: https://wiki.archlinux.org/title/XDG_Base_Directory
Hi, we will have to have a look at that; it did not throw an error on our two Linux test machines.
As for the reasoning: the models used to be in the venv, resulting in download sizes of over 10 GB for the GUI installers, which in turn led to user complaints.
With atrain_core (and the upcoming new GUI version that will be based on it) we are introducing downloading and removing models on the user side. In the process, we moved the models to the same folder as the transcriptions, mainly with GUI users in mind, who can then find aTrain downloads and transcriptions in the same place.
But we will have a look at the link you posted and rethink the model location.
Can I manually specify a different directory? On my system, ~/Documents is mapped to a different type of device, so I'm thinking that it might be the source of the issue.
You may be interested in a short read on the FHS: https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard. The file-hierarchy man page has a section on the home directory: https://www.freedesktop.org/software/systemd/man/latest/file-hierarchy.html#Home%20Directory
I think it should be sufficient to change the definition of model_path in get_model() -- line 46 -- and delete_model() -- line 54 -- in load_resources.py, as transcribe.py gets it from there.
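A minimal sketch of what such a change could look like, with the path defined in one place so that both functions share it. The helper names `models_dir()` and `model_path()` and the environment variable `ATRAIN_MODELS_DIR` are hypothetical and not part of atrain_core. The default mirrors the ~/Documents/aTrain/models location that appears in the error message later in this thread.

```python
import os


def models_dir() -> str:
    """Return the directory that holds downloaded models.

    ATRAIN_MODELS_DIR is a hypothetical override used for illustration;
    it is not an existing aTrain setting.
    """
    override = os.environ.get("ATRAIN_MODELS_DIR")
    if override:
        # Expand a leading "~" and make the result absolute.
        return os.path.abspath(os.path.expanduser(override))
    return os.path.join(os.path.expanduser("~"), "Documents", "aTrain", "models")


def model_path(model: str) -> str:
    """Build the path for a single model, e.g. model_path("large-v3")."""
    return os.path.join(models_dir(), model)
```

With something like this in place, a user whose ~/Documents sits on a different device could point the override at any other directory without editing the source.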
@Zahrun can you try installing the branch Frontend_Connection? It should fix both the numpy issue and the model location on Linux, but I don't have a Linux test system at hand right now.
That created a literal ~ directory under the current working directory. In my case it created /home/aroun/UnixSync/Applications/~/.local/share/aTrain/models/large-v3/model.bin instead of expanding ~ to my home directory.
The lock file issue seems to have disappeared. Now I successfully transcribe on CPU.
Can’t diarize though, as the path for that model has also changed: FileNotFoundError: [Errno 2] No such file or directory: '/home/aroun/Documents/aTrain/models/diarize/config.yaml'
For GPU, I had "Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory" although the file exists at "/home/aroun/UnixSync/Applications/atrain_core_venv/lib64/python3.10/site-packages/nvidia/cudnn/lib/libcudnn_ops_infer.so.8".
Solved that with sudo apt install nvidia-cudnn
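As a diagnostic sketch, independent of aTrain, a quick `ctypes` probe can show whether the dynamic loader is able to open a library by name. The helper name `can_load` is hypothetical. A library that lives only inside a venv's site-packages is generally not on the loader's default search path, which would be consistent with the file existing on disk yet failing to load.

```python
import ctypes


def can_load(library_name: str) -> bool:
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        # Raised when the library (or one of its dependencies) cannot be found or opened.
        return False
```

Calling `can_load("libcudnn_ops_infer.so.8")` before and after installing the system package would show whether the loader now finds it.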
Now it does not complain, but nothing gets transcribed:
torchvision is not available - cannot save figures
Running aTrain_core
Preparing transcription
Created directory at /home/aroun/Documents/aTrain/transcriptions/2024-06-20 00-27-11 Feedb
Time taken from start to init model 5.540017127990723
Time taken from start until getting the transcription segments and info via transcription_model.transcribe 15.601551055908203
Transcription segments of faster whisper 0
Total steps without diarization: 0
Progress 0/0
Transcribing with Whisper: 100%|███████████████████▉| 113.8986875/113.9 [00:08<00:00, 13.95 audio seconds/s]
Finishing up
Thank you for using aTrain
If you use aTrain in a scientific publication, please cite our paper:
'Take the aTrain. Introducing an interface for the Accessible Transcription of Interviews'
available under: https://www.sciencedirect.com/science/article/pii/S2214635024000066
I have CUDA 11.5.1 installed, is it too old?
This will resolve to the correct directory:
model_path = os.path.join("~/.local/share/aTrain", "models", model)
model_path = os.path.expanduser(model_path)
https://docs.python.org/3/library/os.path.html#os.path.expanduser
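To illustrate why the earlier attempt produced a literal ~ directory: `os.path.join` treats `~` as an ordinary path component, and only `os.path.expanduser` replaces it with the home directory. A small demonstration, using the model name from the path reported above:

```python
import os

# join() does no tilde expansion; the result still starts with "~".
unexpanded = os.path.join("~/.local/share/aTrain", "models", "large-v3")

# expanduser() substitutes the current user's home directory for the leading "~".
expanded = os.path.expanduser(unexpanded)

# The unexpanded path is relative, so opening it resolves against the
# current working directory and creates a directory literally named "~".
print(os.path.isabs(unexpanded))  # False
print(expanded)
```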
For diarize, the path can be changed at https://github.com/JuergenFleiss/atrain_core/blob/b5269d2c071003f8e06a053a747bf9edece5de2b/aTrain_core/transcribe.py#L28
as config_yml = os.path.join(os.path.expanduser("~"), ".local/share/aTrain", "models", "diarize", "config.yaml")
Why is it even defaulting to "~/Documents/aTrain/"? That folder is meant to store documents, as the name suggests. aTrain data should probably be kept either in the venv installation folder or in ~/.local/share/aTrain
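For reference, the XDG Base Directory specification defines `$XDG_DATA_HOME` for user-specific data files and falls back to `$HOME/.local/share` when the variable is unset or empty. A minimal sketch of resolving an aTrain data directory that way; `xdg_data_dir` is a hypothetical helper and not part of atrain_core:

```python
import os


def xdg_data_dir(app_name: str = "aTrain") -> str:
    """Return the per-user data directory for app_name following XDG rules."""
    base = os.environ.get("XDG_DATA_HOME", "")
    # The specification requires absolute paths; unset, empty, or relative
    # values are ignored in favour of the default.
    if not base or not os.path.isabs(base):
        base = os.path.join(os.path.expanduser("~"), ".local", "share")
    return os.path.join(base, app_name)


# Hypothetical layout: models live in a "models" subdirectory of the data dir.
models_root = os.path.join(xdg_data_dir(), "models")
```

This respects a user's custom `XDG_DATA_HOME` while defaulting to the ~/.local/share/aTrain location suggested above.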