bklynhlth / openwillis

Python library for digital measurement of health

Dependency conflicts and errors #110

Closed. reemTamimi closed this 3 months ago

reemTamimi commented 3 months ago

I would like to use WhisperX speech transcription for speaker recognition (our audio files are interviews, and I would like to separate the interviewer from the interviewee) and then build other features on top of it. However, I am running into dependency conflicts, and I cannot run even a basic command similar to the one in the GitHub repository.

After following all of the steps on your getting started page, I installed WhisperX from its linked repository. WhisperX appears to install successfully, but it pulls in pyannote.audio-3.1.1, whereas openwillis requires pyannote.audio-3.0.0, which ultimately gives me an error.
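To confirm which pyannote.audio version ended up in the environment, a small check along the following lines may help. It is only a sketch: the helper names `installed_version` and `matches_pin` are made up for illustration, and the 3.0.0 pin is taken from the report above rather than from the OpenWillis requirements file.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed, pinned):
    """True only when a version is installed and equals the pinned one exactly."""
    return installed is not None and installed == pinned


if __name__ == "__main__":
    # 3.0.0 is the pin reported in this thread (an assumption, not verified here).
    found = installed_version("pyannote.audio")
    print("pyannote.audio:", found, "| matches 3.0.0:", matches_pin(found, "3.0.0"))
```

If the versions differ, the mismatch is at least confirmed, although it may not be what causes the runtime error.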

Also, when I try to run my command, I get a compute type error. WhisperX on its own lets you set the compute type so that it can be changed, but I don't see that option in OpenWillis.

GeorgeEfstathiadis commented 3 months ago

Hi Reem, could you provide some more details on this issue so that I can reproduce it?

  1. What kind of device/os are you running this on?
  2. Can you provide a screenshot of the exact code that's causing this error plus the error message itself?

I think that will help us solve this issue for you faster! The dependency conflict shouldn't be the root of this error here, so it's probably something else.

reemTamimi commented 3 months ago

I am running this on an iMac desktop running macOS Sonoma 14.0. I have attached the starter code below. The harvard.wav file is taken from an open Kaggle dataset.

code

Below is the error I receive upon running the code

code_error

Below is the error I got when I tried installing WhisperX, based on their documentation, after I installed openwillis in a conda environment

dependency_error

I want to mention that it works if I use the Vosk method; however, Vosk unfortunately has limited options.

GeorgeEfstathiadis commented 3 months ago

The issue seems to be with your device's memory, since it is running on CPU. We are now adding the option to set compute_type and batch_size for WhisperX transcription, so you can change either of them to run it on your device. Reducing either parameter should fix the issue, at the cost of performance (for compute_type) or computation speed (for batch_size).
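Until that option is available, the effect of the two parameters can be tried by calling WhisperX directly. The sketch below is an illustration, not the OpenWillis interface: `choose_settings` and `transcribe` are hypothetical helpers, the specific values (int8, batch sizes of 8 and 2) are assumptions chosen for a memory-constrained CPU, and the "base" model is just a small default.

```python
def choose_settings(device, low_memory=False):
    """Pick illustrative compute_type / batch_size values for a device."""
    if device == "cuda":
        compute_type, batch_size = "float16", 16
    else:
        # Assumption: CPU backends generally lack efficient float16 support,
        # so a lower-precision type such as int8 is a safer starting point.
        compute_type, batch_size = "int8", 8
    if low_memory:
        # Smaller batches use less memory but transcribe more slowly.
        batch_size = max(1, batch_size // 4)
    return {"compute_type": compute_type, "batch_size": batch_size}


def transcribe(audio_path, device="cpu", low_memory=True, model_name="base"):
    """Transcribe with WhisperX directly (not through OpenWillis)."""
    import whisperx  # imported lazily so choose_settings works without it

    settings = choose_settings(device, low_memory)
    model = whisperx.load_model(
        model_name, device, compute_type=settings["compute_type"]
    )
    audio = whisperx.load_audio(audio_path)
    return model.transcribe(audio, batch_size=settings["batch_size"])
```

For example, `transcribe("harvard.wav")` would run on CPU with int8 and a batch size of 2. If memory is still exhausted, lowering the batch size further or choosing a smaller model are the next things to try.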