MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License

Which Python version does it work best in? #182

Open gprithvi369 opened 5 months ago

gprithvi369 commented 5 months ago

Which Python version has been used in this project? Please help.

MahmoudAshraf97 commented 5 months ago

I'm using 3.10.12

sbultmann commented 2 months ago

I am using 3.10.12 but always get this error message:

--- Logging error ---
Traceback (most recent call last):
  File "/home/bultmanns/anaconda3/envs/whisper-dia/lib/python3.10/logging/__init__.py", line 1100, in emit
    msg = self.format(record)
  File "/home/bultmanns/anaconda3/envs/whisper-dia/lib/python3.10/logging/__init__.py", line 943, in format
    return fmt.format(record)
  File "/home/bultmanns/anaconda3/envs/whisper-dia/lib/python3.10/logging/__init__.py", line 678, in format
    record.message = record.getMessage()
  File "/home/bultmanns/anaconda3/envs/whisper-dia/lib/python3.10/logging/__init__.py", line 368, in getMessage
    msg = msg % self.args
TypeError: not all arguments converted during string formatting
Call stack:
  File "/home/bultmanns/Documents/whisper-diarization/diarize.py", line 201, in <module>
    labled_words = punct_model.predict(words_list, chunk_size=230)
  File "/home/bultmanns/anaconda3/envs/whisper-dia/lib/python3.10/site-packages/deepmultilingualpunctuation/punctuationmodel.py", line 47, in predict
    result = self.pipe(text)
  File "/home/bultmanns/anaconda3/envs/whisper-dia/lib/python3.10/site-packages/transformers/pipelines/token_classification.py", line 248, in __call__
    return super().__call__(inputs, **kwargs)
  File "/home/bultmanns/anaconda3/envs/whisper-dia/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1167, in __call__
    logger.warning_once(
  File "/home/bultmanns/anaconda3/envs/whisper-dia/lib/python3.10/site-packages/transformers/utils/logging.py", line 329, in warning_once
    self.warning(*args, **kwargs)
Message: 'You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset'
Arguments: (<class 'UserWarning'>,)

MahmoudAshraf97 commented 2 months ago

NeMo doesn't support Python 3.12 so far, but this error seems related to HF transformers. I guess it will work if you comment out the punctuation part.
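
For reference, the "not all arguments converted during string formatting" failure in the traceback can be reproduced with the standard library alone. This is a minimal sketch, assuming the cause is what the `Arguments: (<class 'UserWarning'>,)` line suggests: a message with no `%` placeholders is logged together with one extra positional argument. The helper name `format_log_message` and the record fields `demo` / `demo.py` are hypothetical and used only for illustration.

```python
import logging


def format_log_message(msg, args):
    """Build a LogRecord by hand and format it the way logging does internally."""
    record = logging.LogRecord(
        name="demo",            # hypothetical logger name
        level=logging.WARNING,
        pathname="demo.py",     # hypothetical source file
        lineno=0,
        msg=msg,
        args=args,
        exc_info=None,
    )
    # getMessage() evaluates `msg % args` whenever args is non-empty.
    return record.getMessage()


text = "You seem to be using the pipelines sequentially on GPU."

try:
    # No placeholder in `text`, but one argument is supplied.
    format_log_message(text, (UserWarning,))
    error = None
except TypeError as exc:
    error = str(exc)

print(error)
```

With an empty argument tuple the same message formats without error, which is consistent with the problem being a mismatch between the installed library versions rather than the Python version itself.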

sbultmann commented 2 months ago

I see. I am using Python 3.10 though, not 3.12.

MahmoudAshraf97 commented 2 months ago

Then make sure that you are using compatible versions of transformers and logging

sbultmann commented 2 months ago

It was indeed the transformers version. With transformers==4.38.2 everything works! Thanks for the help!
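
A quick way to check which transformers version is installed before running the script is sketched below. It assumes only that 4.38.2 is the version reported to work in this thread; the helper names `installed_version` and `matches_pin` are hypothetical, and other versions may work as well.

```python
from importlib.metadata import PackageNotFoundError, version

KNOWN_GOOD = "4.38.2"  # version reported to work in this thread


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pin(installed, pinned=KNOWN_GOOD):
    """True only when a version is installed and equals the pinned one."""
    return installed is not None and installed == pinned


current = installed_version("transformers")
if current is None:
    print("transformers is not installed")
elif matches_pin(current):
    print(f"transformers {current} matches the version reported to work")
else:
    print(f"transformers {current} found; try: pip install transformers=={KNOWN_GOOD}")
```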