cvqluu / simple_diarizer

Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code
GNU General Public License v3.0

Latest Python and packages #7

Closed: andrewmackie closed this issue 2 years ago

andrewmackie commented 2 years ago

The current release prevents use of Python 3.10 and requires specific versions of Beautiful Soup and PyTube.

I've forked the repo to overcome these version limitations and it's working for me. I haven't made a pull request, however, as your repo doesn't have tests and I don't know whether there is a use case which would be broken by my changes.

Can you please remove these version limitations if they're not needed?

Thanks for the repo - it's effective and much easier to use than SpeechBrain.

cvqluu commented 2 years ago

Hi, sorry for the delayed response. Feel free to make the pull request. If the main functionality is working (as covered by the Colab notebook), I am happy to merge those changes.

It's a good point that I should add some tests however!

andrewmackie commented 2 years ago

Thanks and sorry for my delayed response.

I have created a new Google Colab notebook to test it in Python 3.10.

It works until calling diar.diarize_youtube(): in utils.parse_ttml(), BeautifulSoup attempts to load the lxml parser and fails (even though the notebook installs lxml with pip and imports it in the Python code). I don't have any more time to chase this down for you, sorry, because it took me a few hours to get Python 3.10 working in Google Colab (those tests would make contributing easier!).
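For anyone who wants to pick this up, here is a minimal sketch of one way to narrow it down. It is not the repo's code, and `pick_bs4_parser` is a hypothetical helper name. It checks whether lxml is importable by the interpreter that is actually running (which may differ from the one pip installed into, a plausible but unconfirmed cause here) and otherwise falls back to the standard-library parser.

```python
import importlib.util
import sys


def pick_bs4_parser():
    """Return a parser name to pass to BeautifulSoup (hypothetical helper)."""
    # find_spec returns None when the running interpreter cannot locate the
    # top-level module, regardless of what pip reported during installation.
    if importlib.util.find_spec("lxml") is not None:
        return "lxml"
    # "html.parser" is backed by the standard library, so it is always present.
    return "html.parser"


parser_name = pick_bs4_parser()
# Printing the interpreter path helps spot a pip/interpreter mismatch.
print("interpreter:", sys.executable)
print("parser:", parser_name)
```

The result could then be passed as the second argument, e.g. `BeautifulSoup(markup, parser_name)`. Note that html.parser is an HTML parser and may not treat TTML markup identically to lxml, so this is a diagnostic aid rather than a fix.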

I have created PR #8 for you to merge (or not) as you see fit.

cvqluu commented 2 years ago

It's okay. I plan to deprecate the diarize_youtube functionality, so it is not an issue.

I think this repo will be more useful to people if the general functionality works with less strict Python requirements anyway. Thanks!