coqui-ai / STT

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
https://coqui.ai
Mozilla Public License 2.0

Feature request: Tensorflow 2.0 compatibility #2319

Closed Baerlie closed 1 year ago

Baerlie commented 1 year ago

Hi,

I'm currently using Coqui STT for my Master's thesis project. I tried to run several of the example scripts in Colab, but Colab does not support TensorFlow 1.x anymore. Installation of the coqui_stt_training library is not possible because of the TensorFlow 1.x dependency.

Do you plan to upgrade the STT sources to work with TensorFlow 2.x? I tried the %tensorflow_version 1.x magic, but this is deprecated as well.

br Beatrice

HarikalarKutusu commented 1 year ago

Hey Beatrice, it does in fact work with the new Colab, just not in the old way. It works, but you need to restart the kernel.

Here is a test notebook I put together after this change came out. Run it step by step; before the last step it shows you a button to restart the kernel. After that, continue from where you left off. Do not initialize any variables you need before that point, as they will be lost (or, if you have to, be prepared to redefine them).

https://colab.research.google.com/drive/1CfZbtNLht4h0ShOJR1qUqucNg893rsOP?usp=share_link
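
For readers landing here later, a minimal sketch of that flow in Colab cells. This is not the exact content of the linked notebook; the install spec and the programmatic restart are assumptions:

```python
# Colab cell 1: install the training package (this is what pulls in the
# TensorFlow 1.15.x pin). The exact install spec is an assumption.
!pip install coqui_stt_training

# Colab cell 2: force a runtime restart so the freshly installed packages are
# picked up (Colab may also offer a "Restart runtime" button after the install).
# Anything defined before this point is lost after the restart.
import os
os.kill(os.getpid(), 9)

# Colab cell 3 (run after the restart): continue from here.
import tensorflow as tf
print(tf.__version__)  # expect 1.15.x if the install succeeded
```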

AFAIK, there is no immediate plan to update the code to TF v2 as it would be a huge undertaking.

Bülent

Baerlie commented 1 year ago

Hey Bülent,

thanks for your answer and for sharing your notebook! It seems they changed a lot: pip on the new Colab no longer finds any TensorFlow version below 2:

ERROR: Could not find a version that satisfies the requirement tensorflow==1.15.4 (from coqui-stt-training) (from versions: 2.2.0, 2.2.1, 2.2.2, 2.2.3, 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4, 2.5.0, 2.5.1, 2.5.2, 2.5.3, 2.6.0rc0, 2.6.0rc1, 2.6.0rc2, 2.6.0, 2.6.1, 2.6.2, 2.6.3, 2.6.4, 2.6.5, 2.7.0rc0, 2.7.0rc1, 2.7.0, 2.7.1, 2.7.2, 2.7.3, 2.7.4, 2.8.0rc0, 2.8.0rc1, 2.8.0, 2.8.1, 2.8.2, 2.8.3, 2.8.4, 2.9.0rc0, 2.9.0rc1, 2.9.0rc2, 2.9.0, 2.9.1, 2.9.2, 2.9.3, 2.10.0rc0, 2.10.0rc1, 2.10.0rc2, 2.10.0rc3, 2.10.0, 2.10.1, 2.11.0rc0, 2.11.0rc1, 2.11.0rc2, 2.11.0)
ERROR: No matching distribution found for tensorflow==1.15.4

This error occurs when installing STT from git and also when trying to install tensorflow manually in a later cell. I'll try to install it from source; maybe that works.

Beatrice

HarikalarKutusu commented 1 year ago

I found the reason: they switched the image to Python 3.8, which only supports TF 2.2+.

I checked, and Python 3.6 is still installed on the image. Can you try creating a virtual env that uses Python 3.6?
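
In case it helps, a rough sketch of what that could look like in Colab cells. The venv path is a placeholder, and virtualenv may need to be installed first:

```python
# Colab cells; assumes python3.6 is still present on the image as noted above.
!pip install virtualenv                         # in case it is not preinstalled
!virtualenv --python=python3.6 /content/venv36  # create the 3.6 environment
!/content/venv36/bin/python --version           # should report 3.6.x
!/content/venv36/bin/pip install "tensorflow==1.15.4"
```

Since each `!` line runs in its own shell, the venv's binaries are called by full path instead of relying on `activate`.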

Baerlie commented 1 year ago

Hey Bülent,

thanks for the hint about the installed Python version! I tried the virtual environment and it works, but then the following error occurs:
ERROR: Package 'coqui-stt-training' requires a different Python: 3.6.9 not in '<3.9,>=3.7'

I'll try installing Python 3.7 when I have time and see what happens. Thanks for helping, though!

br Beatrice

HarikalarKutusu commented 1 year ago

For a quick check you could use an older version of STT. I couldn't find the exact release where Python 3.6 support was dropped, but you could use v1.0.0, for example. The underlying DeepSpeech-derived model hasn't changed, but some parameters have, so use the documentation that matches the version you install.
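
If you try that, something along these lines might work; the tag name and the install-from-git approach are assumptions, so adjust them to the release you pick:

```python
# Install a pinned older release of the training code directly from the repo tag,
# inside the Python 3.6 venv sketched above (adjust the path as needed).
!/content/venv36/bin/pip install "git+https://github.com/coqui-ai/STT.git@v1.0.0"
```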

Baerlie commented 1 year ago

OK, thanks for the information! I can confirm that installing Python 3.7 on Colab and creating a virtual env with 3.7 successfully installs tensorflow 1.15.4 and STT 1.4.0.

Meanwhile, I finished the grid search on my local machine, and it seems that increasing the train batch size leads to a higher WER and CER, so I will go with a lower batch size for my audio data samples.

I can train with up to batch size 16 on my graphics card. I wanted to include batch size 32 as a comparison in my thesis as well, but I can do that later if I have time :)
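
For anyone reproducing this kind of comparison, a minimal sketch of how such a batch-size sweep could be driven with the coqui_stt_training entry point. All paths are placeholders, and you would normally also pass an alphabet/scorer and the rest of your usual flags:

```python
# Run one training per candidate batch size with its own checkpoint directory,
# so the WER/CER reported by the test epoch can be compared afterwards.
for bs in (2, 4, 8, 16):
    cmd = (
        "python -m coqui_stt_training.train "
        "--train_files /content/data/train.csv "
        "--dev_files /content/data/dev.csv "
        "--test_files /content/data/test.csv "
        f"--train_batch_size {bs} --dev_batch_size {bs} --test_batch_size {bs} "
        f"--epochs 30 --checkpoint_dir /content/ckpt_bs{bs}"
    )
    !{cmd}
```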

HarikalarKutusu commented 1 year ago

It would be very nice if you could share the relevant cells here for future reference.

it seems that increasing the train batch size leads to a higher WER and CER

That might be data dependent. I did a similar test last year; the results were inconclusive and erratic, but I'll share them here anyway:

(image: results table from the batch-size test, showing the best epoch reached for each training batch size)

As you can see, with the training batch size set to 32 (and 16), the best epoch was reached too early. I did not check the loss graphs at that time, but maybe you should for the thesis, to pinpoint possible overfitting etc.

Baerlie commented 1 year ago

Sure, here you go: https://colab.research.google.com/drive/1mLXfqVXIQLbgyfa2pXzVay0fWoh9Geod?usp=sharing I'm not a heavy Colab user; it seems one has to activate the virtual env in every cell, I think.

Thanks for sharing your results! I don't have mine ready yet (but I'm happy to share the thesis once it's finished).

HarikalarKutusu commented 1 year ago

Thank you for sharing the solution Beatrice.

AFAIK, in Colab (and probably in all IPython/notebook implementations), each cell starts a new shell, so you need to re-activate. I found that defining functions and calling them in succession from a single cell makes it easier. It kind of defeats the purpose of using a notebook and results in many linting underlines, but it works.
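
To make the per-cell behaviour concrete, two ways of dealing with it, reusing the hypothetical venv path from the earlier sketch:

```python
# Each "!" line (and each cell) runs in its own shell, so activating the venv
# in one place does not carry over to the next command.

# Option 1: chain activation and the command in a single shell invocation
# (POSIX "." instead of "source", since the shell may be /bin/sh):
!. /content/venv36/bin/activate && python -c "import sys; print(sys.version)"

# Option 2: skip activation entirely and call the venv's interpreter by path:
!/content/venv36/bin/python -c "import sys; print(sys.version)"
```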

Good luck with your thesis :)

fquirin commented 1 year ago

AFAIK, there is no immediate plan to update the code to TF v2 as it would be a huge undertaking

I hate it when that happens :-/ Since almost all frameworks released in recent years are basically beta versions that constantly introduce breaking changes and drop backwards compatibility, this has become an everyday nightmare for programmers 😞. The question is how long you will be able to live with TF < 2 :-|

Maybe the tf.compat module can help?
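
For what it's worth, tf.compat.v1 does let TF1-style graph code run on a TF 2.x install, roughly as sketched below; whether it covers everything the STT training code relies on is exactly the open question:

```python
# Minimal TF1-style graph executed on a TensorFlow 2.x installation via the
# compatibility module. This only demonstrates the mechanism, not that the
# whole STT training pipeline would work this way.
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()

x = tf.placeholder(tf.float32, shape=[None, 3])
y = tf.reduce_sum(x, axis=1)

with tf.Session() as sess:
    print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0]]}))
```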

I'm already facing a situation where I need to use two libraries in the same program (one being Coqui) and one requires TF 2 :-(

wasertech commented 1 year ago

You can totally use TFv2 for inference already.
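
For example, the `stt` inference package wraps its own native runtime rather than importing TensorFlow, so it can live next to a TF 2.x install. A rough sketch with placeholder paths:

```python
# Inference with the "stt" package; model and audio paths are placeholders,
# and the WAV is assumed to be 16 kHz mono PCM as the released models expect.
import wave
import numpy as np
from stt import Model

model = Model("model.tflite")

with wave.open("audio_16k_mono.wav", "rb") as wav:
    audio = np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)

print(model.stt(audio))
```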

Training is another beast in itself.

The question is how long will you be able to live with TF < 2 :-|.

As long as we can't use TFv2 for training. We have some very specific requirements for training models that TFv2 doesn't meet yet.

I'm already facing a situation where I need to use two libraries in the same program

You shouldn't mix dependencies like that. Training should be performed inside its own dedicated environment, meaning you should have one notebook for training with STT and create other notebooks for your other needs.

@HarikalarKutusu can tell you that notebooks are not made for you to train your models. They are good tools to learn and play with code but not to seriously produce models at scale.

If you follow the docs, we actually recommend using our Docker image to train your models, as it's the easiest way to train and comes out of the box with everything you need to fully train your models.

We suggest you use our Docker image as a base for training.

I'll move this ticket to a discussion, as there is really not much we can do about it right now. We have made some progress towards it, but there is still a long way to go before we can fully use TF2 as the base for training.