pythonlessons / mltu

Machine Learning Training Utilities (for TensorFlow and PyTorch)
MIT License
182 stars, 108 forks

I want to increase learning_rate and train_workers, is that possible? #35

Open DucBac99 opened 11 months ago

DucBac99 commented 11 months ago

@pythonlessons please help me

pythonlessons commented 11 months ago

Which tutorial? Simply change these values and you're good to go.

DucBac99 commented 11 months ago

[image] I want to change the values in the config file to increase training speed. Is that possible? @pythonlessons

pythonlessons commented 11 months ago

It really depends on your machine. If your GPU is already at 100% utilization, a larger batch size or more workers will not help; in that case you need to change the model architecture, the input size, or something else.
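For illustration, changing these values could look like the sketch below. `TrainingConfig` is a hypothetical stand-in for the tutorial's config class, not the actual mltu API, and the default numbers are made-up assumptions; only the names `learning_rate`, `train_workers`, and batch size come from this thread.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical stand-in for a tutorial config; all defaults are assumed values."""
    learning_rate: float = 1e-3
    batch_size: int = 64
    train_workers: int = 4


# Start from the defaults and override only the values to experiment with.
base = TrainingConfig()
faster = replace(base, learning_rate=2e-3, batch_size=128, train_workers=8)

print(base)
print(faster)
```

Whether the second configuration actually trains faster depends on where the bottleneck is: more workers only help when data loading is the slow part.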

DucBac99 commented 11 months ago

[image] This is my computer configuration. I find training very slow, and it only stops after more than 100 epochs.

pythonlessons commented 11 months ago

So, you are training the model on the CPU. That's why it takes so long. You should train it on a GPU; if you don't have one, try Google Colab and train on a free GPU.

DucBac99 commented 11 months ago

[image] I'm also configuring training to use the GPU, like you. I also tried Google Colab, but I found it too slow. @pythonlessons

pythonlessons commented 11 months ago

That code only configures memory, so that TensorFlow does not reserve all available GPU RAM. You need to make sure your system can see the GPU device (install the drivers, CUDA, and cuDNN).
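A minimal sketch of both steps, assuming TensorFlow 2.x: it lists the GPUs TensorFlow can see and enables on-demand memory growth for each one. The helper name `visible_gpus` is made up for this example, and the import is wrapped so the snippet also runs on a machine without TensorFlow.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        try:
            # Allocate GPU memory on demand instead of reserving all of it up front.
            tf.config.experimental.set_memory_growth(gpu, True)
        except RuntimeError:
            # Memory growth can only be set before the GPU has been initialized.
            pass
    return [gpu.name for gpu in gpus]


names = visible_gpus()
if names is None:
    print("TensorFlow is not installed")
elif not names:
    print("No GPU visible to TensorFlow; training will run on the CPU")
else:
    print("GPUs visible to TensorFlow:", names)
```

If the list is empty even though the operating system shows the GPU, the usual suspects are the driver, CUDA, or cuDNN installation, or a TensorFlow build without GPU support.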

DucBac99 commented 11 months ago

[image] I can see the GPU, but when I run training it still only uses the CPU. @pythonlessons

DucBac99 commented 11 months ago

[image] I noticed that if I train for more than 100 epochs, I get this error. I also checked the val.csv file in the model folder and found only 1726 records, while my dataset has more than 20 thousand images. [image] @pythonlessons please help me

pythonlessons commented 11 months ago

You received this error because you tried to convert the model to ONNX, but your GPU is too old to support it. You need to convert it manually using the CPU version of ONNX. Your 20k-image dataset should be split into training and validation sets; check how much data you have in your train.csv.
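One quick way to check the split is to count the rows in both files. The sketch below uses only the standard library; the helper names and the `has_header` flag are assumptions for this example, so set the flag according to whether your CSV files start with a header row. The demo at the bottom writes two tiny throwaway files just to show the call.

```python
import csv
import os
import tempfile


def count_rows(csv_path, has_header=False):
    """Count non-empty data rows in a CSV file, optionally skipping a header row."""
    with open(csv_path, newline="") as f:
        n = sum(1 for row in csv.reader(f) if row)
    return max(n - 1, 0) if has_header else n


def split_summary(train_csv, val_csv, has_header=False):
    """Report how many rows went to training and validation, and the validation share."""
    train_n = count_rows(train_csv, has_header)
    val_n = count_rows(val_csv, has_header)
    total = train_n + val_n
    return {
        "train": train_n,
        "val": val_n,
        "total": total,
        "val_fraction": val_n / total if total else 0.0,
    }


# Demo with throwaway files; point the paths at your own train.csv and val.csv instead.
with tempfile.TemporaryDirectory() as tmp:
    train_path = os.path.join(tmp, "train.csv")
    val_path = os.path.join(tmp, "val.csv")
    with open(train_path, "w", newline="") as f:
        csv.writer(f).writerows([[f"img_{i}.png", "abc123"] for i in range(9)])
    with open(val_path, "w", newline="") as f:
        csv.writer(f).writerows([["img_9.png", "xyz789"]])
    print(split_summary(train_path, val_path))
```

If `train` plus `val` is far below the number of images on disk, some images were probably skipped before the split, for example because of missing files or labels.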

DucBac99 commented 11 months ago

> You received this error because you tried to convert the model to ONNX, but your GPU is too old to support it. You need to convert it manually using the CPU version of ONNX. Your 20k-image dataset should be split into training and validation sets; check how much data you have in your train.csv.

I checked, and training runs entirely on the CPU, not on the GPU. I ran `print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))` and it returned 0. Is there any way to fix this?

DucBac99 commented 11 months ago

[image] This is the data in my train.csv.

pythonlessons commented 11 months ago

solved?

DucBac99 commented 11 months ago

solved

Oh no, friend. I fixed the error and am now training on the GPU, but during training I am running into an early stopping error. I will train again and send you the specific screenshots. I hope you can help me.

pythonlessons commented 11 months ago

Early stopping is not an error; it's a feature. If your model is not learning enough, increase the early stopping patience, and try experimenting with the learning rate or even the model. Maybe your images are hard to crack.
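To show what patience does, here is a simplified plain-Python sketch of patience-based early stopping. It is not the code of any framework's callback, and the validation values are invented for the example: training stops once the monitored metric has failed to improve for `patience` epochs in a row.

```python
from typing import Optional, Sequence


def stopping_epoch(val_metric: Sequence[float], patience: int) -> Optional[int]:
    """Return the 0-based epoch at which training would stop, or None if it never stops.

    Lower values of the metric are treated as better (e.g. a loss or an error rate).
    """
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, value in enumerate(val_metric):
        if value < best:
            best = value
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Invented validation errors: a short plateau, then further improvement.
history = [0.90, 0.80, 0.81, 0.82, 0.79, 0.70]

print(stopping_epoch(history, patience=2))  # 3: stops during the plateau
print(stopping_epoch(history, patience=5))  # None: survives the plateau and keeps improving
```

With the small patience, training ends before the later improvement is ever reached, which is why raising patience can help when the metric plateaus for a while.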

DucBac99 commented 11 months ago

> Early stopping is not an error; it's a feature. If your model is not learning enough, increase the early stopping patience, and try experimenting with the learning rate or even the model. Maybe your images are hard to crack.

My captcha has 6 characters. Do I need to change the model?

pythonlessons commented 11 months ago

No, you don't.

DucBac99 commented 10 months ago

> No, you don't.

Can you solve image-pulling captchas and image-rotation captchas?