pythonlessons / mltu

Machine Learning Training Utilities (for TensorFlow and PyTorch)
MIT License

Problem with fit function #12

Closed lemonwilliam closed 1 year ago

lemonwilliam commented 1 year ago

Hi, thanks a lot for helping me, I'm really struggling with this homework.

I'm a CS student with fairly mediocre coding skills, and I'm new to deep learning. For this Captcha recognition project I asked for help from some classmates who watched your tutorial and succeeded with your method, but none of them has run into this issue.

I'm running my code on Google Colab, and here are some details of my implementation:

training code: https://colab.research.google.com/drive/1scQlm4hHoxGjS74537kAcELKGrGdLKxA?usp=sharing
mltu folder: https://drive.google.com/drive/folders/1V1ozlK1CmoYH8vSHuZaIeaHkII3NeioQ?usp=sharing
dataset: https://drive.google.com/drive/folders/1o49WI0O4x1HIU54eFuhovvS1g0aK5UTo?usp=sharing

  1. I uploaded the dataset provided by my professor and the mltu-1.0.8 folder to my Google Drive.

  2. I used these two lines of code in my Colab notebook to access my Google Drive:

from google.colab import drive
drive.mount('/content/drive/', force_remount=True)

  3. My classmates were using a Linux environment with Python 3.9.16, and after some trial and error they found that some additional libraries had to be installed, hence the two cells in my training.ipynb file:

!pip install PyYAML>=6.0
!pip install tqdm
!pip install pandas
!pip install numpy
!pip install opencv-python
!pip install onnxruntime
!pip install librosa==0.9.2
!pip install matplotlib
!pip install onnx==1.12.0
!pip install tensorflow==2.10
!pip install tf2onnx

!apt-get install python3.9
!ln -sf /usr/bin/python3.9 /usr/local/bin/python
!python --version

  4. The rest are minor changes to the file paths (pointing to my Google Drive folder) and config parameters (self.vocab, self.height, self.width, etc.).

  5. All cells run without any errors until the last cell:

model.fit(
    train_data_provider,
    validation_data=val_data_provider,
    epochs=configs.train_epochs,
    callbacks=[earlystopper, checkpoint, trainLogger, reduceLROnPlat, tb_callback, model2onnx],
    workers=configs.train_workers
)

where the error popped up:

ValueError: Failed to find data adapter that can handle input: <class 'drive.MyDrive.mltu.dataProvider.DataProvider'>, <class 'NoneType'>
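The class path in the error, drive.MyDrive.mltu.dataProvider.DataProvider, shows that the provider was imported from the copy of mltu on Google Drive. Keras raises this error when the object passed to fit is not one of the input types it recognizes, so one way to investigate is to print where the object's class comes from and what it inherits from. The helper below is a hypothetical sketch using only the standard library; describe_object is not part of mltu or Keras, and which base class the provider needs in a given mltu version is an assumption to check against the output.

```python
import inspect


def describe_object(obj):
    """Return the defining module, class name and base-class names of obj."""
    cls = type(obj)
    return {
        "module": cls.__module__,
        "class": cls.__qualname__,
        # Skip the first MRO entry, which is the class itself.
        "bases": [
            f"{base.__module__}.{base.__qualname__}"
            for base in inspect.getmro(cls)[1:]
        ],
    }


# Stand-in object; in the notebook, pass train_data_provider instead.
print(describe_object([]))
```

If the list of bases contains nothing from Keras or TensorFlow, that would be consistent with the error message, though it does not by itself prove the cause.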

I'm guessing the problem comes from environment issues (e.g., I didn't properly change the Python version, or the versions of Python, TensorFlow and Keras are not compatible), but I'm really not sure (sorry for my lack of skills).
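A command such as !python --version runs in a separate shell process, so it reports whichever python the shell finds, while the notebook kernel keeps running the interpreter it was started with. A minimal sketch for checking the versions that a notebook cell itself sees, using only the standard library; the package list is an assumption and installed_versions is a hypothetical helper name.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Package names here are assumptions based on the install cells above.
for name, version in installed_versions(["tensorflow", "keras", "mltu"]).items():
    print(f"{name}: {version}")
```

Running this in the same notebook as the training code shows whether the interpreter and packages actually in use match the ones that were installed.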

If you need any other details of my implementation to find out what caused the error, please let me know. I really appreciate the help since the deadline of this homework is near.

dtanhphuong1189 commented 1 year ago

I have the same issue. Have you fixed it? If so, please share the fix with me.

dtanhphuong1189 commented 1 year ago

I attached my Google Colab script: https://drive.google.com/file/d/1UGk49m0qeAb8XMEFCeJq6n8hOEvfg_da/view?usp=drive_link
My script is for speech-to-text. I use the data from the tutorials.

At 03:30 PM, Wed, 7 Jun 2023, Rokas @.***> wrote:

Can you give me the code showing how you preprocess your images and annotations, so I can test it myself?


pythonlessons commented 1 year ago

OK, I tested it with mltu==1.0.10. You don't need to clone the repository; you can run pip install mltu==1.0.10 (I'll release version 1.0.11 now with a small fix).

The annotations had a few rows pointing to files that don't exist, and in configs you were using set(...), which you shouldn't be using.

This code worked for me:

import tensorflow as tf

# Enable GPU memory growth if a GPU is available; ignore failures
try:
    for gpu in tf.config.experimental.list_physical_devices("GPU"):
        tf.config.experimental.set_memory_growth(gpu, True)
except Exception:
    pass

from keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau, TensorBoard

from mltu.tensorflow.dataProvider import DataProvider
from mltu.tensorflow.losses import CTCloss
from mltu.tensorflow.callbacks import Model2onnx, TrainLogger
from mltu.tensorflow.metrics import CWERMetric

from mltu.preprocessors import ImageReader
from mltu.transformers import ImageResizer, LabelIndexer, LabelPadding
from mltu.augmentors import RandomBrightness, RandomRotate, RandomErodeDilate
from mltu.annotations.images import CVImage

from model import train_model
from configs import ModelConfigs

import os
import pandas as pd

# read_csv already returns a DataFrame
df_train = pd.read_csv("Datasets/train/annotations.csv")

X_train = df_train["filename"].to_numpy()
y_train = df_train["label"].to_numpy()

dataset = []
for i in range(len(df_train)):
    image_path = "Datasets/train/" + X_train[i]
    if not os.path.exists(image_path):
        print("File not found: " + image_path)
        continue
    label = y_train[i]
    dataset.append([image_path, label])

configs = ModelConfigs()

# Create a data provider for the dataset
data_provider = DataProvider(
    dataset=dataset,
    skip_validation=True,
    batch_size=configs.batch_size,
    data_preprocessors=[ImageReader(CVImage)],
    transformers=[
        ImageResizer(configs.width, configs.height),
        LabelIndexer(configs.vocab),
        LabelPadding(max_word_length=configs.max_text_length, padding_value=len(configs.vocab))
        ],
)

# Split the dataset into training and validation sets
train_data_provider, val_data_provider = data_provider.split(split = 0.9)

# Augment training data with random brightness, rotation and erode/dilate
train_data_provider.augmentors = [RandomBrightness(), RandomRotate(), RandomErodeDilate()]

# Creating TensorFlow model architecture
model = train_model(
    input_dim = (configs.height, configs.width, 3),
    output_dim = len(configs.vocab),
)

# Compile the model and print summary
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=configs.learning_rate), 
    loss=CTCloss(), 
    metrics=[CWERMetric(padding_token=len(configs.vocab))],
    run_eagerly=False
)
model.summary(line_length=110)

# Create the directory where the model will be saved
os.makedirs(configs.model_path, exist_ok=True)

# Define callbacks
earlystopper = EarlyStopping(monitor="val_CER", patience=50, verbose=1)
checkpoint = ModelCheckpoint(f"{configs.model_path}/model.h5", monitor="val_CER", verbose=1, save_best_only=True, mode="min")
trainLogger = TrainLogger(configs.model_path)
tb_callback = TensorBoard(f"{configs.model_path}/logs", update_freq=1)
reduceLROnPlat = ReduceLROnPlateau(monitor="val_CER", factor=0.9, min_delta=1e-10, patience=20, verbose=1, mode="auto")
model2onnx = Model2onnx(f"{configs.model_path}/model.h5")

# Train the model
model.fit(
    train_data_provider,
    validation_data=val_data_provider,
    epochs=configs.train_epochs,
    callbacks=[earlystopper, checkpoint, trainLogger, reduceLROnPlat, tb_callback, model2onnx],
    workers=configs.train_workers
)

# Save training and validation datasets as csv files
train_data_provider.to_csv(os.path.join(configs.model_path, "train.csv"))
val_data_provider.to_csv(os.path.join(configs.model_path, "val.csv"))

configs.py

import os
from datetime import datetime

from mltu.configs import BaseModelConfigs

class ModelConfigs(BaseModelConfigs):
    def __init__(self):
        super().__init__()
        self.model_path = os.path.join("Models/test", datetime.strftime(datetime.now(), "%Y%m%d%H%M"))
        self.vocab = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
        self.height = 96
        self.width = 96
        self.max_text_length = 4
        self.batch_size = 64
        self.learning_rate = 1e-3
        self.train_epochs = 150
        self.train_workers = 20
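The thread does not say why set(...) should be avoided in the configs. One plausible reason, stated here as an assumption, is that a set has no fixed order and no positional lookup, so characters cannot be mapped to stable integer indices. A minimal sketch of deriving an ordered vocabulary string from labels instead; build_vocab is a hypothetical helper and the labels are made up.

```python
def build_vocab(labels):
    """Return a sorted string of the unique characters found in labels."""
    return "".join(sorted(set("".join(labels))))


labels = ["aB3d", "Zx9Q", "aaB1"]  # made-up sample labels
vocab = build_vocab(labels)
print(vocab)  # 139BQZadx

# Each character maps to a stable integer through its position in the string.
indices = [vocab.index(ch) for ch in "aB3d"]
print(indices)  # [6, 3, 1, 7]

# A set offers no such positional lookup.
print(hasattr(set(vocab), "index"))  # False
```

Sorting after deduplicating keeps the mapping identical from run to run, which matters when a saved model is later used for inference with the same vocabulary.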

Next time, make sure you preprocess your data correctly.
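One way to check the annotations before training is to separate the rows whose image exists from those whose image is missing. The sketch below uses only the standard library and assumes the filename and label column names from the script above; split_annotations is a hypothetical helper, not an mltu function.

```python
import csv
from pathlib import Path


def split_annotations(csv_path, image_dir):
    """Return (found, missing).

    found   -- [image_path, label] pairs whose image file exists
    missing -- filenames listed in the CSV but absent from image_dir
    """
    found, missing = [], []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            image_path = Path(image_dir) / row["filename"]
            if image_path.is_file():
                found.append([str(image_path), row["label"]])
            else:
                missing.append(row["filename"])
    return found, missing
```

For example, found, missing = split_annotations("Datasets/train/annotations.csv", "Datasets/train") would give a dataset list in the same shape as the loop above builds, plus the list of missing files to review.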

Test it and let me know if I can close this issue.

lemonwilliam commented 1 year ago

@pythonlessons It works perfectly with mltu==1.0.11! Thanks a lot for your help, you're a lifesaver.

pythonlessons commented 1 year ago

Nice!