This repo hosts a final DL project conducted as part of the Data Scientist certification program at BIU.
Tested with:
python 3.10.12 (must be the active version on your system). If you have another Python version, use poetry update instead of poetry install during step 5, and do not commit poetry.lock afterwards!
poetry 1.8.3
pip 24.2
Prerequisites:
Clone the repo and go to the cloned directory:
git clone git@github.com:lmanov1/DL_DiabeticRetinopathyStagePrediction.git
cd DL_DiabeticRetinopathyStagePrediction
If you have a GPU on your system, run the CUDA setup; it should detect the GPU and install the supporting system libraries (not managed by poetry), such as CUDA:
python3 code/Util/check_hardware_and_install.py
Run poetry update
(just once) - this will not use the workspace's poetry.lock but will rewrite it. Don't commit your poetry.lock.
Don't worry: without an available GPU (and CUDA), tensorflow, torch, and the rest of the GPU-leveraging libraries will automatically fall back to the CPU.
Run poetry shell
About Kaggle API
We use the Kaggle API to download datasets from Kaggle.
To use it, sign up for a Kaggle account at https://www.kaggle.com
Then go to the 'Account' tab of your user profile (https://www.kaggle.com/<username>/account) and select 'Create New API Token'. This will trigger the download of kaggle.json, a file containing your API credentials. Place this file at:
Linux: $XDG_CONFIG_HOME/kaggle/kaggle.json (defaults to ~/.config/kaggle/kaggle.json). The path ~/.kaggle/kaggle.json, which was used by older versions of the tool, is also still supported. Run chmod 600 ~/.config/kaggle/kaggle.json so that other users have no read access.
Windows: C:\Users\<Windows-username>\.kaggle\kaggle.json
Other: ~/.kaggle/kaggle.json
You can define a shell environment variable KAGGLE_CONFIG_DIR to change this location to $KAGGLE_CONFIG_DIR/kaggle.json (on Windows it will be %KAGGLE_CONFIG_DIR%\kaggle.json).
You can also choose to export your Kaggle username and token to the environment:
export KAGGLE_USERNAME=datadinosaur
export KAGGLE_KEY=xxxxxxxxxxxxxx
In addition, you can export any other configuration value that normally would live in kaggle.json, in the format 'KAGGLE_<VARIABLE>' (note uppercase).
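For reference, a minimal Python sketch of downloading a dataset through the Kaggle API once credentials are configured as above (the target path is an assumption; the dataset name is the one used later by main.py):

```python
# Minimal sketch: download a Kaggle dataset programmatically.
# Assumes kaggle.json or KAGGLE_USERNAME/KAGGLE_KEY are already set up.
from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()  # reads kaggle.json or the KAGGLE_* environment variables
api.dataset_download_files(
    "benjaminwarner/resized-2015-2019-blindness-detection-images",
    path="data/input",   # assumed target directory
    unzip=True,
)
```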
Now you are all set and can run the project logic, for example:
python3 code/main.py
On Windows, running with poetry run python.exe code/main.py causes problems with importing fastai, so just use python.exe code/main.py.
Don't ask why.
Dev Environment Setup:
Determine available hardware (GPU) by manually running check_hardware_and_install.py from the code/Util folder. This will install CUDA where suitable.
Configure a virtual environment for package management. Note that neither torch nor TensorFlow has maintained separate hardware-dependent packages (e.g. tensorflow-cpu and tensorflow-gpu) for a while now; starting with TensorFlow 2.17, a single package serves both.
There are some warnings at runtime that can be ignored.
The existing code should support both CPU and GPU environments, on Windows, Linux, and (maybe 🙂) macOS.
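A quick sanity check that the frameworks see the hardware you expect (a standalone snippet, not part of the repo):

```python
# Both frameworks fall back to CPU when no GPU is visible.
import torch
print("torch CUDA available:", torch.cuda.is_available())

import tensorflow as tf
print("TF GPUs:", tf.config.list_physical_devices("GPU"))
```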
Download datasets from Kaggle: 2015 Diabetic Retinopathy Detection, APTOS 2019 Blindness Detection.
Organize the datasets in a structured format.
Clean and normalize images.
Implement data augmentation techniques.
Split data into training, validation, and test sets.
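A hedged fastai sketch of these preparation steps (column names, paths, and split ratios are illustrative assumptions; the real logic lives in data_preparation.py):

```python
# Illustrative fastai data pipeline: labeled images -> augmented DataLoaders.
from fastai.vision.all import (
    DataBlock, ImageBlock, CategoryBlock, ColReader,
    RandomSplitter, Resize, aug_transforms,
)

dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock),               # (image, label) pairs
    get_x=ColReader("image_path"),                    # assumed CSV column
    get_y=ColReader("diagnosis"),                     # assumed label column (0..4)
    splitter=RandomSplitter(valid_pct=0.2, seed=42),  # train/validation split
    item_tfms=Resize(224),                            # normalize image size
    batch_tfms=aug_transforms(flip_vert=False),       # data augmentation
)
# dls = dblock.dataloaders(labels_df, bs=32)          # labels_df: pandas DataFrame
```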
Define the PretrainedEyeDiseaseClassifier class with a pre-trained model (e.g., VGG16).
Modify the classifier layers to match the number of classes.
Define the main EyeDiseaseClassifier class using a custom CNN or another architecture.
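A minimal sketch of the pretrained-classifier idea (the head-replacement pattern is standard torchvision; the exact class internals in the repo may differ):

```python
# Sketch: wrap a pretrained VGG16 and resize its head to num_classes.
import torch.nn as nn
from torchvision import models

class PretrainedEyeDiseaseClassifier(nn.Module):
    def __init__(self, num_classes: int = 5):
        super().__init__()
        self.model = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
        in_features = self.model.classifier[-1].in_features
        # Replace the last classifier layer to match the number of classes
        self.model.classifier[-1] = nn.Linear(in_features, num_classes)

    def forward(self, x):
        return self.model(x)
```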
Initial Training with Pretrained Model:
Load data and create data loaders.
Initialize the pre-trained model (e.g., VGG16).
Train the pre-trained model with the available dataset.
Save the pre-trained model’s weights.
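Roughly, this stage looks like the following fastai sketch (the epoch count and output file name are assumptions):

```python
# Sketch: fine-tune a pretrained backbone and save its weights.
import torch
from fastai.vision.all import vision_learner, error_rate
from torchvision.models import vgg16_bn

learn = vision_learner(dls, vgg16_bn, metrics=error_rate)  # dls from data prep
learn.fine_tune(3)                                         # assumed epoch count
torch.save(learn.model.state_dict(), "data/output/dataset_pretrained.pth")
```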
Main (inference) Model Training:
Load the pre-trained model’s weights into the main model.
Train the main EyeDiseaseClassifier model using the pre-trained weights for better performance.
Alternatively, the main model can be trained from scratch - to be decided.
Save the main model’s weights after training.
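One hedged way to realize the weight-transfer step (strict=False keeps only the layers whose names and shapes match; whether the repo loads this way or trains from scratch is, per the note above, still to be decided):

```python
# Sketch: seed the main model with the saved pretrained weights.
import torch

cnn_model = EyeDiseaseClassifier(num_classes=5)  # the repo's main custom CNN
state = torch.load("data/output/dataset_pretrained.pth", map_location="cpu")
cnn_model.load_state_dict(state, strict=False)   # ignore non-matching layers
```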
Evaluate both models on the test set.
Calculate metrics like accuracy, precision, recall, F1-score, and ROC-AUC.
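For instance, with scikit-learn (y_true, y_pred, and y_proba are assumed to come from the test-set predictions):

```python
# Sketch: the evaluation metrics listed above, computed with scikit-learn.
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score, roc_auc_score,
)

acc = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred, average="macro")
rec = recall_score(y_true, y_pred, average="macro")
f1 = f1_score(y_true, y_pred, average="macro")
# ROC-AUC needs per-class probabilities in the multi-class case
auc = roc_auc_score(y_true, y_proba, multi_class="ovr")
```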
Save the trained models using appropriate file formats (.pth for PyTorch).
Develop an inference pipeline for classifying new images.
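A self-contained inference sketch (preprocessing, file names, and the classifier class from the earlier sketch are assumptions; the real pipeline may differ):

```python
# Sketch: classify a single fundus image with the saved .pth weights.
import torch
from PIL import Image
from torchvision import transforms

model = PretrainedEyeDiseaseClassifier(num_classes=5)
model.load_state_dict(torch.load("data/output/dataset_pretrained.pth",
                                 map_location="cpu"))
model.eval()

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
img = tfm(Image.open("sample_fundus.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    probs = torch.softmax(model(img), dim=1)
print("predicted stage:", probs.argmax(dim=1).item())
```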
Hugging Face Deployment:
Integrate an LLM assistant to act as a virtual doctor.
Implement functionalities for anamnesis, analysis, and providing recommendations.
Extend the model to classify other eye diseases like cataracts and glaucoma.
Adjust the label set and retrain the model accordingly.
Maintain detailed documentation for each step.
Include README files, code comments, and usage instructions.
Dataloader.py
Methods:
data_preparation.py
This class is based on the fastai API and optimized for use with labeled imaging data; it utilizes a data block and data loader.
train_model.py
A base class that defines methods for training and evaluating a model using a learner object. Methods:
This class implements a vision classifier based on a publicly available pre-trained model and is seamlessly integrated with fastai: data block, data loader, and learner. It can be used as a reference point for the performance of a model under development and/or for transfer learning (which applies this model's weights and biases to a model under development). It currently works with resnet18 or vgg16, but in general it can be any model from the collection supported by torchvision.models. Inherits the parent classes torch.nn.Module and CustomModelMethods. Methods:
This is a generic CNN classification model that can be used for diagnosing different eye diseases (based on retina fundus images). This flexibility comes from the configurable number of classes (num_classes = 5, i.e. 0..4, in the case of Diabetic Retinopathy classification).
The model can be trained on different datasets, each with its relevant disease-related labels (the num_classes parameter defines the shape of the last decision layer).
The class uses fastai data loaders that allow iterating over the dataset in (label, image) batches.
Inherits parent classes torch.nn.Module and CustomModelMethods.
Methods
Main Function: main()
Downloads the datasets. Currently works with benjaminwarner/resized-2015-2019-blindness-detection-images. See the 'Define the dataset names and paths' section in code/main.py for more details; there are more datasets available on Kaggle that we can use.
Prepares the DataLoaders
Trains and evaluates both the pre-trained and inference models, and saves the models under data/output as *.pth (torch model format). These files are kept local, as they can be too big to upload to GitHub.
On subsequent runs, once a pretrained model (previously fine-tuned on the specific dataset) is found under data/output/datasetname_pretrained.pth, it will not be trained again. Instead, only the CNN model (currently trained with the learner's fit_one_cycle()) is trained on every run; see the sketch below.
Uses Fastai's Learner class to handle training and evaluation
This flow is a very first draft, to be improved.
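The skip-retraining logic amounts to something like this (the path template, learner names, and epoch counts are assumptions):

```python
# Sketch: fine-tune the pretrained model only once per dataset.
import os
import torch

pretrained_path = f"data/output/{dataset_name}_pretrained.pth"
if not os.path.exists(pretrained_path):
    pretrained_learn.fine_tune(3)  # one-off fine-tuning on this dataset
    torch.save(pretrained_learn.model.state_dict(), pretrained_path)

# The CNN model is (re)trained on every run
cnn_learn.fit_one_cycle(5)
```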
MultiPlatform.py
requirements.txt is generated with: poetry export -f requirements.txt --output requirements.txt
app.py - the main Gradio application script.
Leverages existing
update_production.py - uploads relevant files from git to the production space (https://huggingface.co/spaces/Lmanov1/timm-efficientnet_b3.ra2_in1k).
For persistent storage of inference models, we use a dataset repo (https://huggingface.co/datasets/Lmanov1/BUI17_data).
To run update_production.py you need to log in to Hugging Face with a valid Hugging Face API token. Currently the token value is read from a .env file, where it should be stored in the format: MY_TOKEN="PUT HERE YOUR KEY". This file is not managed by git but lives locally in the root directory of the project.
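The token-loading step looks roughly like this (python-dotenv is an assumed helper; MY_TOKEN matches the .env format above):

```python
# Sketch: read the Hugging Face token from .env and log in.
import os
from dotenv import load_dotenv
from huggingface_hub import login

load_dotenv()                        # reads .env from the project root
login(token=os.environ["MY_TOKEN"])  # authenticate against Hugging Face
```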