pyronear / pyro-mlops

3 stars 4 forks source link

Pyronear - Machine Learning Pipeline for Wildfire Detection 🚀

Train YOLO Wildfire Detector Using Ultralytics

This repository showcases how to train a YOLOv8 deep learning model on the Pyronear dataset. Key features include the use of DVC for data versioning and MLflow for model versioning and performance tracking, with cloud storage for data.

ML flow

Prerequisites

01. Install Dependencies 📦

To install necessary libraries, run:

pip install -r requirements.txt

02. Download and Prepare Dataset 📥

  1. Download the dataset using the following command:

    gdown --fuzzy https://drive.google.com/file/d/12gGuFd3aQmtPXP-cbBRjsciWLtpFNBB-/view?usp=sharing
  2. Unzip and organize the dataset:

    mkdir datasets
    unzip DS-18d12de1.zip -d datasets/
  3. Update the dataset path in data_configuration.yaml.

03. Data Overview 💽

The dataset comprises 596 training images and 148 validation images featuring forest landscapes with smoke. Each image (640x480 pixels) is annotated with a bounding box in a corresponding txt file, marking the smoke areas.

Fumes

04. Data Version Control (DVC) 🔄

1️⃣ Install DVC:

Use the same requirements file to install DVC.

2️⃣ Data Setup:

05. Model Training with YOLOv8 and MLflow Tracking 🤖

⚠️ Requires GPU.

3️⃣ MLflow Setup:

MLflow is used for experiment tracking and model management. Key tracked metrics include epochs, accuracy, and loss.

4️⃣ Training the Model:

Execute the training script with specified data and model configurations:

python3 train_yolo.py --data_config data_configuration.yaml --model_config model_configuration.yaml

5️⃣ Cloud Storage for Artifacts:

Add AWS credentials to the training script:

"s3", aws_access_key_id="your_access_key_id", aws_secret_access_key="your_secret_access_key"

Congratulations! 🎉

You've successfully set up and run the Pyronear machine learning pipeline for wildfire detection.