Mildew, commonly known as powdery mildew, is a fungal disease that affects various plants, including cherry trees. It thrives in warm, humid conditions and appears as a powdery, white growth on the leaves, shoots, and sometimes the fruit of trees. This fungal infection typically begins in early summer, spreading through spores carried by wind or water, causing leaf distortion, premature leaf drop, and weakening the tree's overall health. Cherry trees, susceptible to powdery mildew, can suffer reduced photosynthesis and fruit quality when infected, affecting their growth and productivity. Regular pruning, proper air circulation, and fungicidal treatments can help prevent and manage mildew infestations in cherry trees.
The goal was to create an ML model that predicts whether a cherry leaf is infected with mildew. It had to reach an accuracy of at least 97%. My model reached an accuracy above 99%.
Farmy & Foods is facing a challenge: its cherry plantations have been presenting powdery mildew. Currently, the process is to manually verify whether a given cherry tree contains powdery mildew. An employee spends around 30 minutes on each tree, taking a few samples of tree leaves and verifying visually whether each leaf is healthy or has powdery mildew. If there is powdery mildew, the employee applies a specific compound to kill the fungus. Applying this compound takes 1 minute. The company has thousands of cherry trees located on multiple farms across the country, so this manual inspection process is not scalable.
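To put the scaling problem in numbers, the sketch below estimates the total time of one manual inspection round from the figures above (30 minutes per tree, 1 minute per treatment). The tree count, the infected fraction, and the function name are illustrative assumptions, since the source only speaks of "thousands of cherry trees".

```python
def manual_inspection_hours(n_trees, infected_fraction,
                            inspect_min=30, treat_min=1):
    """Estimate total person-hours for one manual inspection round."""
    inspection = n_trees * inspect_min
    treatment = n_trees * infected_fraction * treat_min
    return (inspection + treatment) / 60


# Hypothetical farm: 5,000 trees, 20% infected (both numbers are assumptions).
hours = manual_inspection_hours(5000, 0.20)
print(f"{hours:.1f} person-hours per inspection round")
```

Under any plausible infected fraction the inspection term dominates, which is why automating the visual check is where the time is saved.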
To save time in this process, the IT team suggested an ML system that detects instantly, from a tree leaf image, whether the leaf is healthy or has powdery mildew. A similar manual process is in place for detecting pests in other crops, and if this initiative is successful, there is a realistic chance to replicate this project for all other crops. The dataset is a collection of cherry leaf images provided by Farmy & Foods, taken from their crops.
Visual Differentiation: Visualize cherry leaf images to differentiate healthy leaves from those affected by powdery mildew. This aids in understanding the distinct features, if any, present in affected leaves.
Prediction: Develop an ML model to predict whether a cherry leaf is healthy or has powdery mildew based on visual cues extracted from the images.
Data Visualization: Visual exploration of the dataset helps in understanding the characteristics and variations present in healthy and affected cherry leaves. This aids in feature selection and extraction for model development.
ML Tasks: Using CNNs leverages the capability of deep learning to automatically learn and identify intricate patterns in images, enabling accurate classification between healthy and affected leaves.
Provides a detailed overview of powdery mildew, a fungal disease affecting cherry trees, including its characteristics, impact, and management strategies.
Describes powdery mildew as a fungal disease affecting various plants, particularly cherry trees, thriving in warm, humid conditions.
Explains its appearance as a powdery, white growth on leaves, shoots, and occasionally on the fruit.
Specifies that the dataset is sourced from Kaggle.
Indicates that the dataset showcases both healthy cherry leaves and leaves affected by powdery mildew.
Encourages readers to visit and read the Project README file for further information.
Highlights the project's two primary business requirements:
1 - Conducting a study to visually differentiate between healthy cherry leaves and those infected with powdery mildew.
2 - Developing a predictive capability to determine if a cherry leaf is healthy or infected with powdery mildew.
Sourced from Kaggle.
Contains over 4,000 images captured from the client's crop fields.
Displays both healthy cherry leaves and leaves affected by powdery mildew.
Encourages users to explore and read the README file for a deeper understanding.
1 - The client is interested in conducting a study to visually differentiate a healthy cherry leaf from one with powdery mildew.
2 - The client is interested in predicting if a cherry leaf is healthy or contains powdery mildew.
Cherry leaves affected by powdery mildew exhibit distinguishable visual patterns, especially along the leaf edges, setting them apart from healthy leaves.
Emphasizes the identification of these distinct patterns on the leaf edges.
Proposes the validation process through exploratory data analysis (EDA) on the dataset.
Aims to visualize samples of healthy and affected cherry leaves to discern any noticeable patterns or differences.
This page shows average and variability images for healthy and mildew-infected leaves, which makes it easier to see the shapes of the leaves and the differences between them. It also builds a montage of leaves for the user to view.
Noticeable differences between average and variability images are observed.
Shapes are recognizable in both mildew-infected and healthy leaves.
Distinct color differences between healthy and mildew-infected leaves are noticeable.
Healthy leaves exhibit a greener appearance.
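As a minimal sketch of how average and variability images like these can be computed, the example below takes the per-pixel mean and standard deviation over a stack of images with NumPy. The function name and the synthetic random arrays are assumptions for illustration; the project's own notebooks may load and process the images differently.

```python
import numpy as np


def average_and_variability(images):
    """Return per-pixel mean and standard deviation over a stack of images.

    `images` has shape (n_images, height, width, channels).
    """
    stack = np.asarray(images, dtype=np.float64)
    return stack.mean(axis=0), stack.std(axis=0)


# Synthetic stand-ins for two labels (random pixels, not real leaf data).
rng = np.random.default_rng(seed=0)
healthy = rng.uniform(0.0, 0.5, size=(10, 8, 8, 3))
mildew = rng.uniform(0.5, 1.0, size=(10, 8, 8, 3))

healthy_avg, healthy_var = average_and_variability(healthy)
mildew_avg, _ = average_and_variability(mildew)

# The difference between the two average images highlights where labels differ.
difference = healthy_avg - mildew_avg
print(healthy_avg.shape, difference.mean())
```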
Users can refresh the montage by clicking the 'Create Montage' button.
Allows selection of labels for viewing different subsets of images.
Displays a grid of randomly selected images based on the chosen label.
Adjusts the number of rows and columns to create montages with specific sizes.
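The montage logic described above can be sketched as follows: pick as many random images as there are grid cells and assign each one a (row, column) position. The function name `plan_montage` and the file names are hypothetical, and the actual plotting step is left out.

```python
import itertools
import random


def plan_montage(image_names, n_rows, n_cols, seed=None):
    """Pick n_rows * n_cols random images and assign each a grid cell."""
    n_cells = n_rows * n_cols
    if n_cells > len(image_names):
        raise ValueError(
            f"Requested {n_cells} cells but only "
            f"{len(image_names)} images are available."
        )
    chosen = random.Random(seed).sample(image_names, n_cells)
    cells = itertools.product(range(n_rows), range(n_cols))
    return {cell: name for cell, name in zip(cells, chosen)}


names = [f"leaf_{i}.jpg" for i in range(20)]  # hypothetical file names
layout = plan_montage(names, n_rows=2, n_cols=3, seed=1)
for (row, col), name in layout.items():
    print(row, col, name)
```

Checking the grid size against the number of available images up front gives the user a clear message instead of a sampling error.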
The client's objective is to determine whether a given leaf is infected with mildew.
Allows users to upload mildew-infected leaf samples for live prediction.
Provides the option to download a set of mildew-infected and healthy leaf images from Kaggle.
Users can upload one or multiple leaf images for analysis.
Displays details about the uploaded image(s) like name and size.
Resizes the uploaded image(s) for input into the prediction model (resize_input_image function).
Uses a pre-trained model to predict the probability and class of mildew infection (load_model_and_predict function).
Visualizes predictions and their probabilities (plot_predictions_probabilities function).
Generates an analysis report table displaying the uploaded image names and their corresponding prediction results.
Provides an option to download the analysis report as a CSV file.
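The last steps of this flow, turning a model output into a label and collecting the results into a downloadable report, might look roughly like the sketch below. It assumes a binary classifier with a single sigmoid output and a class-index mapping of 0 for healthy and 1 for powdery mildew; both are assumptions, as are the helper names, which are not the project's own functions.

```python
import csv
import io

# Assumed mapping from class index to label; the real order depends on how
# the classes were indexed when the model was trained.
CLASS_LABELS = {0: "healthy", 1: "powdery_mildew"}


def interpret_prediction(prob_class_1, threshold=0.5):
    """Turn a sigmoid output into (label, probability of that label)."""
    index = int(prob_class_1 > threshold)
    probability = prob_class_1 if index == 1 else 1 - prob_class_1
    return CLASS_LABELS[index], probability


def build_report_csv(results):
    """Serialize (image name, sigmoid output) pairs into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Name", "Result", "Probability"])
    for name, prob in results:
        label, probability = interpret_prediction(prob)
        writer.writerow([name, label, f"{probability:.2f}"])
    return buffer.getvalue()


print(build_report_csv([("leaf_a.jpg", 0.97), ("leaf_b.jpg", 0.08)]))
```

If the index-to-label mapping is swapped, every prediction is inverted while the probabilities still look confident, so the mapping is worth verifying against a few known images.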
Displays an image depicting the distribution of labels across train, validation, and test sets.
Visualizes the model's training accuracy and losses over time.
Includes separate images for the model's training accuracy (model_training_acc.png) and training losses (model_training_losses.png).
Presents a tabulated view of the model's evaluation metrics on the test set.
Loads and displays the evaluation results for loss and accuracy obtained from the load_test_evaluation function.
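The label distribution shown on this page could be tabulated along the lines of the sketch below, which splits each label's files into train, validation, and test sets and counts the results. The 70/10/20 ratios, the per-label image counts, and the file names are assumptions for illustration; the source does not state the split that was actually used.

```python
import random


def split_dataset(items, train_ratio=0.7, val_ratio=0.1, seed=0):
    """Shuffle items and split them into train, validation, and test sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_ratio)
    n_val = round(len(shuffled) * val_ratio)
    return {
        "train": shuffled[:n_train],
        "validation": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }


# Hypothetical image counts per label.
label_counts = {"healthy": 100, "powdery_mildew": 100}

distribution = {}
for label, count in label_counts.items():
    files = [f"{label}_{i}.jpg" for i in range(count)]
    for set_name, subset in split_dataset(files).items():
        distribution[(set_name, label)] = len(subset)

for (set_name, label), n_images in distribution.items():
    print(f"{set_name:<10} {label:<15} {n_images}")
```

Splitting each label separately keeps the class balance the same in all three sets.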
During the development of this project several issues occurred. Here are some of them.
Jupyter Notebooks:
My system struggled the most during the development of the notebooks. I believe that's because of issues with the recommended IDE. When going through the steps, my system got disconnected multiple times. This explains the versioning in the notebooks: the steps had to be repeated several times before I had a version I was satisfied with. The last notebook was the one my IDE struggled with the most. In the end, I switched IDE to resolve these issues and finish the project.
Streamlit Dashboard:
During the development of the dashboard I had several issues. The first ones had to do with library dependencies: either tensorflow or numpy was not compatible. To resolve this, I checked the installed versions and searched for versions that are compatible with each other, updating them until I found a combination that worked. In the end, the version of numpy in the requirement.txt file together with an updated tensorflow resolved the issue.
In the mildew_detector.py page I encountered several issues. When I fed images to the feature, it came back with the wrong prediction. At first I wondered about my ML model and its output. Perhaps I had saved an underperforming model or interpreted the results wrong. But the model got the result wrong so consistently that I became sure it had to be a wiring issue in the predictive_analysis.py file. Sure enough, I had entered the index positions wrong. Once that was fixed, the detector worked perfectly.
On the mildew_detector.py page in my dashboard there is a download feature at the end. When trying to perform this task I got an error message about the append attribute: the feature did not recognize append. When googling the issue I found the answer on Stack Overflow. It turned out that newer versions of pandas no longer provide the public append method on DataFrames, while the underscore-prefixed _append is still available. So I changed append to _append and it worked perfectly after that.
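As an alternative to the private `_append` method, the same report can be built with the public pandas API, for example by collecting rows in a list and using `pd.concat` to add more later. This is a hedged sketch, not the dashboard's actual code: the column names and rows are hypothetical.

```python
import pandas as pd

# Hypothetical prediction results; in the dashboard these would come from the model.
rows = [
    {"Name": "leaf_a.jpg", "Result": "healthy"},
    {"Name": "leaf_b.jpg", "Result": "powdery_mildew"},
]
report = pd.DataFrame(rows)

# Adding one more row to an existing report with the public API.
extra = pd.DataFrame([{"Name": "leaf_c.jpg", "Result": "healthy"}])
report = pd.concat([report, extra], ignore_index=True)

# CSV text that a download button could serve.
csv_text = report.to_csv(index=False)
print(csv_text)
```

Because `_append` is a private method, it may change without notice in future pandas releases, whereas `pd.concat` is part of the documented interface.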
Heroku:
During the deployment to Heroku I encountered several issues. My app worked as expected when deployed on Streamlit, but when deployed to Heroku I ran into dependency issues: it could not find the load_model created with tensorflow. After upgrading and downgrading my tensorflow packages I still couldn't solve the issue. Because I had been forced to change IDE during development, dependency issues had arisen. To solve this, I ran the Jupyter notebooks again from the beginning in one go on Gitpod, saved the result, and pushed it to a new branch. It worked as expected after that.
https://en.wikipedia.org/wiki/Powdery_mildew