Borda / kaggle_image-classify

Various Kaggle image classification challenges solutions
https://borda.github.io/kaggle_plant-pathology
MIT License

Kaggle: Image classification challenges


Experimentation

install this tooling

A simple way to use these basic functions:

! pip install https://github.com/Borda/kaggle_image-classify/archive/main.zip

Kaggle: Herbarium 2022

The Herbarium 2022: Flora of North America dataset comprises 1.05 M images of 15,501 vascular plants, which constitute more than 90% of the taxa documented in North America. The provided dataset is constrained to include only vascular land plants (lycophytes, ferns, gymnosperms, and flowering plants) and it has a long-tail distribution. The number of images per taxon ranges from as few as seven to as many as 100, although more images are available.
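To get a feel for the long-tail distribution described above, it can help to count images per taxon before training. The following is a minimal sketch, not code from this repository: the function name `images_per_taxon` and the toy labels are hypothetical, and it assumes you have already loaded one taxon label per image into a list.

```python
from collections import Counter


def images_per_taxon(labels):
    """Return (taxon, count) pairs sorted from most to least frequent."""
    return Counter(labels).most_common()


# Hypothetical toy labels; the real dataset has 15,501 taxa.
labels = ["taxon_a"] * 100 + ["taxon_b"] * 40 + ["taxon_c"] * 7
counts = images_per_taxon(labels)

# The first entry is the head of the distribution, the last one is the tail.
print("most frequent:", counts[0])
print("least frequent:", counts[-1])
```

Sorting the counts this way makes it easy to spot how many taxa sit near the lower bound and may need resampling or class weighting.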

Sample images

run notebooks in Kaggle

run notebooks in Colab

some results

Training progress with EfficientNet-B3 trained for 10 epochs:

Training process

Kaggle: Plant Pathology 2021 - FGVC8

Foliar (leaf) diseases pose a major threat to the overall productivity and quality of apple orchards. The current process for disease diagnosis in apple orchards is based on manual scouting by humans, which is time-consuming and expensive.

The main objective of the competition is to develop machine learning-based models to accurately classify a given leaf image from the test dataset to a particular disease category, and to identify an individual disease from multiple disease symptoms on a single leaf image.
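Because a single leaf image can show several diseases at once, the task is naturally multi-label. One common way to prepare such targets is multi-hot encoding, sketched below. This is an illustration only: it assumes each image's labels arrive as a single space-separated string, and the label names and the helper `multi_hot` are hypothetical rather than taken from this repository.

```python
def multi_hot(label_str, classes):
    """Encode a space-separated label string as a 0/1 vector over `classes`."""
    present = set(label_str.split())
    return [1 if name in present else 0 for name in classes]


# Hypothetical annotation rows, one string per image.
rows = ["scab", "healthy", "scab frog_eye_leaf_spot"]

# Build a stable class order from all labels seen in the data.
classes = sorted({name for row in rows for name in row.split()})
encoded = [multi_hot(row, classes) for row in rows]

for row, vec in zip(rows, encoded):
    print(f"{row!r:30} -> {vec}")
```

With targets in this form, a model can predict each disease independently, for example with one sigmoid output per class.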

Sample images

run notebooks in Kaggle

run notebooks in Colab

I would recommend uploading the dataset to your personal Google Drive and then connecting the Drive in the notebooks, which saves you lots of time re-uploading the dataset whenever your Colab is reset... :]
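Connecting Google Drive from a Colab notebook can be sketched as follows. The `google.colab.drive.mount` call is the standard Colab helper; the wrapper `mount_drive`, the mount point, and the `MyDrive` dataset folder are assumptions for illustration and should be adapted to where you actually stored the data.

```python
import os


def mount_drive(mount_point="/content/gdrive"):
    """Mount Google Drive when running in Colab; return None elsewhere."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return None
    drive.mount(mount_point)
    return mount_point


root = mount_drive()
if root is None:
    print("Not running in Colab, skipping Drive mount.")
else:
    # Hypothetical folder name; Drive content usually appears under "MyDrive".
    dataset_dir = os.path.join(root, "MyDrive", "plant-pathology")
    print("Dataset expected at:", dataset_dir)
```

Colab asks for authorization the first time the Drive is mounted in a session; after that the dataset is readable like any local folder.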

some results

Training progress with ResNet50 trained for 10 epochs, reaching over 96% validation accuracy:

Training process

More reading

Kaggle: iMet Collection 2021 x AIC - FGVC8

The online cataloguing information is generated by subject matter experts and includes a wide range of data. These include, but are not limited to: multiple object classifications, artist, title, period, date, medium, culture, size, provenance, geographic location, and other related museum objects within The Met’s collection. Adding fine-grained attributes to aid in the visual understanding of the museum objects will enable the ability to search for visually related objects.

Sample images

run notebooks in Kaggle

run notebooks in Colab

I would recommend uploading the dataset to your personal Google Drive and then connecting the Drive in the notebooks, which saves you lots of time re-uploading the dataset whenever your Colab is reset... :]

some results

Training progress with ResNet50 trained for 35 epochs on a subset of labels with more than 100 samples:

training on 100 samples per class
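Restricting training to labels with more than 100 samples could look roughly like the sketch below. It is not the repository's implementation: the helpers `frequent_labels` and `restrict`, and the toy data, are hypothetical, and it assumes each sample is represented as a list of label names.

```python
from collections import Counter


def frequent_labels(samples, min_count=100):
    """Return the set of labels that occur in more than `min_count` samples."""
    counts = Counter(label for labels in samples for label in labels)
    return {label for label, n in counts.items() if n > min_count}


def restrict(samples, keep):
    """Drop every label not in `keep` from each sample's label list."""
    return [[label for label in labels if label in keep] for labels in samples]


# Hypothetical toy data: "a" and "b" are frequent, "c" is rare.
samples = [["a", "b"]] * 101 + [["b", "c"]] * 5
keep = frequent_labels(samples)
filtered = restrict(samples, keep)

print("kept labels:", sorted(keep))
print("last sample after filtering:", filtered[-1])
```

Samples that end up with an empty label list can then be dropped or kept as negatives, depending on how the training is set up.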

Kaggle: Cassava Leaf Disease Classification

The task is to classify each cassava image into one of five categories, indicating either a plant with a certain kind of disease or a healthy leaf.

Organizers introduced a dataset of 21,367 labeled images collected during a regular survey in Uganda. Most images were crowd-sourced from farmers taking photos of their gardens, and annotated by experts at the National Crops Resources Research Institute (NaCRRI) in collaboration with the AI lab at Makerere University, Kampala.

Sample images

run notebooks in Colab

I would recommend uploading the dataset to your personal Google Drive and then connecting the Drive in the notebooks, which saves you lots of time re-uploading the dataset whenever your Colab is reset... :]

some results

Training progress with ResNet50 trained for 10 epochs:

Training process