[DMP 2024]: Automated training and inference pipeline for simple/custom models

Gautam-Rajeev commented 7 months ago

Goal:

Create compact TensorFlow Lite (TFLite) models that can be deployed on mobile devices for offline use, specifically for agricultural pest detection and document handling tasks. These models must be tiny, with a file size of <= 10MB, to facilitate easy integration into mobile applications.

Description:

The project involves developing two sets of TinyML models. The first set targets the agricultural sector, focusing on pest detection through image analysis. The second set aims at document detection and processing, including blur detection, alignment correction, and document type classification.

Agricultural Model:

Pest Detection: Develop a TinyML model to analyze close-up images of crop leaves to determine their suitability for pest detection. The model should discern whether an image is a close-up of a leaf, which is essential for accurate pest detection, and not an image capturing the entire plant from a distance.

Document Detection Support Models:

Blurry Image Detection: A TinyML model that determines if an image of a document is blurry, ensuring only clear images proceed for further processing.
Alignment Correction: Design a model that corrects alignment issues in images of documents, such as PAN cards, transforming skewed images into correctly aligned rectangles.
Document Type Classification: Develop a TinyML model capable of classifying images into document types, e.g., PAN card, Aadhar card, etc.
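
As an illustration of the alignment-correction task above, here is a minimal sketch of the geometry involved, assuming the four corners of the document have already been located by some other step. The corner coordinates and the function names `homography_from_corners` and `apply_homography` are hypothetical, chosen only for this example; the issue does not prescribe a method.

```python
import numpy as np

def homography_from_corners(src, dst):
    """Solve for the 3x3 perspective matrix H mapping each src corner to its dst corner."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per point pair, with H[2, 2] fixed to 1.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, point):
    """Map a single (x, y) point through H using homogeneous coordinates."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return (x / w, y / w)

# Hypothetical corners of a skewed card in a photo (TL, TR, BR, BL), in pixels.
skewed = [(42, 30), (310, 55), (295, 240), (25, 210)]
# Target: an upright 320x200 rectangle.
target = [(0, 0), (320, 0), (320, 200), (0, 200)]

H = homography_from_corners(skewed, target)
for corner in skewed:
    print(corner, "->", apply_homography(H, corner))
```

In practice the pixel warp itself would likely be done with a library routine (for example OpenCV's `cv2.getPerspectiveTransform` and `cv2.warpPerspective`); the sketch only shows the transform that such a routine computes.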

Implementation Details:

Utilize TensorFlow Lite for model development, ensuring the models are optimized for low resource consumption.
The models should be tested extensively on diverse datasets to ensure robustness and accuracy.
Special consideration must be given to the model architecture to maintain a balance between performance and model size, with a strict size limit of <= 10MB.

Contributors are encouraged to share their progress, challenges, and insights through comments. Collaborative efforts are highly appreciated. The contribution deemed most effective and efficient will lead to further discussions and potential project assignment.

Product Name

ai-tools

Organization Name

Samagra

Domain

Agriculture

Tech Skills Needed

TensorFlow Lite
Python
Machine Learning
Image Processing
Data Augmentation

Mentor(s)

@ChakshuGautam

Complexity

Medium

bruhathisp commented 7 months ago

I'm Bruhathi, one of the participants in Code4GovTech. As I'm keen to contribute to this repository, I would greatly appreciate some clarification and guidance to ensure my efforts align with the project's goals effectively.

Specifically, I intend to focus on the implementation of blurry image detection for the document detection support models. To proceed efficiently, I kindly request the following details:

Project Setup: Could you please advise if it's okay to fork the project and create a new folder for the blurry image detection module?
Data Availability: Are there any datasets available or recommended for training the blurry image detection model?

Azazel0203 commented 7 months ago

Hello @GautamR-Samagra,

I'm seeking clarification regarding the project, particularly regarding model development for agriculture.

The project outline specifies the use of close-up images of plant leaves. I've found a dataset containing leaf images like the examples below:

[Images: three sample leaf photos from the dataset]

Would you recommend proceeding with this dataset, or is there another preferred option?

Thank you

kartikbhtt7 commented 7 months ago

Hello, are there any leads on the dataset? While searching online I found this dataset: https://www.kaggle.com/datasets/pbrant/text-image-with-motion-blur. It has around 360 blurred and 180 clear images of text documents (books, in this case).

I guess we can also create our own dataset of blurred images using OpenCV (for example with "cv2.GaussianBlur(image, (n, n), 0)" or other blurring techniques), provided we have a dataset of clear images.
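
A minimal sketch of this synthetic-data idea, written with NumPy only so that it runs without OpenCV. The separable Gaussian filter here is a stand-in for `cv2.GaussianBlur`, not a reimplementation of it, and the function names, kernel size, sigma, and checkerboard test image are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """1-D Gaussian kernel of odd length `size`, normalized to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    k = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(image, size=5, sigma=1.5):
    """Blur a 2-D grayscale image with a separable Gaussian (rows, then columns)."""
    k = gaussian_kernel(size, sigma)
    padded = np.pad(image.astype(float), size // 2, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def make_blur_pairs(clear_images, size=7, sigma=2.0):
    """Return (image, label) samples: label 0 for clear, 1 for synthetically blurred."""
    samples = []
    for img in clear_images:
        samples.append((img, 0))
        samples.append((gaussian_blur(img, size, sigma), 1))
    return samples

# Stand-in for a clear document image: a 32x32 high-contrast checkerboard.
clear = (np.indices((32, 32)).sum(axis=0) % 2) * 255.0
pairs = make_blur_pairs([clear])
print(len(pairs), "samples; variance clear vs blurred:",
      pairs[0][0].var(), pairs[1][0].var())
```

Since the linked Kaggle dataset is described as motion blur, a Gaussian-only augmentation may not match it well; a directional (line-shaped) kernel would be a closer approximation.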

swarnim-sawane commented 7 months ago

Clarification on Model Size Limit and Key Considerations

Hi @ChakshuGautam @GautamR-Samagra ,

Quick question regarding the model size requirement: Should each individual model (pest detection, blurry image detection, alignment correction, document type classification) be ≤ 10MB, or is this the collective size limit for all models?

Also, any specific optimization strategies (compression, quantization) or hardware considerations we should keep in mind for mobile deployment?

Thanks in advance for the insights!

Swarnim Sawane

ashuashutosh2211 commented 7 months ago

Hi @ChakshuGautam @GautamR-Samagra, I am Ashutosh, a pre-final year student at IIT Jodhpur pursuing a B.Tech. in Artificial Intelligence and Data Science. I have done projects related to deep learning and machine learning, such as Stock Price Prediction, Speech Emotion Recognition, and a Voice Controlled Music Recommendation System using Deep Learning. As part of my deep learning course lab work, I have also worked on deep learning models for images. I am interested in this project. For this project I think we can try multiple model architectures combining CNN, ResCNN, UNets, etc., as well as some pre-trained models. We can look at similar problems that have already been solved and the models used for them, and read different research papers to finalize the model architectures. Later on they can be fine-tuned on multiple datasets. Can you guide me further about the dataset and the project setup, so that I can contribute to this project?

kartikbhtt7 commented 7 months ago

Hello, @ChakshuGautam @GautamR-Samagra I tried out modelling on

Blurry Image Detection: A TinyML model that determines if an image of a document is blurry, ensuring only clear images proceed for further processing.

I implemented a simple SVM classifier. Each input image goes through an edge-extraction pipeline comprising Sobel, Roberts, and Laplacian filters, and images are classified as Blur/NotBlur using 3 filters (Sobel, Roberts, Laplacian) and 3 features (mean, max, variance) per filter, resulting in a total of 9 features per image.
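
A minimal sketch of the 9-feature extraction described above, using NumPy only. The exact kernels, the use of gradient magnitude for Sobel and Roberts, and the absolute value of the Laplacian response are assumptions; the comment does not specify them, and the names `filter2d`, `edge_responses`, and `blur_features` are hypothetical.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
ROBERTS_X = np.array([[1, 0], [0, -1]], dtype=float)
ROBERTS_Y = np.array([[0, 1], [-1, 0]], dtype=float)
LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)

def filter2d(image, kernel):
    """'Valid' 2-D correlation of a grayscale image with a small kernel."""
    kh, kw = kernel.shape
    h, w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((h, w))
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * image[i:i + h, j:j + w]
    return out

def edge_responses(gray):
    """Sobel magnitude, Roberts magnitude, and absolute Laplacian response."""
    gray = gray.astype(float)
    sobel = np.hypot(filter2d(gray, SOBEL_X), filter2d(gray, SOBEL_X.T))
    roberts = np.hypot(filter2d(gray, ROBERTS_X), filter2d(gray, ROBERTS_Y))
    laplacian = np.abs(filter2d(gray, LAPLACIAN))
    return sobel, roberts, laplacian

def blur_features(gray):
    """9-vector: (mean, max, variance) for each of the three filter responses."""
    feats = []
    for response in edge_responses(gray):
        feats.extend([response.mean(), response.max(), response.var()])
    return np.array(feats)

# Toy comparison: a sharp vertical edge versus a box-blurred copy of it.
sharp = np.zeros((32, 32))
sharp[:, 16:] = 255.0
blurred = filter2d(np.pad(sharp, 2, mode="edge"), np.ones((5, 5)) / 25.0)

sharp_f, blur_f = blur_features(sharp), blur_features(blurred)
print("Laplacian max, sharp vs blurred:", sharp_f[7], blur_f[7])
```

The resulting vectors could then be passed to any standard classifier; with scikit-learn, for instance, `sklearn.svm.SVC` would be one option for the SVM step.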

The results are as follows:

[Images: two result plots]

I guess if a more accurate model is required, we can try more complex models/architectures on the proposed pipeline. Thanks a lot.

PS: I used the dataset I mentioned above. Thanks to this article too: https://medium.com/data-science-ecom-express/a-simple-approach-for-blur-image-detection-535b3c55b596

Elisettygnanesh commented 7 months ago

Hello @ChakshuGautam @GautamR-Samagra, my name is Elisetty Gnanesh. I have written up a problem statement and a proposed solution for this project. Please take a look.

Problem:- The project encounters various challenges including acquiring diverse and high-quality datasets for pest detection and document handling, balancing model complexity with size constraints (<10MB), managing limited computational resources on mobile devices, ensuring model robustness across different contexts, determining appropriate evaluation metrics, integrating TFLite models into mobile applications efficiently, addressing ethical and privacy concerns related to sensitive data, and maintaining compliance with regulations. Overcoming these challenges requires a multidisciplinary approach, collaboration with domain experts, and adherence to ethical guidelines throughout the project lifecycle.

Solution:- Solutions include augmenting datasets through transfer learning and crowd-sourcing, employing model pruning and quantization to manage complexity within size constraints, optimizing models for mobile deployment via architecture optimization and hardware acceleration, applying domain adaptation techniques for robustness across contexts, selecting appropriate evaluation metrics tailored to each task, and integrating TFLite models into mobile apps efficiently through platform-specific optimizations. Additionally, implementing robust data encryption and access controls can address ethical and privacy concerns, while fostering collaboration and transparency among interdisciplinary teams ensures effective problem-solving and successful outcomes.

AbhimanyuSamagra commented 7 months ago

Do not ask process-related questions about how to apply and who to contact in the above ticket. The only questions allowed are about technical aspects of the project itself. If you want help with the process, you can refer to the instructions listed on Unstop, and any further queries can be taken up on our Discord channel titled DMP queries. Here's a Video Tutorial on how to submit a proposal for a project.

07-Atharv commented 7 months ago

Hello @ChakshuGautam @GautamR-Samagra, I am a Computer Science student specializing in Artificial Intelligence and Machine Learning. I have worked on various image processing, machine learning, and data analysis projects. I am interested in contributing to this repository.

Azazel0203 commented 7 months ago

Hey @ChakshuGautam @GautamR-Samagra,

I've developed a Colab notebook focused on training a model for leaf images. However, due to the extensive dataset I'm working with (54305 images), each epoch takes about 15 minutes, making it challenging to use the free tier of Google Colab for extended training sessions.

For now, I've conducted a dummy run with just one epoch, resulting in a sizable TensorFlow Lite model of approximately 14MB. With further experimentation, I believe we can reduce its size even more through trial and error.
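
As a rough back-of-the-envelope check on how far the 14MB figure could shrink, the sketch below estimates model size from weight precision alone. It assumes the 14MB file stores float32 weights and that weights dominate the file size; both are assumptions, since the notebook's architecture and conversion settings are not given in this thread. The helper names are hypothetical.

```python
# Bytes needed to store one weight at each precision.
BYTES_PER_WEIGHT = {"float32": 4, "float16": 2, "int8": 1}

def estimated_size_mb(num_params, dtype):
    """Approximate weight storage in MB, ignoring metadata and graph overhead."""
    return num_params * BYTES_PER_WEIGHT[dtype] / (1024 ** 2)

def params_from_size(size_mb, dtype="float32"):
    """Invert the estimate: how many weights fit in `size_mb` at this precision."""
    return int(size_mb * 1024 ** 2 / BYTES_PER_WEIGHT[dtype])

# Assumption: the ~14MB model reported above is stored in float32.
params = params_from_size(14.0)
for dtype in BYTES_PER_WEIGHT:
    print(f"{dtype}: ~{estimated_size_mb(params, dtype):.1f} MB")
```

Under these assumptions, float16 weights would come to roughly 7MB and int8 weights to roughly 3.5MB, so post-training quantization alone might bring the model under the 10MB limit. In TensorFlow Lite this is typically enabled on the converter with `converter.optimizations = [tf.lite.Optimize.DEFAULT]`; the accuracy impact would need to be measured on the actual model.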

I've included the link to the Colab notebook below. Your insights and suggestions on how to proceed would be invaluable.

Looking forward to your feedback. Thanks a bunch!

Colab Notebook Link

Gautam-Rajeev commented 7 months ago

Hi, this looks promising. Internally we were only considering a small model to check whether the image of the leaf has been taken properly; there is no need for the pest detection itself, since a bigger model could be used for that. The results do look good on the tiny model for classification itself, so I think we can take a crack at creating an entire model.

AyushSarangi commented 7 months ago

Hi @ChakshuGautam @GautamR-Samagra

I am Ayush Sarangi, a pre-final year student at IIT, Varanasi. I have a deep interest in problems related to computer vision, which brings me here. I trained a model on around 17k images that can detect whether plants have a disease based on images of their leaves. I trained the model for three epochs on Google Colab because of resource constraints and achieved around 71% accuracy. I expect this accuracy can be increased further without overfitting.

For your reference, I have attached the link to the notebook below. I am eagerly waiting for suggestions to improve it further. Thank you

notebook - https://colab.research.google.com/drive/1LxrZxS7Geq2PNPpysHSm0a4AMC0xeO2C?usp=sharing
dataset - https://www.kaggle.com/datasets/arjuntejaswi/plant-village

Mithilesh1609 commented 6 months ago

Hey @dennyabrain, Mithilesh here. I have experience with, and a passion for, creating end-to-end, highly scalable computer vision pipelines, and I am working with a young start-up as a machine learning engineer.

I worked as a lead in creating and scaling a computer-vision-based exam grading tool that works on ordinary mobile photos. It required robust document preprocessing (blur detection, shadow removal, image alignment) followed by checkbox and number detection, and it processed 40 million papers with Docker and AWS while maintaining >95% accuracy. I also just completed development of an offline app that is going to grade more than 120 million photos a year, for which I brought a bigger ML model (>100 MB in size) down to a tiny one (<5 MB) while retaining a similar level of accuracy and achieving a 20% decrease in inference time.
I have also worked with crop image data from drones to predict disease from leaves, by creating a method that first identifies the leaf area, crops each leaf, predicts on each leaf, and then aggregates the areas where disease is present.
I am very eager to contribute to this project, make a positive impact, and learn new things from the best in the field.

R-V-J commented 6 months ago

Hello @ChakshuGautam and @GautamR-Samagra, I am Rushi Jani, a pre-final year B.Tech. student at Veermata Jijabai Technological Institute (VJTI), Mumbai. I am keenly interested in advanced applications of Machine Learning and Image Processing. My exposure to TinyML and quantization techniques using TensorFlow Lite during my winter internship also draws me to this project. During that internship, I worked on a fruit classification model using transfer learning, which was then deployed on the available hardware, giving me end-to-end experience of an industrial application. I am truly passionate about kick-starting my open-source journey with this project, and I would like to take up the initial tasks/issues while going through a few technical papers to better understand the ongoing work in this area.

A-01-hub commented 6 months ago

Hi @ChakshuGautam @GautamR-Samagra, I am Aditya Suahne, a sophomore at Gyan Ganga Institute of Technology and Science pursuing a B.Tech. in Data Science. I have done projects related to deep learning and machine learning, such as Old Car Price Prediction, Speech Emotion Recognition, and Malaria Diagnosis using Deep Learning. I have also worked on deep learning models for images. I am interested in this project. For this project I think we can try multiple model architectures like CNN, ResCNN, UNets, ImageNet, etc., as well as some pre-trained models. Apart from that, we can integrate the model with a Flutter app that takes an image and tells whether a pest is present or not. I am looking forward to your guidance.

dcsgod commented 6 months ago

Hello @ChakshuGautam, I am Ravi Kumar, and I am working on this project. Previously I developed a similar project, a plant disease identifier, in Smart India Hackathon. So my question is: will you provide a dataset, or should we work with mine?

nsadana60 commented 6 months ago

Hi @ChakshuGautam @GautamR-Samagra, I'm Sadana. I graduated from Anurag Group of Institutions in Electronics and Communication Engineering. I want to contribute to this project.

krishnarathore12 commented 6 months ago

Hello @ChakshuGautam @GautamR-Samagra, I am Krishna Rathore, an undergraduate student at IIT Patna. I have a deep passion for AI, and recent advancements in NLP make me wonder about the future of AI.

Worked under a professor in the top 2% researchers list by Stanford
Smart India Hackathon 2023 finalist
Bronze medal holder in Inter-IIT Tech Competition 2023

Dataset: https://www.kaggle.com/datasets/mehaksingal/personal-identification-image-dataset-for-india

Created a CNN classifier in TensorFlow Lite that categorizes documents into 6 categories, and got around 82% accuracy on the test dataset.

Gautam-Rajeev commented 5 months ago

[x] SVM for blurry image detection -- 1
[x] document contour detection by feature matching
[x] alignment correction for doc (across 3 dimensions)
[x] document detection - using feature mapping - for some pre-defined govt docs
[x] OCR based keyword detection for docs
[ ] Sharpening / other pre processing required before OCR
[ ] Create a benchmark for OCR /entity detection for a list of documents tagged with metadata like 'disoriented', blurry etc
[ ] Quantization and TFLite conversion
[x] Entity detection for documents - detect dates from cheque etc
[ ] Pest detection - checking if the image is correct format for pest detection
[ ] Pest detection - object detection + classification

ACC :

fine tune whisper for judgement audios-

[ ] create a pipeline to get ~ accurate transcripts by using SOTA ASR models + GPT to correct text
[ ] Use Force alignment models to extract required audio data (created by @xorsuyash)
[ ] Fine tune whisper on Legal audio data for English
[ ] Fine tune whisper with numeric tokens only for Aadhar number detection

Gautam-Rajeev commented 4 months ago

Mid point demo showcase :

tiny model for improving image pre-processing:

[ ] Blur detect (motion) mention size etc .. show it through API
[ ] Rotation detect
[ ] Alignment correction

OCR + entity detection big model training (add to autotrain)

[ ] OCR matching
[ ] cheque number detection

Audio processing :

[ ] Whisper training add to autotrain
[ ] Showcase improvement before and after training on legal domain

Gautam-Rajeev commented 3 months ago

Weekly Goals

Week 1

[ ] Creating model for blur detection
[ ] Creating model for contour detection

Week 2

[ ] Creating model for rotation of doc
[ ] OCR for docs

Week 3

[ ] Creating entity extraction model using Florence

Week 4

[ ] Creating pest distance model

Week 5

[ ] Adding above model to ai-tools

Week 6

[ ] Adding Whisper to Autotune

Week 7

[ ] Adding quantization and ONNX to autotune

Week 8

[ ] Adding BGE embedding fine tuning to Autotune

Week 9

[ ] Adding reranker finetuning to Autotune

Week 10

[ ] Setting up ONNX model inference on Krishi Sahayak app

Week 11

[ ] Creating setup to create synthetic data for training all models in RAG pipeline for each message

Week 12

[ ] Documentation

kartikbhtt7 commented 3 months ago

Weekly Goals

Week 1

[x] Creating model for blur detection
[x] Creating model for contour detection
PR Links: Blur Detection, Contour Detect

Week 2

[x] Creating model for rotation of doc
[x] OCR for docs
PR Links: Rotation Model, OCR

Week 3

[x] Creating entity extraction model using Florence
PR Links: Entity Extraction & Match

Week 4

[x] Creating pest distance model
Issue Link: Pest Distance Model

Week 5

[x] Adding above model to ai-tools

Week 6

[x] Adding Whisper to Autotune
PR Links: Whisper Autotune

Week 7

[x] Adding quantization and ONNX to autotune
PR Links: Quantization, ONNX

Samagra-Development / ai-tools