abhisheks008 / DL-Simplified

Deep Learning Simplified is an open-source repository containing beginner- to advanced-level deep learning projects for contributors who want to start their journey in deep learning. Devfolio URL: https://devfolio.co/projects/deep-learning-simplified-f013
MIT License
391 stars, 339 forks

Diabetes Prediction using DL #285

Closed: tarunvyshnav777 closed this issue 5 months ago

tarunvyshnav777 commented 1 year ago

Deep Learning Simplified Repository (Proposing new issue)

:red_circle: Project Title : Diabetes Prediction
:red_circle: Aim : To create a DL model that predicts whether a person has diabetes or not.
:red_circle: Dataset : https://www.kaggle.com/datasets/mathchi/diabetes-data-set
:red_circle: Approach : Implement 3-4 algorithms, then compare them by accuracy score to find the best-fitting one. Also, do not forget to do an exploratory data analysis before creating any model.
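A first exploratory pass for the approach above might check the class balance and count implausible zero readings before any model is built. The sketch below is illustrative only: the column names (`Glucose`, `BMI`, `Outcome`) are assumed to match the Kaggle CSV, the inline rows are made up so the snippet runs without the download, and `summarize` is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration only; replace with the downloaded CSV.
# Column names are assumed to match the Kaggle file.
SAMPLE_CSV = """Glucose,BMI,Outcome
148,33.6,1
0,26.6,0
183,0,1
85,28.1,0
"""

def summarize(csv_text, label="Outcome"):
    """Return class counts and, per feature column, how many values are zero."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    class_counts = Counter(row[label] for row in rows)
    zero_counts = {
        col: sum(1 for row in rows if float(row[col]) == 0.0)
        for col in rows[0] if col != label
    }
    return class_counts, zero_counts

class_counts, zero_counts = summarize(SAMPLE_CSV)
print("class balance:", dict(class_counts))
print("zero readings per column:", zero_counts)
```

A glucose or BMI reading of zero is not physiologically plausible, so such values most likely stand for missing measurements and may need imputing before training.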


📍 Follow the Guidelines to Contribute in the Project :


:red_circle::yellow_circle: Points to Note :


:white_check_mark: To be Mentioned while taking the issue :


Happy Contributing 🚀

All the best. Enjoy your open source journey ahead. 😎

abhisheks008 commented 1 year ago

What are the deep learning methods you want to implement here? @tarunvyshnav777

vanya-anya commented 1 year ago

Full name : Ananya Sen
GitHub Profile Link : https://github.com/vanya-anya
Email ID : senanya3014@gmail.com
Participant ID (if applicable):
Approach for this Project :
  1. Prepare the labeled dataset and split it into training and testing sets.
  2. Extract or engineer relevant features from the data.
  3. Instantiate an SVM classifier with the chosen kernel function and hyperparameters.
  4. Train the SVM classifier using the training data to find the optimal hyperplane.
  5. Evaluate the model's performance on the testing set and use it for making predictions on new data.

What is your participant role? SSOC' 2023

tarunvyshnav777 commented 1 year ago

> What are the deep learning methods you want to implement here? @tarunvyshnav777

@abhisheks008 I'm confused here. Do we only need to work with DL models? SVM seems like the best choice given the nature of the dataset (https://www.kaggle.com/datasets/mathchi/diabetes-data-set). If SVM is fine, please assign this to me under SSOC'23.

riiyaa24 commented 1 year ago

Full name : Riya Parag Dhanduke
GitHub Profile Link : https://github.com/riiyaa24
Email ID : riya2405.rana@gmail.com
Participant ID (if applicable):
Approach for this Project :
  1. Prepare the labeled dataset and split it into training and testing sets.
  2. Extract or engineer relevant features from the data.
  3. Instantiate an SVM classifier with the chosen kernel function and hyperparameters.
  4. Train the SVM classifier using the training data to find the optimal hyperplane.
  5. Evaluate the model's performance on the testing set and use it for making predictions on new data.

What is your participant role? SSOC' 2023, as a contributor in the development of open source projects

chandrima200 commented 1 year ago

Full name : Chandrima Paul
GitHub Profile Link : https://github.com/chandrima200
Email ID : ichandrimapaul@gmail.com
Participant ID (if applicable):
Approach for this Project :
  1. Prepare the labeled dataset and split it into training and testing sets.
  2. Extract or engineer relevant features from the data.
  3. Instantiate an SVM classifier with the chosen kernel function and hyperparameters.
  4. Train the SVM classifier using the training data to find the optimal hyperplane.
  5. Evaluate the model's performance on the testing set and use it for making predictions on new data.

What is your participant role? SSOC' 2023

sarmistha-02 commented 1 year ago

  1. Prepare the labeled dataset and split it into training and testing sets.
  2. Extract or engineer relevant features from the data.
  3. Instantiate an SVM classifier with the chosen kernel function and hyperparameters.
  4. Train the SVM classifier using the training data to find the optimal hyperplane.
  5. Evaluate the model's performance on the testing set and use it for making predictions on new data.
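
The five steps above could be sketched without any dependencies as follows. Note that this is a simplified linear SVM trained by stochastic subgradient descent on the hinge loss, swapped in for a library SVM classifier with a selectable kernel. The toy points, the `train_linear_svm` and `predict` helpers, and the hyperparameter values are all assumptions made for illustration, not taken from the thread.

```python
# Toy, made-up 2-D points standing in for scaled features; labels are +1 / -1.
train_X = [(2, 2), (3, 2), (2, 3), (3, 3), (-2, -2), (-3, -2), (-2, -3), (-3, -3)]
train_y = [1, 1, 1, 1, -1, -1, -1, -1]
test_X = [(4, 4), (1.5, 2.5), (-4, -4), (-2.5, -1.5)]
test_y = [1, 1, -1, -1]

def train_linear_svm(X, y, lr=0.1, reg=0.01, epochs=50):
    """Fit w, b by per-sample subgradient descent on the regularized hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Point violates the margin: step toward classifying it correctly.
                w = [wj - lr * (reg * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Point is outside the margin: apply only the regularization shrinkage.
                w = [wj - lr * reg * wj for wj in w]
    return w, b

def predict(w, b, x):
    """Classify x by which side of the hyperplane w.x + b = 0 it falls on."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

w, b = train_linear_svm(train_X, train_y)
accuracy = sum(predict(w, b, x) == t for x, t in zip(test_X, test_y)) / len(test_y)
print(f"weights={w}, bias={b}, test accuracy={accuracy:.2f}")
```

On real data, a library implementation with feature scaling and a tuned kernel would normally replace the hand-written training loop.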
abhisheks008 commented 1 year ago

> > What are the deep learning methods you want to implement here? @tarunvyshnav777
>
> @abhisheks008 I'm confused here, we only need to work with dl models? But SVM is the best choice because of nature of dataset https://www.kaggle.com/datasets/mathchi/diabetes-data-set if SVM is fine please assign this under SSOC'23

@tarunvyshnav777 try to find a dataset that is compatible with deep learning methods.

OWAIS-THEGREAT commented 1 year ago

Full name : Mohammed Owais
GitHub Profile Link : https://github.com/OWAIS-THEGREAT
Email ID : mohammedowais6361@gmail.com
Participant ID (if applicable):
Approach for this Project : First I will analyze the datasets in the link. Then I will remove those pictures that are not suitable for processing. Then I will preprocess the data to make it suitable for the Models. In my opinion, the best approach is to apply ANN to this dataset. Then I will use different parameters to get the best results. Will provide the most accurate model for this repo. I think I have the required amount of knowledge for this issue.
What is your participant role? SSOC' 2023

abhisheks008 commented 1 year ago

> Full name : Mohammed Owais GitHub Profile Link : https://github.com/OWAIS-THEGREAT Email ID : mohammedowais6361@gmail.com Participant ID (if applicable): Approach for this Project : First I will analyze the datasets in the link. Then I will remove those pictures that are not suitable for processing. Then I will preprocess the data to make it suitable for the Models. In my opinion, the best approach is to apply ANN to this dataset. Then I will use different parameters to get the best results. Will provide the most accurate model for this repo. I think I have the required amount of knowledge for this issue. What is your participant role? SSOC' 2023

The mentioned dataset does not contain any image files. It is suitable for ML models, not for image processing methods.

OWAIS-THEGREAT commented 1 year ago

Full name : Mohammed Owais
GitHub Profile Link : https://github.com/OWAIS-THEGREAT
Email ID : mohammedowais6361@gmail.com
Participant ID (if applicable):
Approach for this Project : First I will analyze the datasets in the link. Then I will preprocess the data to make it suitable for the models. In my opinion, the best approach is to apply an ANN to this dataset, and I will also try different models. Then I will try different parameters to get the best results and provide the most accurate model for this repo. I think I have the required knowledge for this issue.
What is your participant role? SSOC' 2023

tarunvyshnav777 commented 1 year ago

> > > What are the deep learning methods you want to implement here? @tarunvyshnav777
> >
> > @abhisheks008 I'm confused here, we only need to work with dl models? But SVM is the best choice because of nature of dataset https://www.kaggle.com/datasets/mathchi/diabetes-data-set if SVM is fine please assign this under SSOC'23
>
> @tarunvyshnav777 try to find out such a dataset which is compatible with deep learning methods.

@abhisheks008 The dataset: https://www.kaggle.com/datasets/iammustafatz/diabetes-prediction-dataset

Approach for this Project:

  1. Extract or engineer relevant features from the data and perform EDA on it.
  2. Preprocess the data, handle missing values, and split it into training and testing sets.
  3. Use a Sequential model in TensorFlow/Keras.
  4. Implement 2-3 algorithms such as MLP, CNN, and RNN.
  5. Compare the algorithms based on accuracy scores to determine the best-fitted model.
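
As a rough illustration of steps 2 and 3 above, the sketch below scales features with statistics from the training rows only and then defines a small Keras `Sequential` MLP. The layer sizes, the `standardize` and `build_mlp` helper names, and the two-feature toy rows are assumptions rather than anything specified in the thread. TensorFlow is imported lazily inside `build_mlp`, so the scaling helper runs even where TensorFlow is not installed.

```python
import statistics

def standardize(train_rows, other_rows):
    """Scale each feature column using the mean and std of the training rows only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) or 1.0 for col in columns]  # guard constant columns

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

    return scale(train_rows), scale(other_rows)

def build_mlp(n_features):
    """Build a small MLP binary classifier; the layer sizes are arbitrary choices."""
    import tensorflow as tf  # lazy import: only needed when a model is actually built

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # predicted probability of diabetes
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

# Made-up rows (two features each) to show the scaling step.
train_scaled, test_scaled = standardize([[1.0, 10.0], [3.0, 30.0]], [[2.0, 20.0]])
print(train_scaled, test_scaled)
```

With TensorFlow installed, `build_mlp(len(train_scaled[0]))` returns a compiled model that can be trained with `model.fit` once the rows are converted to NumPy arrays.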

@abhisheks008 Assign this under SSOC'23.

abhisheks008 commented 1 year ago

Go ahead @tarunvyshnav777. Issue assigned to you.

aindree-2005 commented 11 months ago

Full name : Aindree Chatterjee
GitHub Profile Link : https://github.com/aindree-2005
Email ID : aindree2005@gmail.com
Participant ID (if applicable): CodePeak 2023
Approach for this Project : SVC vs ANN comparison

OnePunchMonk commented 11 months ago

Full name : Avaya Aggarwal
GitHub Profile Link : https://github.com/OnePunchMonk
Email ID : aggarwal.avaya27@gmail.com
Participant ID (if applicable): CodePeak 2023
Approach for this Project : PyCaret for AutoML, Ensembling

YashSachan2 commented 11 months ago

Hi @abhisheks008, please assign this issue to me. I have worked on a similar problem, predicting whether a person is healthy or not, using the same approach you mentioned.

KunalSharmaGit commented 10 months ago

Full Name: Kunal Sharma
GitHub Profile Link: https://github.com/KunalSharmaGit
Email ID: kunalsharma8630@gmail.com
Approach For this Project: Preprocess the data and split it into training, validation and test sets. Design and train deep learning models such as a feedforward neural network, a convolutional neural network (CNN) and a recurrent neural network (RNN). Compare their accuracy scores on the test set to identify the most effective model for diabetes prediction.
What is your Participant Role? SWOC'23
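
The split-and-compare workflow in this comment might look like the sketch below. It is a dependency-free illustration under stated assumptions: the helper names (`three_way_split`, `accuracy`, `pick_best`) are hypothetical, the data is made up, and simple threshold rules stand in for the trained networks. One design choice differs slightly from the comment: the model is picked on the validation set and only then scored on the test set, so the test score is not used for selection.

```python
import random

def three_way_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle a copy of rows and cut it into train, validation, and test lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

def accuracy(predict, rows):
    """Fraction of (features, label) rows for which predict(features) == label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)

def pick_best(models, val_rows):
    """Return the name of the model with the highest validation accuracy."""
    return max(models, key=lambda name: accuracy(models[name], val_rows))

# Made-up data: one feature, label is 1 when the feature exceeds 5.
rows = [((v,), int(v > 5)) for v in range(20)]
train, val, test = three_way_split(rows)

# Hypothetical stand-ins for trained networks: simple threshold rules.
models = {
    "threshold_5": lambda x: int(x[0] > 5),
    "threshold_12": lambda x: int(x[0] > 12),
}
best = pick_best(models, val)
print(best, accuracy(models[best], test))
```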

abhisheks008 commented 10 months ago

Cool @KunalSharmaGit issue assigned to you. You can start working on it.

Suggestion: Try ResNet or BERT for this project and see what results they produce.

KunalSharmaGit commented 10 months ago

@abhisheks008 Thanks for assigning this.

sayanta28 commented 6 months ago

Full Name: Sayanta Chowdhury
GitHub Profile Link: https://github.com/sayanta28
Email ID: sayanta28@gmail.com
Approach For this Project: I will split the dataset into training, validation and test sets. Then I will design DL models such as LSTM, CNN and RNN and train them on the dataset. I will compare the accuracy scores of those models on the test set to identify the most effective model for diabetes prediction.
What is your Participant Role? GSSoC'24

Gaurav-576 commented 6 months ago

Full name : Gaurav Kumar Singh
GitHub Profile Link : https://github.com/gaurav-576
Email ID : gauravsingh96753@gmail.com
Participant ID (if applicable):
Approach for this Project : I would like to prepare the labeled dataset and split it into training and testing sets.

Then I would fit an Artificial Neural Network to the dataset while making sure it does not overfit. I would also like to add a point about the confusion matrix: since this is a diabetes prediction model, I would focus on minimising false negatives by prioritising recall while maintaining good overall accuracy. In other words, the model should not predict that a person is non-diabetic when they are in fact diabetic, since that is the costly mistake in a real-life scenario. Finally, I would evaluate the model's performance on the testing set and use it for making predictions on new data.

What is your participant role? GSSoC'24
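
The recall-focused evaluation described in this comment can be made concrete with a small sketch. The labels and predicted probabilities below are made up, and `confusion_counts` and `recall_and_accuracy` are hypothetical helper names. The point is only to show how lowering the decision threshold trades false negatives for false positives.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means diabetic."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn

def recall_and_accuracy(y_true, probs, threshold):
    """Threshold predicted probabilities, then compute recall and accuracy."""
    y_pred = [int(p >= threshold) for p in probs]
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(y_true)
    return recall, accuracy

# Made-up labels and model probabilities, for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.6, 0.4, 0.45, 0.2, 0.1]

for threshold in (0.5, 0.35):
    recall, accuracy = recall_and_accuracy(y_true, probs, threshold)
    print(f"threshold={threshold}: recall={recall:.2f}, accuracy={accuracy:.2f}")
```

In this toy case the lower threshold catches the diabetic case that was missed at 0.5, at the cost of one extra false positive, while accuracy stays the same.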

abhisheks008 commented 6 months ago

Hi @Gaurav-576 can you specify the algorithms you are going to use here for this project?

Gaurav-576 commented 6 months ago

Diabetes prediction is a binary classification problem, so the machine learning algorithms I would like to try are Logistic Regression, Support Vector Machines (SVM) and Random Forest. If these do not fit the problem well or do not give good accuracy, I would then try k-nearest neighbours (k-NN). For deep learning, the most suitable approach would be to build an artificial neural network, experimenting with different activation functions and with optimizers such as Stochastic Gradient Descent or Adam, and keeping whichever configuration fits the data properly and gives high accuracy on both the training and test data.

abhisheks008 commented 6 months ago

Hi @Gaurav-576 one issue at a time.

Aryanmartinian commented 6 months ago

Hi, I would like to contribute to this issue under GSSoC'24 as a contributor. Please assign this issue to me.

abhisheks008 commented 6 months ago

Hi @Aryanmartinian, can you please comment as per the issue template?

Aryanmartinian commented 6 months ago

Full Name - Aryan Mishra
Github Profile Link - https://github.com/Aryanmartinian
Email ID - aryan.martinian@gmail.com
Approach for this Project - I would first prepare the dataset and do the EDA on it, and then divide it into training and testing sets. Then I would fit an ANN to the dataset, use optimizers to improve the performance of the model, and evaluate the model on the selected metrics.

Aryanmartinian commented 6 months ago

Please assign me this issue

abhisheks008 commented 6 months ago

Hi @Aryanmartinian, you need to be specific with the approach. You have to give a detailed approach before taking an issue; it's obvious that neural networks will be used here. You need to be specific about the models/algorithms you are planning to use.

saikrishna823 commented 6 months ago

Hi @abhisheks008, could you please assign this issue to me? I have experience working with machine learning and deep learning.
Full Name: Mule Sai Krishna Reddy
Github Profile Link: https://github.com/saikrishna823
Email ID: 20131a05f4@gvpce.ac.in
Participant ID (if applicable):
Approach for this Project : Since this is a binary classification problem, I will use algorithms such as Logistic Regression, SVM, Random Forest, Decision Trees and XGBoost, and compare the accuracies of all the models to find the best one. I will also build an ANN model, trying different activation functions to get better accuracy, and use TabNet, which is a pre-trained model. For user interaction, I will create a web interface using Streamlit or Flask. Please assign me this issue with the proper level tag. Looking forward to contributing to this issue.
What is your participant role? I am participating as a contributor through GSSoC'24.

abhisheks008 commented 6 months ago

Hi @saikrishna823, as this repo is solely for deep learning projects, please make sure your approach uses deep learning rather than simple machine learning methods. You can share your enhanced/updated approach for this project.

Sgvkamalakar commented 6 months ago

@abhisheks008 I'll submit a pull request following the guidelines within the next few days. Could you assign issue #285 to me? 🙏🏻

abhisheks008 commented 6 months ago

Looks good to me @Sgvkamalakar. Issue assigned to you.

Sgvkamalakar commented 6 months ago

Thanks @abhisheks008. I will raise a PR soon ✌🏼