charann29 / opensource

ML Model Datasets Using Streamlit

This repository contains my implementations of machine learning models, built with Streamlit in Python.

Website Image


Mode of Execution Used: PyCharm and Streamlit

PyCharm and Streamlit Overview

PyCharm

Official Website

Visit the official website of PyCharm: JetBrains PyCharm

Download

Download PyCharm according to your platform:

Versions

  1. Community Version

    • Free and open-source.
    • Available for download at the bottom of the PyCharm website.
    • Install it using the setup wizard.
  2. Professional Version

    • Available at the top of the PyCharm website.
    • Download and follow setup instructions.
    • Choose between the free trial and the paid version.

Using PyCharm


Streamlit Server

Overview

Installation

To install Streamlit, run the following command in your terminal:

pip install streamlit

Usage


Running Project in Streamlit Server

Make sure all dependencies are installed before running the app.

  1. Run the Streamlit app directly with the following command:
    streamlit run app.py

    where app.py is the name of the file containing the Streamlit code.

By default, Streamlit runs on port 8501.

You can also run several apps at the same time; each additional app is served on the next free port (8502, 8503, and so on).

  2. Navigate to the URL http://localhost:8501

You should be able to view the homepage of your app.

🌟 The project and models will change, but this process remains the same for all Streamlit projects.

Deploying using Streamlit

  1. Visit the official Streamlit website: Streamlit

  2. Create an account using GitHub.

  3. Push all of your code to a GitHub repository.

  4. On Streamlit, choose the option to create a new deployment.

  5. Enter your GitHub repository name and specify the file name. If your file is named streamlit_app.py it is picked up directly; otherwise you have to specify its path.

  6. Make sure all the libraries your app needs are listed in a requirements.txt file in the repository.

  7. Package versions can be pinned in this file, for example library_name==1.2.3. The Python version itself is not set there; it is selected in the deployment settings.

  8. Streamlit installs every dependency listed in the requirements file, at the versions given there.

  9. If everything goes well, the app is deployed on the web, and you can share the link and open the app from any browser.
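
One possible way to produce the lines for the requirements file is to read the versions installed in your local environment. This is only a sketch: pinned_requirements is a hypothetical helper, and the package list is an assumption based on the imports used in this project.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirements(packages):
    """Return requirements-file lines pinned to the locally installed versions."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Not installed locally: leave it unpinned rather than guess a version.
            lines.append(name)
    return lines


if __name__ == "__main__":
    # Assumed package list; adjust it to what your app actually imports.
    for line in pinned_requirements(
        ["streamlit", "scikit-learn", "numpy", "pandas", "matplotlib", "seaborn"]
    ):
        print(line)
```

The printed lines can be pasted into the requirements file before pushing the repository.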

About Project


Algorithms Used

Supervised Learning

i) K-Nearest Neighbors (KNN)

ii) Support Vector Machines (SVM)

iii) Naive Bayes Classifiers

iv) Decision Tree

v) Random Forest

vi) Linear Regression

vii) Logistic Regression


Datasets Used

Iris Dataset

Breast Cancer Dataset

Wine Dataset

Digits Dataset

Diabetes Dataset

Naive Bayes Classification Data

Cars Evaluation Dataset

Salary Dataset
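
The first five datasets in this list ship with scikit-learn, so they can be loaded without any files. The sketch below shows one way to do that; LOADERS and load_xy are hypothetical names chosen for illustration. The remaining three datasets are not part of scikit-learn and would have to be read from files.

```python
from sklearn import datasets

# Loader functions for the five datasets that ship with scikit-learn.
LOADERS = {
    "Iris": datasets.load_iris,
    "Breast Cancer": datasets.load_breast_cancer,
    "Wine": datasets.load_wine,
    "Digits": datasets.load_digits,
    "Diabetes": datasets.load_diabetes,
}


def load_xy(name):
    """Return (features, target) arrays for a built-in dataset name."""
    data = LOADERS[name]()
    return data.data, data.target


for name in LOADERS:
    X, y = load_xy(name)
    print(f"{name}: {X.shape[0]} samples, {X.shape[1]} features")
```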


Libraries Used 📚 💻

Below is a short description of all the libraries used:

To install a Python library, use the following command:

pip install library_name

Resources

This resource provides an in-depth explanation of the differences between classification and regression tasks in machine learning, detailing their purposes, methods, and use cases.

Code Imports

Importing NumPy

import numpy as np

Why? NumPy is fundamental for numerical computations, providing support for arrays and matrices, and mathematical functions.

Importing Pandas to read CSV files

import pandas as pd

Why? Pandas is used for data manipulation and analysis, handling numerical tables and time series, and reading various file formats.
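
As a small illustration of reading a CSV into a DataFrame, the sketch below parses inline text instead of a file on disk. The column names and values are made up for the example and are not the repository's salary data.

```python
import io

import pandas as pd

# Inline CSV text stands in for a file on disk; columns and values are made up.
csv_text = """YearsExperience,Salary
1.0,40000
2.5,48000
4.0,60000
"""

df = pd.read_csv(io.StringIO(csv_text))

X = df[["YearsExperience"]].to_numpy()  # 2-D feature matrix
y = df["Salary"].to_numpy()             # 1-D target vector
print(df.describe())
```

With a real file, the same call takes a path instead of the StringIO object.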

Importing datasets from sklearn

from sklearn import datasets

Why? Scikit-learn provides built-in datasets useful for practicing machine learning algorithms.

For splitting between training and testing

from sklearn.model_selection import train_test_split

Why? The train_test_split function splits a dataset into training and testing sets, which is crucial for evaluating model performance on data the model has not seen.
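
A minimal sketch of a split on the iris data follows. The 80/20 ratio, the random_state value, and the use of stratify are choices made for this example, not settings confirmed by the repository.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, y = datasets.load_iris(return_X_y=True)

# Hold out 20% for testing; stratify keeps class proportions equal in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape)  # (120, 4) (30, 4)
```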

Importing Algorithm for Support Vector Machine

from sklearn.svm import SVC, SVR

Why? SVC and SVR are Support Vector Machine implementations for classification and regression tasks.
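
The sketch below pairs each class with a matching task: SVC on the iris labels and SVR on the diabetes target. The kernel, the C value, and the split settings are assumptions for illustration.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score, mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, SVR

# Classification: SVC predicts the iris species.
X, y = datasets.load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = SVC(C=1.0, kernel="rbf").fit(X_tr, y_tr)
svc_accuracy = accuracy_score(y_te, clf.predict(X_te))

# Regression: SVR predicts the continuous diabetes progression score.
X, y = datasets.load_diabetes(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
reg = SVR(C=1.0, kernel="rbf").fit(X_tr, y_tr)
svr_mae = mean_absolute_error(y_te, reg.predict(X_te))

print(f"SVC accuracy: {svc_accuracy:.3f}, SVR mean absolute error: {svr_mae:.1f}")
```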

Importing K-nearest neighbors algorithm

from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

Why? KNeighborsClassifier and KNeighborsRegressor are implementations of k-nearest neighbors for classification and regression tasks.
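
To make the neighbour logic visible, this sketch uses a toy one-dimensional dataset (invented for the example) where the nearest points can be checked by hand.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

# Toy one-dimensional data so the neighbours can be checked by hand.
X = np.array([[0.0], [1.0], [2.0], [3.0]])

# Classifier: majority vote among the k nearest training points.
labels = np.array([0, 0, 1, 1])
clf = KNeighborsClassifier(n_neighbors=3).fit(X, labels)
predicted_class = clf.predict([[2.6]])[0]
# Nearest points are x=3, 2, 1 with labels 1, 1, 0, so the vote is class 1.

# Regressor: average of the k nearest targets.
targets = np.array([0.0, 10.0, 20.0, 30.0])
reg = KNeighborsRegressor(n_neighbors=2).fit(X, targets)
predicted_value = reg.predict([[1.4]])[0]
# Nearest points are x=1 and x=2, so the prediction is (10 + 20) / 2 = 15.

print(predicted_class, predicted_value)
```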

Importing Decision Tree algorithm

from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

Why? DecisionTreeClassifier and DecisionTreeRegressor are implementations of decision tree algorithms.
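
A decision tree can be inspected after training, which is one reason it is a useful teaching model. The sketch below fits a shallow classifier on iris and prints its rules; max_depth=3 and the split settings are assumptions made for this example.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = datasets.load_iris()
X_tr, X_te, y_tr, y_te = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=0
)

# A shallow tree is easier to read and less prone to overfitting.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
tree_accuracy = tree.score(X_te, y_te)

# Print the learned if/else rules using the dataset's feature names.
print(export_text(tree, feature_names=list(iris.feature_names)))
```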

Importing Random Forest algorithm

from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

Why? RandomForestClassifier and RandomForestRegressor are ensemble methods based on decision trees for classification and regression tasks.
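
As a sketch of the ensemble in use, the example below trains a forest on the breast cancer data and lists the features it relied on most. The number of trees and the split settings are illustrative assumptions.

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

data = datasets.load_breast_cancer()
X_tr, X_te, y_tr, y_te = train_test_split(
    data.data, data.target, test_size=0.2, random_state=0
)

# 100 trees, each trained on a bootstrap sample with random feature subsets.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
forest_accuracy = forest.score(X_te, y_te)

# Importances are averaged over the trees and sum to 1.
ranked = sorted(
    zip(data.feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked[:5]:
    print(f"{name}: {importance:.3f}")
```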

Importing Naive Bayes algorithm

from sklearn.naive_bayes import GaussianNB

Why? GaussianNB is a Naive Bayes classifier implementation suitable for classification tasks.
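
Besides hard predictions, GaussianNB can report class probabilities. The sketch below shows both on the wine data; the dataset choice and split settings are assumptions for illustration.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = datasets.load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# GaussianNB models each feature as a per-class normal distribution.
nb = GaussianNB().fit(X_tr, y_tr)
nb_accuracy = nb.score(X_te, y_te)

# predict_proba returns one probability per class; each row sums to 1.
probabilities = nb.predict_proba(X_te[:3])
print(probabilities.round(3))
```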

Importing Linear and Logistic Regression

from sklearn.linear_model import LinearRegression, LogisticRegression

Why? LinearRegression fits a linear relationship between the input features and a continuous target. LogisticRegression, despite its name, is a classifier; it handles both binary and multiclass tasks.
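
The sketch below contrasts the two models. The linear data is synthetic and noise-free (generated from y = 2x + 1) so the fitted coefficients can be checked, and the logistic model is fitted on iris, which shows that scikit-learn's LogisticRegression also accepts more than two classes. The max_iter value is an assumption chosen to let the solver converge.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LinearRegression, LogisticRegression

# Linear regression on noise-free synthetic data generated from y = 2x + 1.
x = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * x.ravel() + 1.0
lin = LinearRegression().fit(x, y)
print(lin.coef_[0], lin.intercept_)  # approximately 2.0 and 1.0

# Logistic regression on iris, which has three classes.
X_iris, y_iris = datasets.load_iris(return_X_y=True)
log = LogisticRegression(max_iter=1000).fit(X_iris, y_iris)
print(log.classes_)  # the class labels the model learned
```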

Importing accuracy_score, mean_squared_error and mean_absolute_error

from sklearn.metrics import mean_squared_error, accuracy_score, mean_absolute_error

Why? These metrics evaluate model performance: mean_squared_error for regression, accuracy_score for classification accuracy, and mean_absolute_error for regression.
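
A worked example with tiny hand-made arrays (invented for illustration) shows what each metric computes:

```python
from sklearn.metrics import accuracy_score, mean_absolute_error, mean_squared_error

# Regression: the errors (true minus predicted) are 1, 0 and -2.
y_true = [3.0, 5.0, 2.0]
y_pred = [2.0, 5.0, 4.0]
mse = mean_squared_error(y_true, y_pred)   # (1 + 0 + 4) / 3
mae = mean_absolute_error(y_true, y_pred)  # (1 + 0 + 2) / 3 = 1.0

# Classification: three of the four labels match.
acc = accuracy_score([0, 1, 1, 0], [0, 1, 0, 0])  # 3 / 4 = 0.75

print(mse, mae, acc)
```

Squaring makes mean_squared_error penalize the single large error more heavily than mean_absolute_error does.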

Importing PCA for dimension reduction

from sklearn.decomposition import PCA

Why? PCA reduces data dimensionality by projecting it onto a lower-dimensional space.
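
As a sketch, the four iris features can be projected onto two principal components, which is convenient for a 2-D scatter plot. The choice of two components is an assumption for this example.

```python
from sklearn import datasets
from sklearn.decomposition import PCA

X, y = datasets.load_iris(return_X_y=True)

# Project the four iris features onto the two directions of highest variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X_2d.shape)                      # (150, 2)
print(pca.explained_variance_ratio_)   # share of variance kept by each component
```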

For Plotting

import matplotlib.pyplot as plt
import seaborn as sns

Why? Matplotlib and Seaborn are used for visualizing data, exploring patterns, and presenting results.
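
The two libraries work together: Seaborn draws onto a Matplotlib axes object. The sketch below plots two iris features coloured by species and renders the figure to an in-memory PNG. The feature pair, the title, and the Agg backend are choices made for this example.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import datasets

iris = datasets.load_iris()
species = iris.target_names[iris.target]  # map 0/1/2 to species names

fig, ax = plt.subplots(figsize=(6, 4))
sns.scatterplot(x=iris.data[:, 2], y=iris.data[:, 3], hue=species, ax=ax)
ax.set_xlabel(iris.feature_names[2])
ax.set_ylabel(iris.feature_names[3])
ax.set_title("Iris petal measurements")

# Render to an in-memory PNG; in a Streamlit app, st.pyplot(fig) would show it.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```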

For model deployment

import streamlit as st

Why? Streamlit is a framework for building web applications for machine learning and data science, used here for deploying models and creating interactive data applications.

Importing Label Encoder for converting strings to integers

from sklearn.preprocessing import LabelEncoder

Why? LabelEncoder converts categorical labels to numerical labels, necessary for many machine learning algorithms.
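
A short sketch of the encoding round trip follows. The string labels are hypothetical, loosely modelled on the kind of categories found in a car-evaluation table.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical string labels; not taken from the repository's data.
labels = ["low", "med", "high", "med"]

encoder = LabelEncoder()
codes = encoder.fit_transform(labels)

print(encoder.classes_.tolist())  # ['high', 'low', 'med'] (sorted alphabetically)
print(codes.tolist())             # [1, 2, 0, 2]
print(encoder.inverse_transform(codes).tolist())  # back to the original strings
```

The integer codes follow alphabetical order of the classes, so they carry no meaningful ranking of low, med, and high.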