Sepsis-Classification-with-FastAPI

This project is focused on the accurate and efficient classification of sepsis cases using the FastAPI framework. Sepsis is a critical medical condition that requires prompt identification and treatment.

[Figure: Streamlit input]

This project aims to provide a streamlined solution for healthcare professionals to classify sepsis cases quickly and effectively.

Project Overview
Project Setup
Data
Modeling
Evaluation
Deployment
Future Work
Contact

Project Overview

The "Sepsis Classification with FastAPI" project aims to develop an accurate and efficient classification system for sepsis cases using the FastAPI framework. Sepsis is a life-threatening condition that requires immediate medical attention. This project addresses the critical need for timely identification and classification of sepsis cases to facilitate prompt treatment and improve patient outcomes.

The objectives of the project are as follows:

Train a machine learning model on a diverse dataset of sepsis cases to accurately predict the likelihood of sepsis in patients.
Utilize the FastAPI framework to create a user-friendly and efficient web interface for healthcare professionals to interact with the sepsis classification model.
Improve diagnostic capabilities by achieving high accuracy, sensitivity, and specificity in sepsis classification.
Provide a comprehensive and scalable solution that can be easily deployed in real-time healthcare environments.

Key challenges in this project include acquiring and preprocessing a reliable sepsis dataset, selecting an appropriate machine learning algorithm, optimizing the model's performance, and deploying the system in a secure and efficient manner.

Summary

Code	Name	Published Article	Deployed App	Streamlit App
LP6	Sepsis Prediction App with FastAPI and Streamlit	Medium Article	FastAPI App	Streamlit App

Project Setup

To set up the project environment, follow these steps:

Clone the repository and move into the project directory:

git clone https://github.com/aliduabubakari/Sepsis-Classification-with-FastAPI.git
cd Sepsis-Classification-with-FastAPI

Create and activate a virtual environment:

Windows:

python -m venv venv
venv\Scripts\activate

Linux & MacOS:

python3 -m venv venv
source venv/bin/activate

Install the required dependencies inside the activated environment:

pip install -r requirements.txt

Each command above can be copied and run in your terminal to set up the project environment.

Data

The data used in this project consists of a diverse collection of sepsis cases obtained from Sepsis.

Data Fields

Column Name	Data Features	Description
ID	N/A	Unique number to represent patient ID
PRG	Attribute 1	Plasma glucose
PL	Attribute 2	Blood Work Result-1 (mu U/ml)
PR	Attribute 3	Blood Pressure (mm Hg)
SK	Attribute 4	Blood Work Result-2 (mm)
TS	Attribute 5	Blood Work Result-3 (mu U/ml)
M11	Attribute 6	Body mass index (weight in kg/(height in m)^2)
BD2	Attribute 7	Blood Work Result-4 (mu U/ml)
Age	Attribute 8	Patient's age (years)
Insurance	N/A	If a patient holds a valid insurance card
Sepsis	Target	Positive: if a patient in ICU will develop sepsis, Negative: otherwise
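
One way to carry a row of this table through code is a small typed record. The sketch below is illustrative only: the class name `PatientRecord`, the 0/1 encoding of `Insurance`, the non-negativity check, and the sample values are assumptions, not taken from the project.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class PatientRecord:
    """One patient row, using the column names from the Data Fields table."""
    PRG: float
    PL: float
    PR: float
    SK: float
    TS: float
    M11: float
    BD2: float
    Age: int
    Insurance: int  # assumed encoding: 1 = valid insurance card, 0 = none

    def __post_init__(self):
        # Assumption: every measurement is a non-negative number.
        for f in fields(self):
            if getattr(self, f.name) < 0:
                raise ValueError(f"{f.name} must be non-negative")
        if self.Insurance not in (0, 1):
            raise ValueError("Insurance must be 0 or 1")

    def as_features(self) -> List[float]:
        """Return the values in column order, ready to pass to a model."""
        return [float(getattr(self, f.name)) for f in fields(self)]


# Made-up values, for illustration only.
record = PatientRecord(PRG=6, PL=148, PR=72, SK=35, TS=0,
                       M11=33.6, BD2=0.627, Age=50, Insurance=0)
print(record.as_features())
```

The ID and Sepsis columns are left out because they are an identifier and the target, not model inputs.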

Exploratory Data Analysis

During the exploratory data analysis (EDA) phase, a comprehensive investigation of the sepsis dataset was conducted to gain insights through various types of analyses.

Univariate analysis: A thorough examination of each variable individually was performed. Summary statistics such as mean, median, standard deviation, and quartiles were calculated to understand the central tendency and spread of the data.
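
The summary statistics named above can be computed with Python's standard `statistics` module. This is a minimal sketch, not the project's notebook code: the helper name `summarize` and the sample values are hypothetical.

```python
import statistics


def summarize(values):
    """Return central tendency and spread for one numeric column."""
    # quantiles(n=4) returns the three cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "q1": q1,
        "q3": q3,
    }


# Made-up column values, for illustration only.
summary = summarize([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(summary)
```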

[Figure: Univariate analysis]

Bivariate analysis: Relationships between pairs of variables were explored to identify patterns and potential predictor variables for sepsis classification.

[Figure: Bivariate analysis]

Multivariate analysis: Relationships among multiple variables were examined simultaneously, allowing for a deeper understanding of their interactions and impact on sepsis.

[Figure: Multivariate analysis]

In addition to these exploratory analyses, hypotheses were formulated based on prior knowledge and existing research. Statistical tests such as t-tests, chi-square tests, or ANOVA tests were utilized to test these hypotheses, depending on the nature of the variables. The results of these tests validated or refuted the formulated hypotheses and provided further insights into the relationships between variables.
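
As an illustration of one such test, the sketch below computes Welch's two-sample t statistic and its degrees of freedom using only the standard library. Whether the project used Welch's variant or the pooled-variance t-test is not stated, so treat this as an assumption; the function name and sample values are hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's t-test."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Squared standard error of each sample mean.
    se_a = statistics.variance(sample_a) / len(sample_a)
    se_b = statistics.variance(sample_b) / len(sample_b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(sample_a) - 1) + se_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Made-up values standing in for one feature split by Sepsis outcome.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t_stat, dof)
```

Turning the statistic into a p-value requires the t distribution, which the standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind` with `equal_var=False`) covers that step.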

Hypotheses:

Hypothesis 1: Higher plasma glucose levels (PRG) are associated with an increased risk of developing sepsis.
Hypothesis 2: Abnormal blood work results, such as high values of PL, SK, and BD2, are indicative of a higher likelihood of sepsis.
Hypothesis 3: Older patients are more likely to develop sepsis compared to younger patients.
Hypothesis 4: Patients with higher body mass index (BMI) values (M11) have a lower risk of sepsis.
Hypothesis 5: Patients without valid insurance cards are more likely to develop sepsis.

These hypotheses, along with the results of the EDA, contribute to a deeper understanding of the dataset and provide valuable insights for further analysis and model development.

Modeling

During the modeling phase, the evaluation of models took into consideration the imbalanced nature of the data. The main metrics used to assess model performance were the F1 score and AUC score, which provide a balanced assessment for imbalanced datasets.
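
To see why accuracy alone is misleading on imbalanced data, consider a classifier that always predicts the negative class. The sketch below uses made-up labels and hypothetical helper names; it is not the project's evaluation code.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 2 positive cases out of 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10

print(accuracy(y_true, always_negative))  # high despite missing every positive
print(f1_score(y_true, always_negative))  # exposes the failure
```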

The following models were evaluated:

Decision Tree
Logistic Regression
Naive Bayes
Stochastic Gradient Descent
Random Forest
XGBoost

The models were compared on their F1 and AUC scores to show how each performs on the imbalanced dataset. The results are shown below:

[Figure: Model comparison]

Evaluation

Given the imbalanced nature of the data, the models' performance was assessed using the F1 score, which considers both precision and recall, providing a balanced measure of accuracy. Additionally, the AUC score was considered to evaluate the models' ability to distinguish between positive and negative cases.
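
The AUC has a direct interpretation: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by comparing every positive/negative pair. The function name and the scores are hypothetical, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up predicted probabilities, for illustration only.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)
```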

[Figure: Evaluation results]

Hyperparameter tuning was also implemented to optimize the performance of the models. By fine-tuning the hyperparameters, it was possible to identify the best combination of parameter values that yielded the highest performance for each model.
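
The search strategy is not stated here, so the sketch below assumes an exhaustive grid search as one common option. The function `grid_search`, the parameter names, and the toy scoring function are all hypothetical; in practice the scoring function would train a model with the given parameters and return a cross-validated F1 or AUC score.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Try every parameter combination and return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    # Stand-in for "train and validate a model"; peaks at depth 4, 100 trees.
    return -(params["max_depth"] - 4) ** 2 - (params["n_estimators"] - 100) ** 2 / 1e4


best_params, best_score = grid_search(
    {"max_depth": [2, 4, 8], "n_estimators": [50, 100, 200]},
    toy_score,
)
print(best_params, best_score)
```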

Deployment

FastAPI deployment

Make sure you have FastAPI, the Uvicorn server, and any other necessary dependencies installed. You can install FastAPI and Uvicorn using pip:

pip install fastapi uvicorn

Open a terminal or command prompt and navigate to the directory where your main.py file is located.
Run the FastAPI application using the uvicorn command, specifying the module and application name:

uvicorn main:app --reload

After running the command, you should see output indicating that the FastAPI application is running and listening on a specific address (e.g., http://localhost:8000). This address is the base URL of the API.
Open a web browser or use an API testing tool (e.g., Postman) to interact with your deployed FastAPI application. Use the API endpoint provided in the terminal to make requests and receive responses.
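
Requests can also be sent from Python. The sketch below only builds a JSON POST request with the standard library. The `/predict` path and the payload shape are assumptions (the field names are borrowed from the Data Fields table), so check the application's documentation for the real route and schema before sending anything.

```python
import json
from urllib.request import Request


def build_predict_request(features, base_url="http://localhost:8000", path="/predict"):
    """Build (but do not send) a JSON POST request for a prediction.

    The default path "/predict" is hypothetical.
    """
    body = json.dumps(features).encode("utf-8")
    return Request(
        base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Made-up values, for illustration only.
payload = {"PRG": 6, "PL": 148, "PR": 72, "SK": 35, "TS": 0,
           "M11": 33.6, "BD2": 0.627, "Age": 50, "Insurance": 0}
request = build_predict_request(payload)
print(request.get_method(), request.full_url)
```

With the server running, passing `request` to `urllib.request.urlopen` sends it and returns the response.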

API Documentation

The API documentation provides details about the available endpoints, request and response formats, and example usage. You can access the documentation by visiting the /docs endpoint after starting the server (http://localhost:8000/docs).
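
FastAPI also serves the machine-readable OpenAPI schema behind that page, by default at /openapi.json. The sketch below lists the operations found in such a schema. The helper names are hypothetical, and the sample schema (including the `/predict` route) is made up to keep the example runnable without a server.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_operations(schema):
    """Return sorted (METHOD, path) pairs from an OpenAPI schema dict."""
    operations = []
    for path, item in schema.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-method keys such as "parameters".
            if key in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations)


def fetch_schema(base_url="http://localhost:8000"):
    """Download the schema from a running server (not called in this example)."""
    with urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)


# Made-up schema fragment, for illustration only.
sample_schema = {"paths": {"/predict": {"post": {}}, "/": {"get": {}, "parameters": []}}}
operations = list_operations(sample_schema)
print(operations)
```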

Containerized deployment

To run the Docker container based on the provided Dockerfile, follow these steps:

Make sure you have Docker installed on your system.
Confirm that a file named Dockerfile (without any file extension) exists in the root directory of the project. If it is missing, create it there and add the project's Dockerfile content.
Open a terminal or command prompt and navigate to the directory where the Dockerfile is located.
Build the Docker image by running the following command:

docker build -t your-image-name .

Replace your-image-name with the desired name for your Docker image. The . at the end denotes the current directory as the build context.
Once the image is built, you can run a Docker container based on that image using the following command:

docker run -d -p host-port:container-port your-image-name

Replace host-port with the port number on your host machine that you want to map to the container's port, and replace container-port with the port number specified in the Dockerfile's EXPOSE instruction (in this case, it's 8000).

For example, if you want to map the container's port 8000 to port 8080 on your host machine, the command would be:

docker run -d -p 8080:8000 your-image-name

After running the command, the Docker container will start, and your FastAPI application will be running inside the container.

[Figure: Docker Desktop]

You can access your application by visiting http://localhost:host-port in your web browser or using an API testing tool.

For example, if you mapped the container's port 8000 to your host's port 8080, you would access the application at http://localhost:8080.
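
The relationship between the -p value and the address can be written as a small helper. This sketch handles only the simple host-port:container-port form used above, not the other forms Docker accepts, and the function name is hypothetical.

```python
def host_url(port_mapping, host="localhost"):
    """Turn a "host-port:container-port" mapping into the URL to open on the host."""
    host_port, container_port = port_mapping.split(":")
    int(container_port)  # validate only; the container port is not part of the URL
    return f"http://{host}:{int(host_port)}"


url = host_url("8080:8000")
print(url)
```

Only the host port appears in the address; the container port matters inside the container and in the Dockerfile's EXPOSE instruction.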

Streamlit deployment

Navigate to the cloned repository and run the command:

pip install -r requirements.txt

To run the demo app (being at the repository root), use the following command:

streamlit run streamlit_app.py

App Execution on Hugging Face

The screenshots below show how the Streamlit app is used on Hugging Face, from entering patient values to viewing the prediction:

[Figure: Streamlit input]

[Figure: Streamlit results]

Future Work

[Figure: Sepsis solution recommendation]

For future work, incorporating clustering algorithms can be a valuable addition to sepsis identification and classification. Clustering algorithms can help in grouping similar patient data together based on patterns and similarities.

Contact

Alidu Abubakari

Data Analyst, Azubi Africa
