Wilson-ZheLin / Streamline-Analyst

An AI agent powered by LLMs that streamlines the entire process of data analysis. 🚀
https://streamline.streamlit.app
MIT License

Streamline Analyst: A Data Analysis AI Agent

Languages: English | 中文 (Chinese)

Streamline Analyst 🪄 is a cutting-edge, open-source application powered by Large Language Models (LLMs) and designed to streamline data analysis. This Data Analysis Agent automates tasks such as data cleaning and preprocessing, as well as more complex operations like identifying the target variable, partitioning test sets, and selecting the best-fit models for your data. It also takes care of results visualization and evaluation.

Here's how it simplifies your workflow: just select your data file, pick an analysis mode, and hit start. Streamline Analyst aims to expedite the data analysis process, making it accessible to all, regardless of their expertise in data analysis. It's built to empower users to process data and achieve high-quality visualizations with unparalleled efficiency🚀, and to execute high-performance modeling with the best strategies🔮.

Try our live demo here: Streamline Analyst (https://streamline.streamlit.app)

When using GPT-4 Turbo, each complete end-to-end analysis costs roughly $0.02 in API usage.
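As a rough illustration of how a figure of this size can arise, the sketch below estimates the cost of one request from token counts. The per-token prices are an assumption (the GPT-4 Turbo list prices at launch, $0.01 per 1,000 input tokens and $0.03 per 1,000 output tokens), and the token counts are hypothetical; neither is taken from this project, and actual usage will vary with your dataset.

```python
# Assumed list prices in USD per 1,000 tokens (not stated in this README).
INPUT_PRICE_PER_1K = 0.01
OUTPUT_PRICE_PER_1K = 0.03


def estimate_request_cost(input_tokens, output_tokens):
    """Return the estimated USD cost of one request under the assumed prices."""
    input_cost = (input_tokens / 1000) * INPUT_PRICE_PER_1K
    output_cost = (output_tokens / 1000) * OUTPUT_PRICE_PER_1K
    return input_cost + output_cost


# Hypothetical token counts, chosen only for illustration.
cost = estimate_request_cost(input_tokens=1400, output_tokens=200)
print(f"Estimated cost: ${cost:.3f}")
```

Check the current pricing of your API provider before relying on an estimate like this.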

Your data's privacy and security are paramount; rest assured, uploaded data and API Keys are strictly for one-time use and are neither saved nor shared.


Looking ahead, we plan to enhance Streamline Analyst with advanced features like Natural Language Processing (NLP), neural networks, and object detection (utilizing YOLO), broadening its capabilities to meet more diverse data analysis needs.

Demo

https://github.com/Wilson-ZheLin/Streamline-Analyst/assets/145169519/1d30faca-f474-42fd-b20b-c93ed7cf6d13

Demo link available at: Streamline Analyst (https://streamline.streamlit.app)

Current Version Features

All processed data and models are made available for download, offering a comprehensive, user-friendly data analysis toolkit.

Modeling and Results Visualization:

[Screenshot: modeling and results visualization]

Automated Workflow Interface:

[Screenshot: automated workflow interface]

Supported Modeling tasks:

| Classification Models | Clustering Models | Regression Models |
| --- | --- | --- |
| Logistic regression | K-means clustering | Linear regression |
| Random forest | DBSCAN | Ridge regression |
| Support vector machine | Gaussian mixture model | Lasso regression |
| Gradient boosting machine | Hierarchical clustering | Elastic net regression |
| Gaussian Naive Bayes | Spectral clustering | Random forest regression |
| AdaBoost | etc. | Gradient boosting regression |
| XGBoost | | etc. |
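To make the idea of partitioning a test set and picking a best-fit model concrete, here is a minimal sketch using only the Python standard library. It is not the selection logic Streamline Analyst uses (the app's own strategy is LLM-driven and not described here); every function name below is hypothetical. It compares a mean baseline with a one-feature linear regression and keeps whichever has the lower mean squared error on held-out data.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_mean_baseline(train):
    """Model that always predicts the mean target seen in training."""
    mean_y = statistics.fmean(y for _, y in train)
    return lambda x: mean_y


def fit_linear_regression(train):
    """Ordinary least squares for a single feature."""
    mean_x = statistics.fmean(x for x, _ in train)
    mean_y = statistics.fmean(y for _, y in train)
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_squared_error(model, rows):
    """Average squared prediction error of model over rows."""
    return statistics.fmean((y - model(x)) ** 2 for x, y in rows)


def select_best_model(rows, candidates, test_fraction=0.25, seed=0):
    """Fit every candidate on the training split; return the lowest test MSE."""
    train, test = train_test_split(rows, test_fraction, seed)
    scores = {name: mean_squared_error(fit(train), test)
              for name, fit in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic data on the line y = 2x + 1, for illustration only.
data = [(float(x), 2.0 * x + 1.0) for x in range(20)]
best_name, test_scores = select_best_model(
    data,
    {"mean baseline": fit_mean_baseline,
     "linear regression": fit_linear_regression},
)
print(best_name, test_scores)
```

Scoring on a held-out split rather than on the training data keeps the comparison from favoring a model that merely memorizes its inputs.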

Real-time calculation of model indicators and result visualization:

| Classification Metrics & Plots | Clustering Metrics & Plots | Regression Metrics & Plots |
| --- | --- | --- |
| Model score | Silhouette score | R-squared score |
| Confusion matrix | Calinski-Harabasz score | Mean squared error (MSE) |
| AUC | Davies-Bouldin score | Root mean squared error (RMSE) |
| F1 score | Cluster scatter plot | Mean absolute error (MAE) |
| ROC plot | etc. | Residual plot |
| etc. | | Predicted value vs actual value plot |
| | | Quantile-Quantile plot |
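For readers unfamiliar with the metrics listed above, the following standard-library sketch computes a few of them from their textbook definitions. It is an illustration only, not the app's implementation; the function names are hypothetical, and treating "model score" as plain accuracy is an assumption.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R-squared for two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
        # R-squared is undefined when the targets are constant.
        "r2": 1.0 - ss_res / ss_tot if ss_tot else float("nan"),
    }


def binary_classification_metrics(y_true, y_pred, positive=1):
    """Return the confusion matrix, accuracy and F1 score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return {
        # Rows are actual classes, columns are predicted classes.
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "accuracy": (tp + tn) / len(pairs),
        "f1": 2 * precision * recall / denom if denom else 0.0,
    }


# Small hand-made examples, for illustration only.
reg = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.0, 7.5, 9.0])
clf = binary_classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(reg)
print(clf)
```

In the regression example the two non-zero errors are 0.5 and -0.5, so the MSE is 0.125 and the MAE is 0.25; in the classification example there are two true positives, two true negatives, one false positive, and one false negative.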

Visual Analysis Toolkit:

Streamline Analyst 🪄 offers an array of intuitive visual tools for enhanced data insight, without the need for an API Key:

Local Installation

Prerequisites

To run app.py, you'll need:

Installation

  1. Install the required packages:
     pip install -r requirements.txt
  2. Run app.py on your local machine:
     streamlit run app.py