digai-co / DataGPT

AI Agent for Natural Language Queries to Generate Charts from a Database.
MIT License
53 stars 10 forks source link

DataGPT

DataGPT is a compact yet complete AI Agent that allows users to input queries in natural language. It attempts to understand the user's intent, retrieves relevant data from a database, generates corresponding charts, and provides feedback to the user through a web interface. The goal of this project is to make data querying and visualization more straightforward and intuitive.

Architecture

Design Principles

Key Features

TODO

Installation and Configuration

  1. Clone the repository to your local machine:

    git clone https://github.com/digai-co/DataGPT.git
    cd DataGPT
  2. Create and activate a virtual environment (optional but recommended):

    python -m venv venv
    source venv/bin/activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Using docker to quick create db and import sample data (optional):

    # postgresql
    sh examples/db/run_pg.sh
    # mysql
    sh examples/db/run_mysql.sh
  5. Edit config file based on your openai and database information:

    llm:
     openai:
       api_base: # https://api.openai.com/v1 or your azure api base
       api_key: # your api key
       api_model: # gpt-3.5-turbo
    
    database:
      type: postgresql # support postgresql and mysql now
      uri: postgresql://postgres:@localhost:5432/datagpt # database uri
    
    memory:
      dir: data  # long-term memory directory
      schema_file: schema # schema cache file
  6. Run the application:

    python server/app.py
  7. Access the web interface: Open a web browser and visit url to use DataGPT.

Usage Example

  1. Open the web interface and input your query request.

  2. DataGPT will parse your request, extract relevant data from the database, and generate corresponding charts.

  3. View the generated charts, and you can download them to share with others.

Troubleshooting

Issue: What should I do if the schema information in memory becomes outdated?

Solution: If you find that the schema information in memory has become outdated, you can take the following steps to update it:

  1. Delete the cache files in the data subdirectory located in the project's root directory. You can remove the cache files using the following command:
    rm -rf data/*
  2. Restart the application to ensure it uses the latest schema information:
    python server/app.py

Contributing

If you wish to contribute to the project, you can:

License

This project is licensed under the MIT License.

Author

DataGPT is created and maintained by DigAI.co TEAM, including Daoguo Dong, Tao Wang, and more...