progwise / george-ai

Autopilot for the Intranet

Find Falcon 40B Rest API provider #15

Open · moncapitaine opened this issue 1 year ago

moncapitaine commented 1 year ago

and find other REST API LLM providers

splendidcomputer commented 1 year ago

Here you can compare some interesting major Large Language Models (LLMs):

There is also a free hosted web API for the tiiuae/falcon-7b-instruct model.

So far, I have not seen a hosted REST API for Falcon-40B.

Would you please give these models a shot and see which one works better?
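
If you want to try the hosted falcon-7b-instruct endpoint quickly, a minimal sketch along these lines should work (assuming a Hugging Face API token in an HF_API_TOKEN environment variable, which is my naming; the exact response shape can vary by model):

    import os
    import requests

    # Hosted Hugging Face Inference API endpoint for falcon-7b-instruct
    API_URL = "https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct"
    headers = {"Authorization": f"Bearer {os.environ['HF_API_TOKEN']}"}

    # Send a prompt and print the generated text
    payload = {"inputs": "Summarize what an intranet autopilot could do."}
    response = requests.post(API_URL, headers=headers, json=payload)
    print(response.json())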

splendidcomputer commented 1 year ago

Creating a REST API for the Python project

Creating a REST API for the Python project will allow us to interact with our LLM through HTTP requests, making it accessible from various platforms and applications. To achieve this, we can use a web framework like Flask. Here's a step-by-step guide to creating a simple REST API for the project (a complete app.py combining these steps is sketched after the list):

Note: The commands below assume a Debian system.

  1. Install Flask: First, we need to install Flask. So, we open the terminal and run the following command:

    pip install Flask
  2. Create the API Server: Create a new Python file, let's call it app.py, in the same directory as your existing project. This file will be the entry point for your REST API.

  3. Import Dependencies: In app.py, import the necessary modules:

    from flask import Flask, request, jsonify
  4. Initialize Flask: Create an instance of the Flask app:

    app = Flask(__name__)
  5. Define API Endpoint: Create an API endpoint that will handle incoming POST requests:

    @app.route('/chat', methods=['POST'])
    def chat():
        data = request.get_json()
        # Process the data with your existing chat interface code here
        response = {'message': 'This is a sample response'}
        return jsonify(response), 200
  6. Run the Server: Add the following lines at the end of app.py to run the server:

    if __name__ == '__main__':
        app.run(host='0.0.0.0', port=5000)
  7. Run the API: In our terminal, we navigate to the directory containing app.py and run the following command to start the API server:

    python app.py

    Our API should now be running and accessible at http://127.0.0.1:5000/chat.

  8. Test the API: We can use tools like curl or Postman to test our API by sending POST requests to http://127.0.0.1:5000/chat. We should make sure to send JSON data in the request body and expect a JSON response (see the Python snippet below).
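
Putting the steps together, a complete app.py could look like the sketch below. The generate_reply function is a hypothetical placeholder for the project's actual chat interface code:

    from flask import Flask, request, jsonify

    app = Flask(__name__)

    def generate_reply(prompt):
        # Hypothetical placeholder: call into the existing chat interface / LLM here
        return f"You said: {prompt}"

    @app.route('/chat', methods=['POST'])
    def chat():
        data = request.get_json()
        # Reject requests without a JSON body or a "message" field
        if not data or 'message' not in data:
            return jsonify({'error': 'JSON body with a "message" field required'}), 400
        return jsonify({'message': generate_reply(data['message'])}), 200

    if __name__ == '__main__':
        app.run(host='0.0.0.0', port=5000)

With the server running, a quick test from Python (instead of curl or Postman) might look like this:

    import requests

    # Send a chat message to the local API and print the JSON reply
    resp = requests.post(
        'http://127.0.0.1:5000/chat',
        json={'message': 'Hello, Falcon!'},
    )
    print(resp.status_code, resp.json())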

splendidcomputer commented 1 year ago

Running the Falcon-40B-Instruct model on Azure Kubernetes Service

ishaan-jaff commented 1 year ago

Hi @moncapitaine @splendidcomputer, I believe we can help with this issue. I’m the maintainer of LiteLLM: https://github.com/BerriAI/litellm

TLDR: We allow you to use any LLM as a drop-in replacement for gpt-3.5-turbo. You can use our proxy server or spin up your own proxy server using LiteLLM.

Usage

This calls the provider API directly

from litellm import completion
import os
## set ENV variables
os.environ["OPENAI_API_KEY"] = "your-key"
messages = [{ "content": "Hello, how are you?","role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# falcon call
response = completion(model="falcon-40b", messages=messages)