k8sgpt-ai / k8sgpt

Giving Kubernetes Superpowers to everyone
http://k8sgpt.ai
Apache License 2.0

[Feature]: backend support for Hugging Face #828

Closed JuHyung-Son closed 7 months ago

JuHyung-Son commented 8 months ago

Checklist

Is this feature request related to a problem?

None

Problem Description

No response

Solution Description

Backend support for Hugging Face (HF) models, so that users can use the Inference API of an HF conversational model.

For the HF interface, there are existing Go packages such as https://pkg.go.dev/github.com/hupe1980/go-huggingface. A minimal sketch of what such a backend could look like follows.
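As a rough illustration only (the HuggingFace type, Configure, and GetCompletion names here are hypothetical, not k8sgpt's actual backend interface), a minimal Go client wrapping the HF Inference API could look like this:

package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// HuggingFace is a hypothetical backend wrapping the HF Inference API.
// Names are illustrative, not k8sgpt's real interface.
type HuggingFace struct {
	token string
	model string
}

func (h *HuggingFace) Configure(token, model string) {
	h.token = token
	h.model = model
}

// GetCompletion sends a plain text-generation request and returns the raw JSON body.
func (h *HuggingFace) GetCompletion(ctx context.Context, prompt string) (string, error) {
	url := "https://api-inference.huggingface.co/models/" + h.model
	body, err := json.Marshal(map[string]string{"inputs": prompt})
	if err != nil {
		return "", err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("Authorization", "Bearer "+h.token)
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	out, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	hf := &HuggingFace{}
	hf.Configure("###", "mistralai/Mistral-7B-v0.1")
	answer, err := hf.GetCompletion(context.Background(), "Why is my pod in CrashLoopBackOff?")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(answer)
}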

Benefits

With an HF backend, k8sgpt users can use a free LLM API through HF. Also, small LLMs (sLLMs) are good enough for Kubernetes analysis.

Potential Drawbacks

  1. The Inference API on HF is not meant for production; it is kind of serverless, so the API sometimes responds with errors like huggingfaces error: Model mistralai/Mistral-7B-v0.1 is currently loading. So the HF backend should only be used locally or when testing an LLM (see the retry sketch after this list).
  2. Not all models on HF are available: some models' Inference API is deactivated, and some do not work even on the HF page (screenshot below). [screenshot]
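For drawback 1, any HF backend would need to tolerate the cold-start state. Below is a minimal retry sketch; it assumes the serverless API signals loading with an HTTP 503 and a JSON body carrying an error message and an estimated_time hint, which is an assumption based on the error quoted above, not a documented contract:

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// loadingError mirrors the assumed shape of the "model is currently loading"
// response from the serverless Inference API (an assumption).
type loadingError struct {
	Error         string  `json:"error"`
	EstimatedTime float64 `json:"estimated_time"`
}

// queryWithRetry retries while the model is still being loaded (HTTP 503).
func queryWithRetry(url, token string, payload []byte, maxRetries int) ([]byte, error) {
	for attempt := 0; attempt <= maxRetries; attempt++ {
		req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(payload))
		if err != nil {
			return nil, err
		}
		req.Header.Set("Authorization", "Bearer "+token)
		req.Header.Set("Content-Type", "application/json")

		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			return nil, err
		}
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}

		if resp.StatusCode == http.StatusServiceUnavailable {
			var le loadingError
			if json.Unmarshal(body, &le) == nil && le.EstimatedTime > 0 {
				// Model is cold: wait roughly as long as the API suggests, then retry.
				time.Sleep(time.Duration(le.EstimatedTime) * time.Second)
				continue
			}
		}
		if resp.StatusCode != http.StatusOK {
			return nil, fmt.Errorf("inference API returned %d: %s", resp.StatusCode, body)
		}
		return body, nil
	}
	return nil, fmt.Errorf("model still loading after %d retries", maxRetries)
}

func main() {
	payload := []byte(`{"inputs": "Why is my pod in CrashLoopBackOff?"}`)
	out, err := queryWithRetry(
		"https://api-inference.huggingface.co/models/mistralai/Mistral-7B-v0.1",
		"###", payload, 3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}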

Additional Information

No response

JuHyung-Son commented 8 months ago

Regarding the Go Hugging Face client packages (https://github.com/hupe1980/go-huggingface, https://github.com/Kardbord/hfapigo): both are HTTP request wrappers for the HF Inference API, which means they are essentially Go versions of the Python code below. Since all we need is the conversational API, and their communities are not strong, I'm not sure about using these packages.

import requests

# Serverless Inference API endpoint for the conversational model.
API_URL = "https://api-inference.huggingface.co/models/microsoft/DialoGPT-medium"
headers = {"Authorization": "Bearer ###"}  # HF API token

def query(payload):
    # POST the conversation payload and return the decoded JSON response.
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

# A conversational request: the prior turns plus the new user message.
output = query({
    "inputs": {
        "past_user_inputs": ["Which movie is the best ?"],
        "generated_responses": ["It is Die Hard for sure."],
        "text": "Can you explain why ?"
    },
})
print(output)
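For comparison, here is a rough Go translation of that Python snippet using only the standard library, i.e. a sketch of what those wrapper packages do internally (the conversationalInputs type name is mine; the JSON field names come from the Python payload above):

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// conversationalInputs mirrors the "inputs" object in the Python example.
type conversationalInputs struct {
	PastUserInputs     []string `json:"past_user_inputs"`
	GeneratedResponses []string `json:"generated_responses"`
	Text               string   `json:"text"`
}

func main() {
	apiURL := "https://api-inference.huggingface.co/models/microsoft/DialoGPT-medium"
	payload, err := json.Marshal(map[string]conversationalInputs{
		"inputs": {
			PastUserInputs:     []string{"Which movie is the best ?"},
			GeneratedResponses: []string{"It is Die Hard for sure."},
			Text:               "Can you explain why ?",
		},
	})
	if err != nil {
		panic(err)
	}

	req, err := http.NewRequest(http.MethodPost, apiURL, bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Authorization", "Bearer ###")
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}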