Open sjdthree opened 12 months ago
The huggingface_hub integration via LangChain might provide an alternate route: https://python.langchain.com/docs/modules/model_io/models/llms/integrations/huggingface_hub
Here are some TypeScript / Next.js code snippets:
Create a new file, huggingface.ts, in the pages/api directory:
import { NextApiRequest, NextApiResponse } from 'next';
import axios from 'axios';

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  if (req.method === 'POST') {
    try {
      const response = await axios.post(
        'https://api-inference.huggingface.co/models/gpt2',
        req.body,
        {
          // Read the token from the environment rather than hard-coding it.
          headers: { Authorization: `Bearer ${process.env.HUGGINGFACE_API_TOKEN}` },
        }
      );
      res.status(200).json(response.data);
    } catch (error) {
      res.status(500).json({ error: 'Error calling Hugging Face API' });
    }
  } else {
    res.status(405).json({ error: 'Only POST requests are accepted' });
  }
}
Call this API route from your Next.js pages or components like this:
import axios from 'axios';

async function query(data: string) {
  // The Inference API expects a JSON body of the form { inputs: "..." }.
  const response = await axios.post('/api/huggingface', { inputs: data });
  return response.data;
}

query('Can you please let us know more details about your ').then((response) => {
  console.log(JSON.stringify(response));
});
@sjdthree I researched Hugging Face and was able to get it up and running easily. However, to run Llama 2 as an API on Hugging Face, we need to host the model on our own account. It seems feasible for developers to implement, but there may be a cost involved to use it on the site. With Replicate (for a fixed period) or the Llama API, it can be provided for free. What do you think?
https://huggingface.co/spaces/ysharma/Explore_llamav2_with_TGI
Yes, I like it!
I see Replicate.com as similar to Hugging Face: a limited free tier, then pay for speed/performance, etc. Is there a tier difference I'm missing?
I would strongly support options! So all three: HF, Replicate, and the Llama API.
How best to architect to handle these?
> I see Replicate.com as similar to Hugging Face with a limited free tier, then pay for speed/performance, etc. Is there a tier difference I'm missing?
No, my understanding is the same as yours.
> I would strongly support options! So all three: HF, Replicate, and the Llama API.
Let's support multiple options. On our demo page, we should enable trying Llama 2 with Replicate.
> How best to architect to handle these?
The LLM calls are made using LangChain, and both HF and Replicate are supported. The selected model from the UI is passed as modelName in all calls. It seems like creating a function that returns an LLM instance based on the modelName would be a good idea. To start, replacing just BabyElfAGI should be sufficient.
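A hedged sketch of such a factory: the "hf/" and "replicate/" modelName prefixes below are illustrative assumptions, not identifiers from the babyagi-ui codebase, and the plain config objects stand in for what would really be LangChain LLM instances.

```typescript
// Sketch of a provider-selection helper keyed off modelName.
// In the real code, each branch would construct the matching LangChain
// LLM class instead of returning a plain config object.
type Provider = 'openai' | 'huggingface' | 'replicate';

interface LLMChoice {
  provider: Provider;
  modelName: string;
}

function resolveLLM(modelName: string): LLMChoice {
  // "hf/" and "replicate/" prefixes are hypothetical conventions.
  if (modelName.startsWith('hf/')) {
    return { provider: 'huggingface', modelName: modelName.slice('hf/'.length) };
  }
  if (modelName.startsWith('replicate/')) {
    return { provider: 'replicate', modelName: modelName.slice('replicate/'.length) };
  }
  // Default to OpenAI for existing ids like "gpt-3.5-turbo".
  return { provider: 'openai', modelName };
}
```

With something like this, call sites only ever deal with one interface, regardless of which backend the user picked in the drop-down.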
This sounds good.
Where would we change the drop-down on the front page?
Thank you for checking it right away. It's defined in /src/utils/constants.ts.
https://github.com/miurla/babyagi-ui/blob/main/src/utils/constants.ts#L8-L20
The name here is used as the display name. In the LLM invocation, the id is what gets passed as modelName.
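As a hedged illustration of that id/name split (the entries below are made up; the real list is at the link above):

```typescript
// Illustrative shape of the constants entries: `name` is the drop-down
// label, `id` is what gets passed to the LLM call as modelName.
interface ModelEntry {
  id: string;
  name: string;
}

const MODELS: ModelEntry[] = [
  { id: 'gpt-3.5-turbo', name: 'GPT-3.5' }, // hypothetical entry
  { id: 'gpt-4', name: 'GPT-4' },           // hypothetical entry
];

// Look up the display name for a given id, falling back to the id itself.
function displayName(id: string): string {
  return MODELS.find((m) => m.id === id)?.name ?? id;
}
```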
Ok sounds good. Did you want me to make the changes and post a PR for your review?
I'm glad to hear that! It would be extremely helpful! Can you please try it once? I’ll support you anytime.
In addition to OpenAI, I would like to add the ability to call a model via the Hugging Face Inference API.
This would allow the deployer to select from all the models on HF, including Llama 2, the new, well-performing open-source version of Llama.
It needs a Hugging Face API key, similar to the OpenAI one.
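The key could sit next to the OpenAI one in the app's environment config; a minimal sketch, where both variable names are assumptions rather than names taken from the repo:

```shell
# Hypothetical .env.local entries; the variable names are assumptions,
# mirroring how the OpenAI key is commonly configured in Next.js apps.
OPENAI_API_KEY=sk-...
HUGGINGFACE_API_TOKEN=hf_...
```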
Here is sample (untested) code using axios to fetch results from the "gpt2" model via the Hugging Face API: