lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0

Invoke Google PaLM2 models #3087

Closed sekhar-hari closed 4 months ago

sekhar-hari commented 4 months ago

Hi -

I have a GCP account and access to Google's PaLM2 models (text-bison and chat-bison). Is it possible to call these models through FastChat? Is there an example that someone can share?

Many thanks, Sekhar H.

infwinston commented 4 months ago

We've deprecated PaLM2; you may use Gemini instead. https://github.com/lm-sys/FastChat/blob/main/docs/model_support.md#api-based-models
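For reference, here's a minimal sketch of what a Gemini call could look like, assuming the `google-generativeai` package. The helper names are hypothetical, not FastChat's actual `api_provider.py` code; the main point is the message-format conversion, since Gemini uses `user`/`model` roles and a `parts` list rather than OpenAI-style messages.

```python
# Sketch only: helper names are hypothetical, not FastChat's actual code.


def openai_to_gemini_history(messages):
    """Convert OpenAI-style chat messages to Gemini's history format.

    Gemini uses roles "user" and "model" (not "assistant") and wraps
    text in a "parts" list. System prompts are folded into the next
    user turn, since the early Gemini API had no system role.
    """
    history = []
    system_text = ""
    for msg in messages:
        role, content = msg["role"], msg["content"]
        if role == "system":
            system_text += content + "\n"
            continue
        gemini_role = "model" if role == "assistant" else "user"
        if gemini_role == "user" and system_text:
            content = system_text + content
            system_text = ""
        history.append({"role": gemini_role, "parts": [content]})
    return history


def generate_with_gemini(messages, api_key):
    """Illustrative only: needs network access and a valid API key."""
    import google.generativeai as genai  # pip install google-generativeai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel("gemini-pro")
    return model.generate_content(openai_to_gemini_history(messages)).text
```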

sekhar-hari commented 4 months ago

Ok, thank you for that information. I'm still on an older FastChat version that has PaLM2, and PaLM2 is still popular. My requirement is to use a PaLM2-based model for a specific healthcare use case; I'm evaluating one of text-bison, chat-bison, or MedPaLM2. How do I implement these models in api_provider.py? If the template is the same as Gemini's, I can try to replicate it. Let me know.
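As a starting point, a provider for chat-bison might look like the sketch below, assuming the Vertex AI SDK (`pip install google-cloud-aiplatform`). This is not FastChat code and the model ID is an example; Vertex's chat models take a system-style `context` string and the new message separately, so the testable part here is splitting an OpenAI-style conversation accordingly.

```python
# Hedged sketch of an api_provider.py-style PaLM2 provider; names are
# assumptions for illustration, not FastChat's actual implementation.


def split_conversation(messages):
    """Split OpenAI-style messages into (context, history, last_user_message).

    Vertex AI chat models take a `context` string, prior turns, and the
    new user message as separate arguments.
    """
    context_parts = [m["content"] for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    if not turns or turns[-1]["role"] != "user":
        raise ValueError("conversation must end with a user message")
    history = [(m["role"], m["content"]) for m in turns[:-1]]
    return "\n".join(context_parts), history, turns[-1]["content"]


def chat_bison_reply(messages, project, location="us-central1"):
    """Illustrative only: needs GCP credentials and network access."""
    import vertexai
    from vertexai.language_models import ChatModel

    vertexai.init(project=project, location=location)
    model = ChatModel.from_pretrained("chat-bison")  # example model ID
    context, _history, message = split_conversation(messages)
    chat = model.start_chat(context=context)
    return chat.send_message(message).text
```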

infwinston commented 4 months ago

It has a different interface, so it might need some code changes. Here's a doc that might be helpful: https://ai.google.dev/docs/migration_guide

sekhar-hari commented 4 months ago

Ok, understood. Do you know the last FastChat version that supports PaLM2? Also, what will be the API endpoint for non-OpenAI-API-compatible models such as Google / Vertex AI models, Claude, NVIDIA, etc.?

sekhar-hari commented 4 months ago

I mean, for OpenAI-API-compatible models I can just call "http://localhost:8000/v1" after deploying FastChat. Is there a similar endpoint for Google models?
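For the OpenAI-compatible case, a stdlib-only sketch of calling that endpoint is below. The model name is an example, and as I understand it the API-based models (Gemini, Claude, etc.) are wired into the Arena web UI through api_provider.py rather than exposed through this endpoint, so this only covers locally served models.

```python
# Sketch of calling FastChat's OpenAI-compatible endpoint with the
# standard library; model name and base URL are examples.
import json
import urllib.request


def build_chat_request(base_url, model, messages):
    """Build a POST request for the /v1/chat/completions route."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(req):
    """Illustrative only: needs a running FastChat openai_api_server."""
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```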