jeremychone / rust-genai

Rust multi-provider generative AI client (Ollama, OpenAI, Anthropic, Groq, Gemini, Cohere, ...)
Apache License 2.0

genai - Multi-AI Providers Library for Rust.

Currently supports natively: Ollama, OpenAI, Anthropic, Groq, Gemini, Cohere (more to come)

# cargo.toml
genai = "=0.1.10" # Pin the exact version; the API may change between `0.1.x` releases


The goal of this library is to provide a common and ergonomic single API to many generative AI providers, such as OpenAI, Anthropic, Cohere, and Ollama.

Examples | Thanks | Library Focus | Changelog | Provider Mapping: ChatOptions | MetaUsage

Examples

examples/c00-readme.rs

use genai::chat::printer::{print_chat_stream, PrintChatStreamOptions};
use genai::chat::{ChatMessage, ChatRequest};
use genai::Client;

const MODEL_OPENAI: &str = "gpt-4o-mini";
const MODEL_ANTHROPIC: &str = "claude-3-haiku-20240307";
const MODEL_COHERE: &str = "command-light";
const MODEL_GEMINI: &str = "gemini-1.5-flash-latest";
const MODEL_GROQ: &str = "gemma-7b-it";
const MODEL_OLLAMA: &str = "gemma:2b"; // sh: `ollama pull gemma:2b`

// NOTE: These are the default environment variable names for each AI adapter type.
//       They can be customized; see `examples/c02-auth.rs`
const MODEL_AND_KEY_ENV_NAME_LIST: &[(&str, &str)] = &[
    // -- de/activate models/providers
    (MODEL_OPENAI, "OPENAI_API_KEY"),
    (MODEL_ANTHROPIC, "ANTHROPIC_API_KEY"),
    (MODEL_COHERE, "COHERE_API_KEY"),
    (MODEL_GEMINI, "GEMINI_API_KEY"),
    (MODEL_GROQ, "GROQ_API_KEY"),
    (MODEL_OLLAMA, ""), // No API key needed for local Ollama
];

// NOTE: Model to AdapterKind (AI Provider) type mapping rule
//  - starts_with "gpt"      -> OpenAI
//  - starts_with "claude"   -> Anthropic
//  - starts_with "command"  -> Cohere
//  - starts_with "gemini"   -> Gemini
//  - model in Groq models   -> Groq
//  - For anything else      -> Ollama
//
// Can be customized, see `examples/c03-kind.rs`

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let question = "Why is the sky red?";

    let chat_req = ChatRequest::new(vec![
        // -- Messages (de/activate to see the differences)
        ChatMessage::system("Answer in one sentence"),
        ChatMessage::user(question),
    ]);

    let client = Client::default();

    let print_options = PrintChatStreamOptions::from_print_events(false);

    for (model, env_name) in MODEL_AND_KEY_ENV_NAME_LIST {
        // Skip this model if its environment variable is not set
        if !env_name.is_empty() && std::env::var(env_name).is_err() {
            println!("===== Skipping model: {model} (env var not set: {env_name})");
            continue;
        }

        let adapter_kind = client.resolve_model_iden(model)?.adapter_kind;

        println!("\n===== MODEL: {model} ({adapter_kind}) =====");

        println!("\n--- Question:\n{question}");

        println!("\n--- Answer:");
        let chat_res = client.exec_chat(model, chat_req.clone(), None).await?;
        println!("{}", chat_res.content_text_as_str().unwrap_or("NO ANSWER"));

        println!("\n--- Answer: (streaming)");
        let chat_res = client.exec_chat_stream(model, chat_req.clone(), None).await?;
        print_chat_stream(chat_res, Some(&print_options)).await?;

        println!();
    }

    Ok(())
}

More Examples

The examples live under `examples/` and can be run with `cargo run --example <name>` (e.g., `cargo run --example c00-readme`). The ones referenced above:

examples/c00-readme.rs - the quick-start example shown here
examples/c02-auth.rs - customizing the API key environment variables per adapter
examples/c03-kind.rs - customizing the model-to-AdapterKind mapping

Thanks

Library Focus

ChatOptions

Mapping of the common `ChatOptions` properties to each provider's native option name:

| Property | OpenAI | Anthropic | Ollama | Groq | Gemini (`generationConfig.`) | Cohere |
|----------|--------|-----------|--------|------|------------------------------|--------|
| `temperature` | `temperature` | `temperature` | `temperature` | `temperature` | `temperature` | `temperature` |
| `max_tokens` | `max_tokens` | `max_tokens` (default 1024) | `max_tokens` | `max_tokens` | `maxOutputTokens` | `max_tokens` |
| `top_p` | `top_p` | `top_p` | `top_p` | `top_p` | `topP` | `p` |
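
For reference, a minimal sketch of setting these options on a request. The builder-style setters (`with_temperature`, `with_max_tokens`, `with_top_p`) and the `Some(&options)` third argument of `exec_chat` are assumptions based on the `0.1.x` examples; check `ChatOptions` in the docs for the exact signatures.

use genai::chat::{ChatMessage, ChatOptions, ChatRequest};
use genai::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::default();

    let chat_req = ChatRequest::new(vec![
        ChatMessage::system("Answer in one sentence"),
        ChatMessage::user("Why is the sky red?"),
    ]);

    // Each option below is translated to the provider-native name
    // listed in the table above (assumed builder-style setters).
    let options = ChatOptions::default()
        .with_temperature(0.7)
        .with_max_tokens(256)
        .with_top_p(0.95);

    let chat_res = client
        .exec_chat("gpt-4o-mini", chat_req, Some(&options))
        .await?;
    println!("{}", chat_res.content_text_as_str().unwrap_or("NO ANSWER"));

    Ok(())
}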

MetaUsage

Mapping of the normalized `MetaUsage` fields to each provider's native usage fields:

| Property | OpenAI `usage.` | Ollama `usage.` | Groq `x_groq.usage.` | Anthropic `usage.` | Gemini `usageMetadata.` | Cohere `meta.tokens.` |
|----------|-----------------|-----------------|----------------------|--------------------|-------------------------|-----------------------|
| `input_tokens` | `prompt_tokens` | `prompt_tokens` (1) | `prompt_tokens` | `input_tokens` (added) | `promptTokenCount` (2) | `input_tokens` |
| `output_tokens` | `completion_tokens` | `completion_tokens` (1) | `completion_tokens` | `output_tokens` (added) | `candidatesTokenCount` (2) | `output_tokens` |
| `total_tokens` | `total_tokens` | `total_tokens` (1) | `total_tokens` | (computed) | `totalTokenCount` (2) | (computed) |

Note (1): At this point, Ollama does not emit input/output tokens when streaming, due to a limitation of the Ollama OpenAI compatibility layer. (see ollama #4448 - Streaming Chat Completion via OpenAI API should support stream option to include Usage)

Note (2): Right now, with the Gemini streaming API, it is not entirely clear whether the usage reported on each event is cumulative or per-event (and therefore needs to be summed). It currently appears to be cumulative (i.e., the last message carries the total input, output, and total token counts), so that is the working assumption. See possible tweet answer for more info.
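
For reference, a minimal sketch of reading the normalized usage after a non-streaming call. It assumes `ChatResponse` exposes a `usage` field whose three properties above are `Option` values (which is why notes (1) and (2) matter); check the docs for the exact shape.

use genai::chat::{ChatMessage, ChatRequest};
use genai::Client;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::default();
    let chat_req = ChatRequest::new(vec![ChatMessage::user("Why is the sky red?")]);

    let chat_res = client.exec_chat("gpt-4o-mini", chat_req, None).await?;

    // Assumed shape: the fields are Options because some providers/modes
    // do not report them (see notes (1) and (2)), and some values are
    // computed by the library (the "(computed)" cells above).
    let usage = &chat_res.usage;
    println!(
        "input: {:?} / output: {:?} / total: {:?}",
        usage.input_tokens, usage.output_tokens, usage.total_tokens
    );

    Ok(())
}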

Notes on Possible Direction

Links

crates.io: https://crates.io/crates/genai
docs.rs: https://docs.rs/genai
GitHub: https://github.com/jeremychone/rust-genai