zhudotexe / kani

kani (カニ) is a highly hackable microframework for chat-based language models with tool use/function calling. (NLP-OSS @ EMNLP 2023)
MIT License



kani (カニ)

kani (カニ) is a lightweight and highly hackable framework for chat-based language models with tool usage/function calling.

Compared to other LM frameworks, kani is less opinionated and offers more fine-grained customizability over the parts of the control flow that matter, making it a good fit for NLP researchers, hobbyists, and developers alike.

kani comes with support for the following models out of the box, with a model-agnostic framework to add support for many more:

Hosted Models

Open Source Models

kani supports every chat model available on Hugging Face through transformers or llama.cpp!

In particular, we have reference implementations for the following base models, and their fine-tunes:

Check out the Model Zoo to see how to use each of these models in your application!

Interested in contributing? Check out our guide.

Read the docs on ReadTheDocs!

Read our paper on arXiv!



Installation

kani requires Python 3.10 or above. Model-specific dependencies are packaged as extras (the bracketed names after the library name in pip install). To determine which extra(s) to install, see the model table, or use the [all] extra to install everything.

# for OpenAI models
$ pip install "kani[openai]"
# for Hugging Face models
$ pip install "kani[huggingface]" torch
# or install everything:
$ pip install "kani[all]"

For the most up-to-date changes and new models, you can also install the development version from the main branch of the Git repository:

$ pip install "kani[all] @ git+https://github.com/zhudotexe/kani.git@main"
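After installing, a quick preflight check can confirm that the interpreter is new enough and show which backends look importable. The sketch below is only an illustration: the mapping from extras to module names (`openai` and `transformers`) is an assumption, not something read from kani's package metadata, and the helper names are hypothetical.

```python
import sys
from importlib.util import find_spec

# ASSUMPTION: the top-level module each extra is expected to pull in.
# These names are illustrative and not taken from kani's package metadata.
EXTRA_TO_MODULE = {
    "openai": "openai",
    "huggingface": "transformers",
}


def python_ok(version_info=sys.version_info) -> bool:
    """Return True if the interpreter is Python 3.10 or above."""
    return tuple(version_info[:2]) >= (3, 10)


def available_extras(mapping=EXTRA_TO_MODULE) -> list[str]:
    """Return the extras whose backing module can be found in this environment."""
    return [extra for extra, module in mapping.items() if find_spec(module) is not None]


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Extras that look installed:", available_extras())
```

If an extra you expect is missing from the output, re-running the matching pip install command above is the likely fix.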


Quickstart

kani requires Python 3.10 or above.

First, install the library. In this quickstart, we'll use the OpenAI engine, though kani is model-agnostic.

$ pip install "kani[openai]"

Then, let's use kani to create a simple chatbot using ChatGPT as a backend.

# import the library
import asyncio
from kani import Kani, chat_in_terminal
from kani.engines.openai import OpenAIEngine

# Replace this with your OpenAI API key: https://platform.openai.com/account/api-keys
api_key = "sk-..."

# kani uses an Engine to interact with the language model. You can specify other model 
# parameters here, like temperature=0.7.
engine = OpenAIEngine(api_key, model="gpt-3.5-turbo")

# The kani manages the chat state, prompting, and function calling. Here, we only give 
# it the engine to call ChatGPT, but you can specify other parameters like 
# system_prompt="You are..." here.
ai = Kani(engine)

# kani comes with a utility to interact with a kani through your terminal...
chat_in_terminal(ai)

# or you can use kani programmatically in an async function!
async def main():
    resp = await ai.chat_round("What is the airspeed velocity of an unladen swallow?")
    print(resp.text)

asyncio.run(main())


kani makes the time to set up a working chat model short, while offering the programmer deep customizability over every prompt, function call, and even the underlying language model.
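The split described in the quickstart comments (an engine talks to the model, while the kani object owns the chat state) can be illustrated with a toy version that runs offline. This is a minimal sketch under stated assumptions, not kani's implementation: `Message`, `Engine`, `EchoEngine`, `ToyChat`, and `predict` are all hypothetical names, and only `chat_round` mirrors a method name used above.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Message:
    role: str  # "user" or "assistant"
    text: str


class Engine(Protocol):
    """Anything that can turn a chat history into the next assistant message."""

    async def predict(self, history: list[Message]) -> Message: ...


class EchoEngine:
    """Stand-in for a real language model so the sketch runs without an API key."""

    async def predict(self, history: list[Message]) -> Message:
        return Message("assistant", f"You said: {history[-1].text}")


@dataclass
class ToyChat:
    """Owns the chat state and delegates generation to whichever engine it was given."""

    engine: Engine
    history: list[Message] = field(default_factory=list)

    async def chat_round(self, query: str) -> Message:
        self.history.append(Message("user", query))
        reply = await self.engine.predict(self.history)
        self.history.append(reply)
        return reply


async def demo() -> list[Message]:
    chat = ToyChat(EchoEngine())
    await chat.chat_round("hello")
    await chat.chat_round("goodbye")
    return chat.history


if __name__ == "__main__":
    for m in asyncio.run(demo()):
        print(m.role, m.text)
```

Because the chat object only depends on the `predict` interface, swapping the model means swapping the engine and nothing else, which is the sense in which such a design is model-agnostic.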

Function Calling

Function calling gives language models the ability to choose when to call a function you provide, based on its documentation.

With kani, you can write functions in Python and expose them to the model with just one line of code: the @ai_function decorator.

# import the library
import asyncio
from typing import Annotated
from kani import AIParam, Kani, ai_function, chat_in_terminal, ChatRole
from kani.engines.openai import OpenAIEngine

# set up the engine as above
api_key = "sk-..."
engine = OpenAIEngine(api_key, model="gpt-3.5-turbo")

# subclass Kani to add AI functions
class MyKani(Kani):
    # Adding the annotation to a method exposes it to the AI
    @ai_function()
    def get_weather(
        self,
        # and you can provide extra documentation about specific parameters
        location: Annotated[str, AIParam(desc="The city and state, e.g. San Francisco, CA")],
    ):
        """Get the current weather in a given location."""
        # In this example, we mock the return, but you could call a real weather API
        return f"Weather in {location}: Sunny, 72 degrees fahrenheit."

ai = MyKani(engine)

# the terminal utility allows you to test function calls...
chat_in_terminal(ai)

# and you can track multiple rounds programmatically.
async def main():
    async for msg in ai.full_round("What's the weather in Tokyo?"):
        print(msg.role, msg.text)

asyncio.run(main())


kani guarantees that function calls are valid by the time they reach your methods while allowing you to focus on writing code. For more information, check out the function calling docs.
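To make the idea concrete, here is a standard-library sketch of how a decorator plus `Annotated` hints could be turned into a function description, and how model-supplied arguments could be checked before the method runs. It is an illustration only and does not show how kani does this: `exposed`, `describe`, `collect`, and `validated_call` are hypothetical helpers, and a plain string stands in for `AIParam` as the `Annotated` metadata.

```python
import inspect
from typing import Annotated, get_args, get_origin, get_type_hints


def exposed(func):
    """Hypothetical marker decorator: flags a method as callable by the model."""
    func._exposed = True
    return func


def describe(func) -> dict:
    """Build a plain-dict description from the docstring and Annotated hints."""
    hints = get_type_hints(func, include_extras=True)
    params = {}
    for name in inspect.signature(func).parameters:
        if name == "self":
            continue
        hint = hints.get(name, object)  # unannotated parameters accept anything
        if get_origin(hint) is Annotated:
            base, *extras = get_args(hint)
            desc = next((e for e in extras if isinstance(e, str)), "")
        else:
            base, desc = hint, ""
        params[name] = {"type": base, "desc": desc}
    return {"name": func.__name__, "doc": inspect.getdoc(func), "params": params}


def collect(obj) -> dict:
    """Describe every method on obj's class that was flagged with @exposed."""
    return {
        name: describe(member)
        for name, member in inspect.getmembers(type(obj), inspect.isfunction)
        if getattr(member, "_exposed", False)
    }


def validated_call(obj, spec: dict, arguments: dict):
    """Reject missing, unknown, or wrongly typed arguments, then call the method.

    This toy check only handles plain classes such as str or int as types.
    """
    params = spec["params"]
    unknown = set(arguments) - set(params)
    missing = set(params) - set(arguments)
    if unknown or missing:
        raise ValueError(f"bad arguments: unknown={sorted(unknown)} missing={sorted(missing)}")
    for name, value in arguments.items():
        if not isinstance(value, params[name]["type"]):
            raise TypeError(f"{name} must be {params[name]['type'].__name__}")
    return getattr(obj, spec["name"])(**arguments)


class Weather:
    @exposed
    def get_weather(self, location: Annotated[str, "The city and state, e.g. San Francisco, CA"]):
        """Get the current weather in a given location."""
        return f"Weather in {location}: Sunny, 72 degrees fahrenheit."


if __name__ == "__main__":
    w = Weather()
    spec = collect(w)["get_weather"]
    print(spec["doc"])
    print(validated_call(w, spec, {"location": "Tokyo"}))
```

In this sketch a call with a misspelled or missing argument raises before `get_weather` ever runs, so the method body can assume well-formed input.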


Streaming

kani supports streaming responses from the underlying language model token-by-token, even in the presence of function calls. Streaming is designed to be a drop-in superset of the chat_round and full_round methods, allowing you to gradually refactor your code without ever leaving it in a broken state.

async def stream_chat():
    stream = ai.chat_round_stream("What does kani mean?")
    async for token in stream:
        print(token, end="")
    msg = await stream.message()  # or `await stream`

async def stream_with_function_calling():
    async for stream in ai.full_round_stream("What's the weather in Tokyo?"):
        async for token in stream:
            print(token, end="")
        msg = await stream.message()
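The pattern used above, where one object can be iterated for tokens and also awaited for the complete message, can be sketched with plain asyncio. This is an assumption-laden toy, not kani's stream class: `ToyStream` and `fake_model` are hypothetical names, and the message here is just a string.

```python
import asyncio
from typing import AsyncIterator


class ToyStream:
    """Async-iterable over tokens, and awaitable for the full message."""

    def __init__(self, tokens: AsyncIterator[str]):
        self._tokens = tokens
        self._parts: list[str] = []
        self._done = False

    def __aiter__(self):
        return self._consume()

    async def _consume(self):
        # Record each token as it is handed to the caller.
        async for token in self._tokens:
            self._parts.append(token)
            yield token
        self._done = True

    async def message(self) -> str:
        # If the caller never iterated (or stopped early), drain the rest first.
        if not self._done:
            async for _ in self._consume():
                pass
        return "".join(self._parts)

    def __await__(self):
        # `await stream` is shorthand for `await stream.message()`.
        return self.message().__await__()


async def fake_model() -> AsyncIterator[str]:
    """Stand-in token source so the sketch runs offline."""
    for token in ["kani ", "means ", "crab."]:
        await asyncio.sleep(0)
        yield token


async def demo():
    stream = ToyStream(fake_model())
    seen = [token async for token in stream]  # token-by-token
    streamed = await stream.message()  # full message after streaming
    whole = await ToyStream(fake_model())  # skip iteration entirely
    return seen, streamed, whole


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because awaiting the object without iterating still yields the full message, code written against a non-streaming call can switch to the streaming one first and adopt token-by-token output later.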

Why kani?

Existing frameworks for language models like LangChain and simpleaichat are opinionated and/or heavyweight - they edit developers' prompts under the hood, are challenging to learn, and are difficult to customize without adding a lot of high-maintenance bloat to your codebase.


We built kani as a more flexible, simple, and robust alternative. A good analogy between frameworks would be to say that kani is to LangChain as Flask (or FastAPI) is to Django.

kani is suitable for academic researchers, industry professionals, and hobbyists alike, with no under-the-hood hacks to worry about.


To learn more about how to customize kani with your own prompt wrappers, function calling, and more, read the docs!

Or take a look at the hands-on examples in this repo.


Want to see kani in action? Using 4-bit quantization to shrink the model, we run LLaMA v2 as part of our test suite right on GitHub Actions:


Simply click on the latest build to see LLaMA's output!
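As a rough sense of why quantization matters on a small CI machine, the weights-only memory footprint scales linearly with bits per parameter. The arithmetic below is a back-of-the-envelope sketch: the 7-billion parameter count is an assumption (this README does not say which LLaMA v2 size the test suite uses), and the estimate ignores activations, caches, and runtime overhead.

```python
def weight_gigabytes(num_params: int, bits_per_param: int) -> float:
    """Approximate size of the weights alone, in GB (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# ASSUMPTION: parameter count chosen for illustration only.
ASSUMED_PARAMS = 7_000_000_000

if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: ~{weight_gigabytes(ASSUMED_PARAMS, bits):.1f} GB")
```

Under that assumption, moving from 16-bit to 4-bit weights cuts the estimate to a quarter of its original size.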

Who we are


The core development team is made of three PhD students in the Department of Computer and Information Science at the University of Pennsylvania. We're all members of Prof. Chris Callison-Burch's lab, working towards advancing the future of NLP.

We use kani actively in our research, and aim to keep it up-to-date with modern NLP practices.


If you use Kani, please cite us as:

@inproceedings{zhu-etal-2023-kani,
    title = "Kani: A Lightweight and Highly Hackable Framework for Building Language Model Applications",
    author = "Zhu, Andrew  and
      Dugan, Liam  and
      Hwang, Alyssa  and
      Callison-Burch, Chris",
    editor = "Tan, Liling  and
      Milajevs, Dmitrijs  and
      Chauhan, Geeticka  and
      Gwinnup, Jeremy  and
      Rippeth, Elijah",
    booktitle = "Proceedings of the 3rd Workshop for Natural Language Processing Open Source Software (NLP-OSS 2023)",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.nlposs-1.8",
    doi = "10.18653/v1/2023.nlposs-1.8",
    pages = "65--77",
}


We would like to thank the members of the lab of Chris Callison-Burch for their testing and detailed feedback on the contents of both our paper and the Kani repository. In addition, we’d like to thank Henry Zhu (no relation to the first author) for his early and enthusiastic support of the project.

This research is based upon work supported in part by the Air Force Research Laboratory (contract FA8750-23-C-0507), the IARPA HIATUS Program (contract 2022-22072200005), and the NSF (Award 1928631). Approved for Public Release, Distribution Unlimited. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of IARPA, NSF, or the U.S. Government.