The Python Redis Vector Library (RedisVL) is a tailor-made client for AI applications leveraging Redis.
It's specifically designed for:
Enhance your applications with Redis' speed, flexibility, and reliability, incorporating capabilities like vector-based semantic search, full-text search, and geo-spatial search.
The modern GenAI stack, including vector databases and LLMs, has become increasingly popular due to accelerated innovation and research in information retrieval, the ubiquity of tools and frameworks (e.g. LangChain, LlamaIndex, EmbedChain), and the never-ending stream of business problems addressable by AI.
However, organizations still struggle to deliver reliable solutions quickly (time to value) and at scale (beyond a demo).
Redis has been a staple of the NoSQL world for over a decade, and boasts a number of flexible data structures and processing engines to handle real-time application workloads like caching, session management, and search. Most notably, Redis has been used as a vector database for RAG, as an LLM cache, and as a chat session memory store for conversational AI applications.
The vector library bridges the gap between the emerging AI-native developer ecosystem and the capabilities of Redis by providing a lightweight, elegant, and intuitive interface. Built on the back of the popular Python client, redis-py, it abstracts the features of Redis into a grammar that is more aligned to the needs of today's AI/ML Engineers or Data Scientists.
Install redisvl into your Python (>=3.8) environment using pip:
pip install redisvl
For more instructions, visit the redisvl installation guide.
Choose from multiple Redis deployment options:
docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest
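Once the container is running, it can be useful to confirm that something is answering on the mapped port before going further. The following is a minimal sketch using only the Python standard library, not part of RedisVL; the helper name `redis_is_reachable` is hypothetical. It sends an inline `PING` and looks for the `+PONG` reply, and it assumes the server does not require authentication.

```python
import socket


def redis_is_reachable(host: str = "localhost", port: int = 6379, timeout: float = 1.0) -> bool:
    """Return True if a server at host:port answers an inline PING with +PONG.

    Hypothetical helper for illustration; a password-protected server
    replies with an error instead, which this function reports as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            conn.sendall(b"PING\r\n")  # Redis accepts simple inline commands
            reply = conn.recv(64)
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures
        return False
    return reply.startswith(b"+PONG")
```

Calling `redis_is_reachable()` with the defaults checks `localhost:6379`, the port published by the `docker run` command above.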
Enhance your experience and observability with the free Redis Insight GUI.
Design an IndexSchema that models your dataset with built-in Redis data structures (Hash or JSON) and indexable fields (e.g. text, tags, numerics, geo, and vectors).
Load a schema from a YAML file:
index:
  name: user-index-v1
  prefix: user
  storage_type: json
fields:
  - name: user
    type: tag
  - name: credit_score
    type: tag
  - name: embedding
    type: vector
    attrs:
      algorithm: flat
      dims: 4
      distance_metric: cosine
      datatype: float32
from redisvl.schema import IndexSchema
schema = IndexSchema.from_yaml("schemas/schema.yaml")
Or load directly from a Python dictionary:
schema = IndexSchema.from_dict({
    "index": {
        "name": "user-index-v1",
        "prefix": "user",
        "storage_type": "json"
    },
    "fields": [
        {"name": "user", "type": "tag"},
        {"name": "credit_score", "type": "tag"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {
                "algorithm": "flat",
                "datatype": "float32",
                "dims": 4,
                "distance_metric": "cosine"
            }
        }
    ]
})
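Because the `dims` attribute fixes the length of every vector in the index, a quick pre-load check can catch records whose embeddings have the wrong length. The sketch below works on plain Python dictionaries shaped like the schema above; it does not use RedisVL itself, and the helper names `vector_dims` and `mismatched_fields` are hypothetical.

```python
from typing import Any, Dict, List


def vector_dims(schema: Dict[str, Any]) -> Dict[str, int]:
    """Map each vector field name in a schema dictionary to its declared dims."""
    return {
        field["name"]: field["attrs"]["dims"]
        for field in schema["fields"]
        if field["type"] == "vector"
    }


def mismatched_fields(schema: Dict[str, Any], record: Dict[str, Any]) -> List[str]:
    """Return the vector fields whose value in `record` has the wrong length."""
    return [
        name
        for name, dims in vector_dims(schema).items()
        if len(record.get(name, [])) != dims
    ]


# Same shape as the dictionary passed to IndexSchema.from_dict above
schema_dict = {
    "index": {"name": "user-index-v1", "prefix": "user", "storage_type": "json"},
    "fields": [
        {"name": "user", "type": "tag"},
        {"name": "credit_score", "type": "tag"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {"algorithm": "flat", "datatype": "float32",
                      "dims": 4, "distance_metric": "cosine"},
        },
    ],
}

good = {"user": "john", "credit_score": "high", "embedding": [0.23, 0.49, -0.18, 0.95]}
bad = {"user": "mary", "credit_score": "low", "embedding": [0.1, 0.2, 0.3]}

print(mismatched_fields(schema_dict, good))  # prints []
print(mismatched_fields(schema_dict, bad))   # prints ['embedding']
```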
Create a SearchIndex instance from a schema and a client connection in order to perform admin and search operations on your index in Redis:
from redis import Redis
from redisvl.index import SearchIndex
# Establish Redis connection and define index
client = Redis.from_url("redis://localhost:6379")
index = SearchIndex(schema, client)
# Create the index in Redis
index.create()
An async-compliant search index class is also available: AsyncSearchIndex
Load and fetch data to/from your Redis instance:
data = {"user": "john", "credit_score": "high", "embedding": [0.23, 0.49, -0.18, 0.95]}
# load list of dictionaries, specify the "id" field
index.load([data], id_field="user")
# fetch by "id"
john = index.fetch("john")
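The record above stores its embedding as a plain list of floats, which suits the JSON storage type used in this schema. With Hash storage, Redis vector fields hold raw bytes instead, so the list has to be packed first. A minimal standard-library sketch of that conversion for `float32` vectors follows; the helper names are hypothetical, and little-endian byte order is an assumption that matches common server platforms.

```python
import struct
from typing import List, Sequence


def to_float32_bytes(vector: Sequence[float]) -> bytes:
    """Pack a sequence of floats into little-endian float32 bytes."""
    return struct.pack(f"<{len(vector)}f", *vector)


def from_float32_bytes(blob: bytes) -> List[float]:
    """Unpack little-endian float32 bytes back into a list of floats."""
    count = len(blob) // 4  # each float32 occupies 4 bytes
    return list(struct.unpack(f"<{count}f", blob))


blob = to_float32_bytes([0.23, 0.49, -0.18, 0.95])
print(len(blob))  # prints 16 (4 dims x 4 bytes)
```

Values recovered with `from_float32_bytes` are close to, but not bit-identical with, the original Python floats, because `float32` has less precision than Python's 64-bit floats.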
Define queries and perform advanced searches over your indices, including the combination of vectors, metadata filters, and more.
VectorQuery - Flexible vector queries with customizable filters enabling semantic search:
from redisvl.query import VectorQuery
query = VectorQuery(
    vector=[0.16, -0.34, 0.98, 0.23],
    vector_field_name="embedding",
    num_results=3
)
# run the vector search query against the embedding field
results = index.query(query)
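To make the query above more concrete, here is a rough pure-Python sketch of what a `flat` (brute-force) index with the `cosine` metric computes conceptually: the distance from the query vector to every stored vector, followed by a sort that keeps the closest `num_results`. This is an illustration only, not how Redis implements it; the function names and sample documents are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance, defined as 1 minus cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn(query: Sequence[float], docs: Dict[str, Sequence[float]], k: int) -> List[Tuple[str, float]]:
    """Score every document against the query and keep the k closest."""
    scored = [(doc_id, cosine_distance(query, vec)) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Hypothetical stored embeddings keyed by document id
docs = {
    "john": [0.23, 0.49, -0.18, 0.95],
    "mary": [0.90, -0.10, 0.30, 0.05],
    "joe": [-0.16, 0.34, -0.98, -0.23],
}

for doc_id, distance in knn([0.16, -0.34, 0.98, 0.23], docs, k=3):
    print(doc_id, round(distance, 3))
```

Smaller distances mean closer matches; a vector pointing in exactly the opposite direction from the query has a cosine distance of 2.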
Incorporate complex metadata filters on your queries:
from redisvl.query.filter import Tag
# define a tag match filter
tag_filter = Tag("user") == "john"
# update query definition
query.set_filter(tag_filter)
# execute query
results = index.query(query)
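A filtered vector query ultimately reaches Redis as a query string that pairs a filter expression with a KNN clause. The sketch below builds such a string by hand to show the general shape of the Redis query syntax. The helper names are hypothetical, the `$vector` parameter name and the `vector_distance` alias are assumptions, and the exact string RedisVL generates may differ. Escaping of special characters in tag values is deliberately left out.

```python
def tag_equals(field: str, value: str) -> str:
    """Build a Redis tag-match expression such as @user:{john}."""
    return f"@{field}:{{{value}}}"


def knn_query_string(
    filter_expression: str,
    vector_field: str,
    k: int,
    param_name: str = "vector",      # assumed query parameter name
    alias: str = "vector_distance",  # assumed name for the returned distance
) -> str:
    """Combine a filter with a KNN clause: (filter)=>[KNN k @field $param AS alias]."""
    return f"({filter_expression})=>[KNN {k} @{vector_field} ${param_name} AS {alias}]"


print(knn_query_string(tag_equals("user", "john"), "embedding", 3))
# prints: (@user:{john})=>[KNN 3 @embedding $vector AS vector_distance]
```

The filter in parentheses narrows the candidate set, and the KNN clause then ranks only those candidates by vector distance.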
RangeQuery - Vector search within a defined range paired with customizable filters
FilterQuery - Standard search using filters and the full-text search
CountQuery - Count the number of indexed records given attributes
Read more about building advanced Redis queries here.
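The difference between these query types can be sketched in plain Python. A range query returns every record within a distance threshold rather than a fixed number of nearest neighbors, and a count query returns only how many records match a filter. The code below is illustrative and does not call RedisVL; the function names and sample data are hypothetical, and it uses Euclidean distance via `math.dist` for brevity even though the schema above uses cosine.

```python
import math
from typing import Dict, List, Sequence, Tuple


def range_search(
    query: Sequence[float],
    docs: Dict[str, Sequence[float]],
    max_distance: float,
) -> List[Tuple[str, float]]:
    """Return all documents within max_distance of the query, closest first."""
    hits = [(doc_id, math.dist(query, vec)) for doc_id, vec in docs.items()]
    return sorted((hit for hit in hits if hit[1] <= max_distance), key=lambda hit: hit[1])


def count_matching(records: List[Dict[str, object]], field: str, value: object) -> int:
    """Count records whose field equals value, without returning the records."""
    return sum(1 for record in records if record.get(field) == value)


# Hypothetical two-dimensional vectors and metadata records
points = {"a": [0.0, 0.0], "b": [3.0, 4.0], "c": [6.0, 8.0]}
records = [
    {"user": "john", "credit_score": "high"},
    {"user": "mary", "credit_score": "low"},
    {"user": "joe", "credit_score": "high"},
]

print(range_search([0.0, 0.0], points, max_distance=5.0))  # prints [('a', 0.0), ('b', 5.0)]
print(count_matching(records, "credit_score", "high"))     # prints 2
```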
Create, destroy, and manage Redis index configurations from a purpose-built CLI: rvl.
$ rvl -h
usage: rvl <command> [<args>]
Commands:
    index       Index manipulation (create, delete, etc.)
    version     Obtain the version of RedisVL
    stats       Obtain statistics about an index
Read more about using the redisvl CLI here.
Integrate with popular embedding models and providers to greatly simplify the process of vectorizing unstructured data for your index and queries:
from redisvl.utils.vectorize import CohereTextVectorizer
# set COHERE_API_KEY in your environment
co = CohereTextVectorizer()
embedding = co.embed(
    text="What is the capital city of France?",
    input_type="search_query"
)
embeddings = co.embed_many(
    texts=["my document chunk content", "my other document chunk content"],
    input_type="search_document"
)
Learn more about using redisvl Vectorizers in your workflows here.
In order to perform well in production, modern GenAI applications require much more than vector search for retrieval. redisvl provides some common extensions that aim to improve applications working with LLMs:
LLM Semantic Caching is designed to increase application throughput and reduce the cost of using LLMs in production by reusing previously generated responses.
from redisvl.extensions.llmcache import SemanticCache
# init cache with TTL (expiration) policy and semantic distance threshold
llmcache = SemanticCache(
    name="llmcache",
    ttl=360,
    redis_url="redis://localhost:6379"
)
llmcache.set_threshold(0.2) # can be changed on-demand
# store user queries and LLM responses in the semantic cache
llmcache.store(
    prompt="What is the capital city of France?",
    response="Paris",
    metadata={}
)
# quickly check the cache with a slightly different prompt (before invoking an LLM)
response = llmcache.check(prompt="What is France's capital city?")
print(response[0]["response"])
# prints: Paris
Learn more about Semantic Caching here.
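The role of the distance threshold can be illustrated with a toy in-memory cache: a prompt counts as a hit only when its embedding lies within the threshold of a stored prompt's embedding. The sketch below is not RedisVL's implementation. `ToySemanticCache` and `toy_embed` are hypothetical names, and the bag-of-words "embedding" is a crude stand-in for a real embedding model, so the distances it produces are not comparable to real ones.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag of lowercase words."""
    cleaned = text.lower().replace("?", " ").replace("'s", " ")
    return Counter(cleaned.split())


def cosine_distance(a: Counter, b: Counter) -> float:
    """Cosine distance between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / norm if norm else 1.0


class ToySemanticCache:
    """In-memory sketch of a semantic cache with a distance threshold."""

    def __init__(self, distance_threshold: float) -> None:
        self.distance_threshold = distance_threshold
        self._entries: List[Tuple[Counter, str, str]] = []

    def store(self, prompt: str, response: str) -> None:
        self._entries.append((toy_embed(prompt), prompt, response))

    def check(self, prompt: str) -> List[Dict[str, str]]:
        """Return cached entries within the threshold, closest first."""
        query = toy_embed(prompt)
        scored = [(cosine_distance(query, emb), p, r) for emb, p, r in self._entries]
        scored.sort(key=lambda item: item[0])
        return [
            {"prompt": p, "response": r}
            for distance, p, r in scored
            if distance <= self.distance_threshold
        ]


cache = ToySemanticCache(distance_threshold=0.2)
cache.store("What is the capital city of France?", "Paris")

hit = cache.check("What is France's capital city?")  # similar wording: a hit
miss = cache.check("How do I bake bread?")           # unrelated: a miss
print(hit[0]["response"] if hit else None, miss)
```

A looser threshold returns more cache hits at the risk of serving a response to a question that only looks similar; a tighter one is safer but sends more prompts to the LLM.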
LLM Session Management (COMING SOON) aims to improve personalization and accuracy of the LLM application by providing user chat session information and conversational memory.
LLM Contextual Access Control (COMING SOON) aims to address security concerns by preventing malicious, irrelevant, or problematic user input from reaching LLMs and infrastructure.
To get started, check out the following guides:
Please help us by contributing PRs, opening GitHub issues for bugs or new feature ideas, improving documentation, or increasing test coverage. Read more about how to contribute!
This project is supported by Redis, Inc. on a good-faith effort basis. To report bugs, request features, or receive assistance, please file an issue.