sugarforever / chat-ollama

ChatOllama is an open source chatbot based on LLMs. It supports a wide range of language models, and knowledge base management.
MIT License

[Feature Request] Support Voyage Embeddings/Reranker? #216

Open meokey opened 6 months ago

meokey commented 6 months ago

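I suggest adding support for Voyage Embeddings. These are Anthropic's embeddings of choice. Voyage is still somewhat more expensive than OpenAI, but the first 50M tokens are free, which is enough free usage to last a while 🤣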

Embeddings
We charge for requests to the Voyage embedding endpoint based on the number of tokens in the docs/queries.
The first 50 million tokens for voyage-2, voyage-large-2, and voyage-code-2 are free for every account. Subsequent usage is priced on a per-token basis. 
Reranker
Pricing for the Voyage reranker endpoint is based on the total number of processed tokens, calculated as “(the number of query tokens × the number of documents) + sum of the number of tokens in all documents”. The first 50 million tokens are free for each account. Subsequent usage is priced on a per-token basis as in the table below.
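
The reranker billing formula quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic as quoted; the function name, the constant, and the example token counts are hypothetical, and actual billing is determined by Voyage, not by this snippet.

```python
from typing import Sequence

# Free allowance quoted in the pricing text above.
FREE_TIER_TOKENS = 50_000_000


def reranker_billed_tokens(query_tokens: int, doc_token_counts: Sequence[int]) -> int:
    """Tokens counted for one rerank request under the quoted formula:
    (query tokens x number of documents) + sum of tokens in all documents."""
    return query_tokens * len(doc_token_counts) + sum(doc_token_counts)


# Hypothetical request: a 20-token query reranking three chunks.
billed = reranker_billed_tokens(20, [300, 450, 250])
print(billed)  # 20 * 3 + 1000 = 1060

# Rough number of such requests that fit in the free allowance.
print(FREE_TIER_TOKENS // billed)
```

Note that the query is counted once per document, so sending more candidate chunks to the reranker raises the cost even when the chunks themselves are short.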
sugarforever commented 6 months ago

OK, this looks good. I'll add it as soon as I can.

meokey commented 6 months ago

Reference:

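Why two-stage retrieval?
With a large knowledge base, the advantage of two-stage retrieval is very clear. With only first-stage embedding retrieval, retrieval quality degrades as the amount of data grows. Adding a second-stage rerank makes accuracy improve steadily instead: the more data, the better the results.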

Source: QAnything (a knowledge base for multiple text types; secure, reliable, one-click offline deployment). [Image: mmexport1712460692012.png]
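
A minimal Python sketch of the two-stage idea described above: a cheap first stage narrows the whole collection to a few candidates, and a more careful second stage reorders only those candidates. Everything here is a toy assumption for illustration: the bag-of-words cosine stands in for an embedding model, the bigram overlap stands in for a real reranker such as Cohere or Voyage, and the documents and function names are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def bow_vector(text: str) -> Counter:
    """Bag-of-words counts; a toy stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def first_stage(query: str, docs: List[str], k: int) -> List[str]:
    """Stage 1: score every document cheaply and keep the top k."""
    q = bow_vector(query)
    return sorted(docs, key=lambda d: cosine(q, bow_vector(d)), reverse=True)[:k]


def bigram_overlap(query: str, doc: str) -> float:
    """Toy stand-in for a reranker: share of query bigrams found in the doc."""
    q, d = query.lower().split(), doc.lower().split()
    q_bigrams = set(zip(q, q[1:]))
    d_bigrams = set(zip(d, d[1:]))
    return len(q_bigrams & d_bigrams) / len(q_bigrams) if q_bigrams else 0.0


def second_stage(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_n: int,
) -> List[Tuple[float, str]]:
    """Stage 2: rescore only the candidates and return them best first."""
    scored = [(score_fn(query, d), d) for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]


# Hypothetical mini knowledge base.
docs = [
    "management kubernetes secrets",
    "the kubernetes secrets management handbook",
    "cooking pasta at home",
    "managing docker containers",
]
query = "kubernetes secrets management"

candidates = first_stage(query, docs, k=2)
reranked = second_stage(query, candidates, bigram_overlap, top_n=2)
print(candidates)
print(reranked)
```

In this toy data the first stage ranks the scrambled first document highest because word order is ignored, while the second stage promotes the handbook document because it matches the query's word order. The second stage only ever sees what the first stage returned.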

sugarforever commented 6 months ago

I missed this image a couple of days ago. It's quite interesting; I'll take a look.

sugarforever commented 6 months ago

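@meokey https://github.com/sugarforever/chat-ollama/pull/267 This PR introduces Cohere's Rerank model, used via the API.

Are you able to use this API?

As for running rerank with a local model, I need to test what compute resources it requires and how it performs. I'm not planning to add it for now. If people really need this feature, we'll evaluate it again.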

meokey commented 6 months ago

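> @meokey #267 This PR introduces Cohere's Rerank model, used via the API.
>
> Are you able to use this API?

I have Cohere trial keys, but I'm not sure whether they can be used for rerank. Which docker image can I use to try it? The latest image doesn't seem to have any Cohere settings.

> As for running rerank with a local model, I need to test what compute resources it requires and how it performs. I'm not planning to add it for now. If people really need this feature, we'll evaluate it again.

How about trying Voyage? Voyage isn't a local model either; it is also called via an API, it is GA, and anyone can request an API key. The main draw is the 50M free tokens.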

meokey commented 6 months ago

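> @meokey #267 This PR introduces Cohere's Rerank model, used via the API.
>
> Are you able to use this API?

I added the Cohere API key to /app/.env and restarted the container, then created a new KB and chatted with it, but I didn't see any log that clearly showed a call to Cohere. I then added the environment variable in the docker-compose file. This time the end of the log seems to show a rerank message, but it is incomplete, and rerank did not find the correct document (even though there is only one), so the answer in the chat is incorrect.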


Ollama:  { host: 'http://xxx.xxx.xxx.xxx:11434', username: null, password: null }
Chat with knowledge base with id:  14
Knowledge base Kubernetes with embedding "text-embedding-3-small"
Creating embeddings for OpenAI model: text-embedding-3-small, host: keys.x_openai_api_host
Creating Chroma vector store
Initializing ParentDocumentRetriever with RedisDocstore
Redis client options:  {
  host: '1xxx.xxx.xxx.xxx',
Chat with Anthropic, host: 
User query:  Please list the Group Product Manager of the book "Kubernetes Secrets Handbook"
Relevant documents:  [
  Document {
    pageContent: 'Initiative (OCI)-compliant, Kubernetes ready, and has the ability to integrate with systemd \n' +
      'Design, implement, and maintain production-grade \n' +
      'Kubernetes Secrets management solutions\n' +
      'Emmanouil Gkatziouras\n' +
      'Rom Adams\n' +
      '• All the technical requirements mentioned at the beginning of this chapter\n' +
      '• Access to this book’s GitHub repository: https://github.com/PacktPublishing/\n' +
      loc: [Object]
    }
  },
      'The purpose of this section is to help you reflect on reshaping the traditional understanding of doing \n' +
      'security from an IT-centric perspective to practicing security while having a holistic understanding \n' +
      loc: [Object]
    }
      'who wants to understand how to effectively manage Secrets within that environment.\n' +
      'What this book covers\n' +
      'Chapter 1, Understanding Kubernetes Secrets Management, introduces you to Kubernetes and the \n' +
      'importance of Secrets management in applications deployed on Kubernetes. It gives an overview of \n' +
      'the challenges and risks associated with managing Secrets, the objectives, and the scope of the book.\n' +
      'Secrets management, including the different types of Secrets; their usage scenarios; how to create, \n' +
      'modify, and delete Secrets in Kubernetes; and secure storage and access control. It also covers how to \n' +
      'secret usage.',
    metadata: {
      source: 'Kubernetes Secret Handbook.pdf',
      'the instructions from the official documentation (https://minikube.sigs.k8s.io/\n' +
      'docs/start/).\n' +
      '• All of the code examples in the book are available on our dedicated GitHub repository with a \n' +
      'other Kubernetes objects?\n' +
      'One of the fundamental building blocks of Kubernetes is Kubernetes objects. Through Kubernetes \n' +
      loc: [Object]
    }
Cohere reranked documents:  [
  Document {

user
Please list the Group Product Manager of the book "Kubernetes Secrets Handbook"

assistant
Unfortunately, I do not have enough information to list the Group Product Manager of the book "Kubernetes Secrets Handbook". The provided context does not contain any details about the publishing or production aspects of the book, such as the individuals involved in its development. Without any relevant information given, I cannot determine or speculate about the Group Product Manager for this book. I apologize that I cannot provide a more complete answer to your question.


meokey commented 6 months ago

I asked another question, and this time rerank seems to have worked.


Ollama:  { host: 'http://xxx.xxx.xxx.xxx:11434', username: null, password: null }
Chat with knowledge base with id:  14
Knowledge base Kubernetes with embedding "text-embedding-3-small"
Creating embeddings for OpenAI model: text-embedding-3-small, host: keys.x_openai_api_host
Creating Chroma vector store
Initializing ParentDocumentRetriever with RedisDocstore
  host: 'xxx.xxx.xxx.xxx',
  port: 6379,
  username: undefined,
  password: undefined
}
Chat with Anthropic, host: 
User query:  please summarize the book in 10 dots.
Relevant documents:  [
  Document {
Cohere reranked documents:  [
  Document {
      'local Kubernetes instances, easing the migration from containers to Pods, and even connecting \n' +
      'with remote platforms such as Red Hat OpenShift, Azure Kubernetes Engine, and more.\n' +
      '• Golang (https://go.dev) or Go is a programming language that will be used within our \n' +
      'examples. Note that Kubernetes and most of its third-party components are written in Go.\n' +
      '• Git (https://git-scm.com) is a version control system that we will be using to cover \n' +
      'this book’s examples but will also leverage in our discovery of Secrets management solutions.\n' +
      'This book’s GitHub repository contains the digital material linked to this book: https://github.\n' +
      'com/PacktPublishing/Kubernetes-Secrets-Handbook.',
    metadata: {
      source: 'Kubernetes Secret Handbook.pdf',
      blobType: 'application/pdf',
      pdf: [Object],
      loc: [Object],
      relevanceScore: 0.0012207952
    }
  },
  Document {
    pageContent: 'penetration testing, and risk evaluation forms a critical component of maintaining a secure and efficient \n' +
      'Secrets management framework within Kubernetes production environments.\n' +
      'Summary\n' +
      'In this chapter, we delved into the intricacies of managing Kubernetes Secrets within production \n' +
      'clusters. We highlighted the qualities necessary for effective Secrets management and examined \n' +
      'various deployment strategies and their integration with CI/CD processes. Additionally, we explored \n' +
      'a detailed case study on Keywhiz, which provided a thorough understanding of Secrets management \n' +
      'development, emphasizing a holistic approach that covers the entire lifecycle of Secrets management.\n' +
      'The next chapter will offer a synthesis of the insights and knowledge we’ve gained throughout the \n' +
      'book. It will also cast a forward-looking perspective on the evolution and future trends in Kubernetes \n' +
      'Secrets management.',
    metadata: {
      source: 'Kubernetes Secret Handbook.pdf',
      blobType: 'application/pdf',
      pdf: [Object],
      loc: [Object],
      relevanceScore: 0.0007350448
    }
  },
  Document {
    pageContent: 'Table of Contents \n' +
      'xiii\n' +
      '14\n' +
      'Conclusion and the Future of Kubernetes Secrets Management  249\n' +
      'The current state of Kubernetes 249\n' +
      'Native solutions 250\n' +
      'External solutions 252\n' +
      'The future state of Kubernetes 252\n' +
      'Food for thought and enhancements 252\n' +
      'How to share your thoughts 254\n' +
      'Continuous improvement 256\n' +
      'Skill acquisition 256\n' +
      'Automation as a strategy and Everything as \n' +
      'Code (EaC) 257\n' +
      'Summary                                                     258\n' +
      'Index                                                                                                                                259\n' +
      'Other Books You May Enjoy 270',
      'xi\n' +
      'Integration with other Azure components 149\n' +
      'Creating a Key Vault 153\n' +
      'Auditing and logging 157\n' +
      'Integration with other Google Cloud \n' +
      'components                                                                  163\n' +
      'Introduction to Workload Identity 163\n' +
      'Integrating GKE and GCP Secret \n' +
      'Manager                                                       163',
    metadata: {
      source: 'Kubernetes Secret Handbook.pdf',
      blobType: 'application/pdf',
      pdf: [Object],
      loc: [Object],
      relevanceScore: 0.00043393252
    }
  }
]
sugarforever commented 6 months ago

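I'll look for the PDF you used and test it as well. Rerank only handles ordering: given the document chunks that have already been retrieved, it ranks them by semantic similarity to the query and surfaces the closest ones. If the chunking already cut through the meaning when the document was split, the information needed to answer the question may never be retrieved, and rerank can't help in that case.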
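
A small Python sketch of the chunking problem described above. It assumes a simple fixed-size word chunker, which is not necessarily how chat-ollama splits documents; the text, the placeholder name PERSON_A, and the function name are all hypothetical. Chunk overlap is shown only as one common mitigation, not as something proposed in this thread.

```python
from typing import List


def chunk_words(text: str, size: int, overlap: int = 0) -> List[str]:
    """Split text into chunks of `size` words, sharing `overlap` words
    between neighbouring chunks."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    chunks = []
    start = 0
    while True:
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break
        start += step
    return chunks


# Hypothetical document; PERSON_A is a placeholder, not real data.
text = (
    "this toy document says the group product manager "
    "is PERSON_A and nothing else matters here"
)
fact = "group product manager is PERSON_A"

no_overlap = chunk_words(text, size=8)
with_overlap = chunk_words(text, size=8, overlap=4)

# A reranker only reorders chunks, so if no single chunk holds the fact,
# no ordering of those chunks can surface it.
print(any(fact in chunk for chunk in no_overlap))
print(any(fact in chunk for chunk in with_overlap))
```

Without overlap the chunk boundary falls between "manager" and "is", so the fact is split across two chunks and neither one answers the question alone. With a four-word overlap, one chunk contains the whole statement, and only then does a reranker have something useful to rank.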