RobinQu / instinct.cpp

instinct.cpp provides ready-to-use alternatives to the OpenAI Assistants API and built-in utilities for developing AI agent applications (RAG, chatbot, code interpreter) powered by language models. Call it langchain.cpp if you like.
Apache License 2.0

First version of file-search tool for assistant-api #20

Closed: RobinQu closed this 4 months ago

RobinQu commented 5 months ago

file search

search pipeline

https://platform.openai.com/docs/assistants/tools/file-search/how-it-works

The file_search tool implements several retrieval best practices out of the box to help you extract the right data from your files and augment the model’s responses. The file_search tool:

  • Rewrites user queries to optimize them for search.
  • Breaks down complex user queries into multiple searches it can run in parallel.
  • Runs both keyword and semantic searches across both assistant and thread vector stores.
  • Reranks search results to pick the most relevant ones before generating the final response.
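The steps listed above can be pictured as a small pipeline. What follows is a minimal sketch, not OpenAI's implementation (which is not public). It assumes query rewriting and decomposition have already produced `sub_queries` upstream. The term-overlap keyword score, the bag-of-words cosine standing in for an embedding search, and reciprocal rank fusion as the merge/rerank step are illustrative assumptions, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a stand-in for a real analyzer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_rank(query, docs):
    """Rank doc ids by how many distinct query terms they contain."""
    terms = set(tokenize(query))
    scores = {d: len(terms & set(tokenize(t))) for d, t in docs.items()}
    return sorted(docs, key=lambda d: (-scores[d], d))


def semantic_rank(query, docs):
    """Rank doc ids by bag-of-words cosine (assumed stand-in for embeddings)."""
    def cosine(a, b):
        dot = sum(a[t] * b[t] for t in a)
        norm = math.sqrt(sum(v * v for v in a.values())) * \
            math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    q = Counter(tokenize(query))
    scores = {d: cosine(q, Counter(tokenize(t))) for d, t in docs.items()}
    return sorted(docs, key=lambda d: (-scores[d], d))


def fuse(rankings, k=60):
    """Reciprocal rank fusion: merge several rankings into one."""
    fused = Counter()
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + position)
    return sorted(fused, key=lambda d: (-fused[d], d))


def file_search(sub_queries, docs, top_k=2):
    """Run keyword and semantic search per sub-query, then fuse and cut to top_k."""
    rankings = []
    for q in sub_queries:
        rankings.append(keyword_rank(q, docs))
        rankings.append(semantic_rank(q, docs))
    return fuse(rankings)[:top_k]


docs = {
    "a": "The cheetah is the fastest land mammal.",
    "b": "Vector stores hold files for search.",
    "c": "Anna's hummingbird is a small bird.",
}
results = file_search(["how fast is a cheetah", "fastest land mammal"], docs)
```

In a real system the per-query searches could run concurrently, and the fusion step would typically be replaced or followed by a dedicated reranking model.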

online search sources

https://platform.openai.com/docs/assistants/tools/file-search/vector-stores

Each vector_store can hold up to 10,000 files. Today, you can attach at most one vector store to an assistant and at most one vector store to a thread.
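To make these limits concrete, here is a minimal sketch of how a run might resolve its search sources. The `VectorStore` class, its fields, and `resolve_search_sources` are hypothetical names for illustration; only the 10,000-file cap and the one-store-per-assistant, one-store-per-thread rule come from the quoted documentation.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_FILES_PER_VECTOR_STORE = 10_000  # limit quoted from the OpenAI docs


@dataclass
class VectorStore:
    id: str
    file_ids: List[str] = field(default_factory=list)

    def add_file(self, file_id: str) -> None:
        """Attach a file, refusing once the store is at capacity."""
        if len(self.file_ids) >= MAX_FILES_PER_VECTOR_STORE:
            raise ValueError(f"vector store {self.id} is full")
        self.file_ids.append(file_id)


def resolve_search_sources(assistant_store: Optional[VectorStore],
                           thread_store: Optional[VectorStore]) -> List[VectorStore]:
    """Return the stores a run should search: at most one from the
    assistant and at most one from the thread."""
    return [s for s in (assistant_store, thread_store) if s is not None]


assistant_vs = VectorStore("vs_assistant")
assistant_vs.add_file("file_1")
thread_vs = VectorStore("vs_thread")
sources = resolve_search_sources(assistant_vs, thread_vs)
```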

vector store source:

tool choices

Does a run always trigger file search when a vector store (vs) is configured? It seems that is no longer the case.

See users' complaints after V2 was released.

My guess is that an internal agent decides whether it is necessary to call file-search.

Another discussion about how file search tool works: https://community.openai.com/t/how-knowledge-base-files-are-handled-assistants-api/601721/14

data expiration

https://platform.openai.com/docs/assistants/tools/file-search/managing-costs-with-expiration-policies

Vector stores created using thread helpers (like tool_resources.file_search.vector_stores in Threads or message.attachments in Messages) have a default expiration policy of 7 days after they were last active (defined as the last time the vector store was part of a run).
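The expiration rule above boils down to a timestamp comparison. A minimal sketch, assuming the store records when it was last part of a run; `is_expired` and its parameters are hypothetical names, and only the 7-day default comes from the quoted documentation.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_EXPIRES_AFTER_DAYS = 7  # default quoted for thread-helper vector stores


def is_expired(last_active_at: datetime, now: datetime,
               days: int = DEFAULT_EXPIRES_AFTER_DAYS) -> bool:
    """True once `days` have passed since the store was last part of a run."""
    return now >= last_active_at + timedelta(days=days)


last_run = datetime(2024, 5, 1, tzinfo=timezone.utc)
still_alive = not is_expired(last_run, datetime(2024, 5, 7, tzinfo=timezone.utc))
gone = is_expired(last_run, datetime(2024, 5, 8, tzinfo=timezone.utc))
```

Each new run that touches the store would reset `last_active_at`, which is what makes the policy "7 days after last active" rather than 7 days after creation.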

data deletion

RobinQu commented 4 months ago

More implementation details

Content annotation in file search

Existing methods

Example 1: langchain

https://python.langchain.com/v0.2/docs/how_to/qa_citations/#setup

Example 2: llmware

https://medium.com/@darrenoberst/using-llmware-for-rag-evidence-verification-8611abf2dbeb https://github.com/llmware-ai/llmware

Uses a retrieval-based method to provide evidence for a more reliable RAG pipeline.

Example 3: A more elaborate prompt-guided citation approach

https://medium.com/@yotamabraham/in-text-citing-with-langchain-question-answering-e19a24d81e39

from IPython.display import Markdown, display

# `llm` (a LangChain model) and `qdocs` (the retrieved paragraphs as one string)
# are assumed to be defined beforehand, as in the linked article.
prompt = f"""{qdocs} Question: Please answer the question with citation to the paragraphs.
For every sentence you write, cite the book name and paragraph number as <id_x_x>

At the end of your commentary:
1. Add key words from the book paragraphs.
2. Suggest a further question that can be answered by the paragraphs provided.
3. Create a sources list of book names, paragraph number, author name, and a link for each book you cited."""
response = llm.call_as_llm(prompt)

display(Markdown(response))

According to the book The International Journalism Handbook by 
Rodrigo Zamith, technology has played a significant role in shaping todays 
journalistic work (s_225_100). The development of the printing press 
allowed for the mass distribution of journalism, although it also imposed 
limitations on the formats that journalistic products could take (s_225_100). 
The telegraph enabled the development of newswire services and facilitated 
quick transmission of reports from remote locations (s_225_100). 
On the other hand, the proliferation of the telephone allowed reporters to 
conduct more reporting from within the newsroom by directly contacting 
their sources (s_225_100).

Technological actants have also influenced the way news audiences and 
journalists communicate with each other (s_274_100). 
Platforms like Twitter have made it easier for audience members to 
provide immediate and public feedback to journalists, leading to more 
meaningful and direct audience participation (s_274_100). However, this 
can also result in negative forms of participation, such as brigading
and strategic harassment of journalists (s_274_100).

In recent times, journalists are more likely to work in teams, 
collaborate across organizations, and involve their audiences in 
various aspects of news production (s_914_100). 
This shift has moved away from the historical practice of journalists 
working in a more solitary fashion (s_914_100).

The accessibility of news content and sources has increased significantly, 
allowing news audiences to have access to a wide range of options (s_268_100). 
This has made it challenging for a single journalistic outlet to gain a 
near-monopoly on audiences (s_268_100). However, a few large organizations 
with strong brand recognition can still capture substantial audiences, 
while smaller journalistic outlets cater to niche audiences and are often 
considered interchangeable by users (s_268_100).

Keywords: technology, printing press, telegraph, telephone, 
audience participation, news production, news content accessibility, 
journalistic outlets.

Further question: How has the evolution of technology impacted the 
credibility and trustworthiness of journalistic outlets?

Sources:

Book: The International Journalism Handbook
Paragraph numbers: s_225_100, s_274_100, s_914_100, s_268_100
Author: Rodrigo Zamith
Link: https://books.rodrigozamith.com/the-international-journalism-handbook/
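Citation markers in a response like this one can be turned into structured annotations with a regular expression. A minimal sketch, assuming markers always look like `(s_<number>_<number>)` as in the sample output; the `Annotation` type and `extract_citations` helper are hypothetical names, and the `sample` string is shortened from the response shown here.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches markers such as "(s_225_100)"; the format is assumed from the sample.
CITATION = re.compile(r"\((s_\d+_\d+)\)")


@dataclass
class Annotation:
    source_id: str    # paragraph id, e.g. "s_225_100"
    start_index: int  # offset of the opening parenthesis
    end_index: int    # offset just past the closing parenthesis


def extract_citations(text: str) -> List[Annotation]:
    """Collect every citation marker together with its character span."""
    return [Annotation(m.group(1), m.start(), m.end())
            for m in CITATION.finditer(text)]


sample = ("The telegraph enabled the development of newswire services (s_225_100). "
          "Journalists are more likely to work in teams (s_914_100).")
annotations = extract_citations(sample)
cited = sorted({a.source_id for a in annotations})
```

Keeping the character spans makes it possible to replace each marker later with a footnote or a link without re-parsing the text.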

Problems

Some tests

Direct prompting with mixtral-8x7b-q6-gguf

rag_with_citation.txt

 Cheetahs are capable of running at speeds between 93 to 104 kilometers per hour (58 to 65 miles per hour) (id.1).

Despite their impressive speed, cheetahs only score at 16 body lengths per second, which is lower than Anna's hummingbird's length-specific velocity (id.3).

1. Cheetah speed
2. Running speed of cheetahs

Quotations from context information:

* "The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph)" (id.1)
* "it has evolved specialized adaptations for speed, including a light build, long thin legs and a long tail" (id.1)
* "Anna's hummingbird has the highest known length-specific velocity attained by any vertebrate" (id.3)
* "The cheetah, the fastest land mammal, scores at only 16 body lengths per second" (id.3)
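One way to check output like this is to verify that every quoted passage really occurs in the chunk it cites. A minimal sketch, assuming bullets of the form `* "quote" (id.N)`. The chunk texts are hypothetical stand-ins built from the quotations themselves, since the actual context file is not reproduced here, and the second bullet in `answer` is a deliberately made-up counter-example.

```python
import re
from typing import Dict, List, Tuple

# One bullet per quotation: * "quoted text" (id.N)
QUOTE_LINE = re.compile(r'^\*\s*"(?P<quote>.+)"\s*\(id\.(?P<id>\d+)\)\s*$')


def parse_quotations(answer: str) -> List[Tuple[str, str]]:
    """Return (quote, chunk_id) pairs found in the answer."""
    pairs = []
    for line in answer.splitlines():
        m = QUOTE_LINE.match(line.strip())
        if m:
            pairs.append((m.group("quote"), m.group("id")))
    return pairs


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def unsupported_quotes(answer: str, chunks: Dict[str, str]) -> List[Tuple[str, str]]:
    """Quotes that do not appear verbatim in the chunk they cite."""
    return [(q, cid) for q, cid in parse_quotations(answer)
            if normalize(q) not in normalize(chunks.get(cid, ""))]


chunks = {
    "1": "The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph).",
    "3": "The cheetah, the fastest land mammal, scores at only 16 body lengths per second.",
}
answer = '''* "The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph)" (id.1)
* "Cheetahs can fly" (id.3)'''
bad = unsupported_quotes(answer, chunks)
```

Exact substring matching is strict; a fuzzy match would tolerate small paraphrases at the cost of letting some fabricated quotes through.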

Generation post-processing

generation_post_processing.txt

Proposed solution

Use an LLM to post-process the final answer.

input_parser -(prompt)-> agent_executor [file_search and other tool uses] -(answer)-> annotator -> (final answer with annotations)
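The flow above can be sketched as three composed stages. This is a minimal illustration under stated assumptions, not the instinct.cpp implementation: the stub agent always calls `file_search`, whereas a real agent would decide which tools to call, and `post_process` stands in for the LLM call that adds annotations. All function names besides the stage names from the diagram are hypothetical.

```python
from typing import Callable, Dict

Tool = Callable[[str], str]


def input_parser(user_message: str) -> str:
    """Turn the raw user message into a prompt for the agent."""
    return user_message.strip()


def agent_executor(prompt: str, tools: Dict[str, Tool]) -> str:
    """Stub agent: always calls file_search and returns its result as the answer."""
    return tools["file_search"](prompt)


def annotator(answer: str, post_process: Callable[[str], str]) -> str:
    """Post-process the answer; `post_process` stands in for an LLM call."""
    return post_process(answer)


def run_pipeline(user_message: str, tools: Dict[str, Tool],
                 post_process: Callable[[str], str]) -> str:
    """input_parser -> agent_executor -> annotator."""
    prompt = input_parser(user_message)
    answer = agent_executor(prompt, tools)
    return annotator(answer, post_process)


tools = {"file_search": lambda q: f"Answer to '{q}' based on chunk 1"}
final = run_pipeline(
    "  How fast is a cheetah?  ",
    tools,
    post_process=lambda a: a.replace("chunk 1", "chunk 1 [1]"),
)
```

Keeping the annotator as a separate stage means the agent prompt stays free of citation instructions, which is the point of post-processing rather than direct prompting.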