Closed RobinQu closed 4 months ago
https://python.langchain.com/v0.2/docs/how_to/qa_citations/#setup
https://medium.com/@darrenoberst/using-llmware-for-rag-evidence-verification-8611abf2dbeb https://github.com/llmware-ai/llmware
Use a retrieval-based method to provide evidence for a more reliable RAG pipeline.
https://medium.com/@yotamabraham/in-text-citing-with-langchain-question-answering-e19a24d81e39
from IPython.display import Markdown, display

# llm and qdocs come from the linked article and are not shown here.
prompt = (
    f"{qdocs} Question: Please answer the question with citation to the paragraphs. "
    "For every sentence you write, cite the book name and paragraph number as <id_x_x> "
    "At the end of your commentary: "
    "1. Add key words from the book paragraphs. "
    "2. Suggest a further question that can be answered by the paragraphs provided. "
    "3. Create a sources list of book names, paragraph number, author name, and a link for each book you cited."
)
response = llm.call_as_llm(prompt)
display(Markdown(response))
According to the book The International Journalism Handbook by
Rodrigo Zamith, technology has played a significant role in shaping today's
journalistic work (s_225_100). The development of the printing press
allowed for the mass distribution of journalism, although it also imposed
limitations on the formats that journalistic products could take (s_225_100).
The telegraph enabled the development of newswire services and facilitated
quick transmission of reports from remote locations (s_225_100).
On the other hand, the proliferation of the telephone allowed reporters to
conduct more reporting from within the newsroom by directly contacting
their sources (s_225_100).
Technological actants have also influenced the way news audiences and
journalists communicate with each other (s_274_100).
Platforms like Twitter have made it easier for audience members to
provide immediate and public feedback to journalists, leading to more
meaningful and direct audience participation (s_274_100). However, this
can also result in negative forms of participation, such as brigading
and strategic harassment of journalists (s_274_100).
In recent times, journalists are more likely to work in teams,
collaborate across organizations, and involve their audiences in
various aspects of news production (s_914_100).
This shift has moved away from the historical practice of journalists
working in a more solitary fashion (s_914_100).
The accessibility of news content and sources has increased significantly,
allowing news audiences to have access to a wide range of options (s_268_100).
This has made it challenging for a single journalistic outlet to gain a
near-monopoly on audiences (s_268_100). However, a few large organizations
with strong brand recognition can still capture substantial audiences,
while smaller journalistic outlets cater to niche audiences and are often
considered interchangeable by users (s_268_100).
Keywords: technology, printing press, telegraph, telephone,
audience participation, news production, news content accessibility,
journalistic outlets.
Further question: How has the evolution of technology impacted the
credibility and trustworthiness of journalistic outlets?
Sources:
Book: The International Journalism Handbook
Paragraph numbers: s_225_100, s_274_100, s_914_100, s_268_100
Author: Rodrigo Zamith
Link: https://books.rodrigozamith.com/the-international-journalism-handbook/
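The sample answer above tags each sentence with a marker such as (s_225_100). A minimal sketch of how such markers could be collected and how uncited sentences could be flagged is given below. The regular expression is an assumption inferred from this one sample answer, and the helper names extract_citations and uncited_sentences are hypothetical; neither comes from LangChain or the linked article.

```python
import re

# Assumed marker shape, inferred from the sample answer: "(s_<book>_<paragraph>)".
CITATION_RE = re.compile(r"\((s_\d+_\d+)\)")


def extract_citations(answer: str) -> list[str]:
    """Return the unique citation ids in order of first appearance."""
    seen: dict[str, None] = {}
    for citation_id in CITATION_RE.findall(answer):
        seen.setdefault(citation_id, None)
    return list(seen)


def uncited_sentences(answer: str) -> list[str]:
    """Return sentences that carry no citation marker (naive sentence split)."""
    sentences = re.split(r"(?<=[.!?])\s+", answer.strip())
    return [s for s in sentences if s and not CITATION_RE.search(s)]
```

Sentences returned by uncited_sentences are candidates for a follow-up check, since the prompt asked for a citation on every sentence.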
file_citation field in annotations, but the text field, which acts as a special marker in the response, is unclear to reproduce. start_index and end_index are apparently implementation-dependent, and thus impossible to reproduce unless OpenAI discloses more details.
mixtral-8x7b-q6-guff
Cheetahs are capable of running at speeds between 93 to 104 kilometers per hour (58 to 65 miles per hour) (id.1).
Despite their impressive speed, cheetahs only score at 16 body lengths per second, which is lower than Anna's hummingbird's length-specific velocity (id.3).
1. Cheetah speed
2. Running speed of cheetahs
Quotations from context information:
* "The cheetah is capable of running at 93 to 104 km/h (58 to 65 mph)" (id.1)
* "it has evolved specialized adaptations for speed, including a light build, long thin legs and a long tail" (id.1)
* "Anna's hummingbird has the highest known length-specific velocity attained by any vertebrate" (id.3)
* "The cheetah, the fastest land mammal, scores at only 16 body lengths per second" (id.3)
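Because the model returns verbatim quotations with an id, each quote can be checked against the retrieved context it claims to come from. The sketch below assumes the context is available as a mapping from ids such as "id.1" to paragraph text, and that quotes follow the "quote" (id.N) shape seen above. The function names are hypothetical, and substring matching after whitespace and case normalization is a simplification rather than a robust verifier.

```python
import re

# Assumed shape of a quotation line: "quoted text" (id.N)
QUOTE_RE = re.compile(r'"([^"]+)"\s*\((id\.\d+)\)')


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so minor layout differences do not matter."""
    return " ".join(text.lower().split())


def verify_quotes(answer: str, context: dict[str, str]) -> list[tuple[str, str, bool]]:
    """Return (quote, id, found) for every quoted span in the answer."""
    results = []
    for quote, doc_id in QUOTE_RE.findall(answer):
        source = context.get(doc_id, "")
        results.append((quote, doc_id, normalize(quote) in normalize(source)))
    return results
```

A quote whose third field is False either cites the wrong id or does not appear in the context at all, which is a useful signal for a hallucinated citation.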
generation_post_processing.txt
Use LLM to post-process final answer.
input_parser -(prompt)-> agent_executor [file_search and other tool uses] -(answer)-> annotator -> (final answer with annotations)
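The annotator stage could be sketched as follows. This is not OpenAI's implementation, which the notes above describe as undisclosed. It assumes the answer contains (id.N) markers and that a hypothetical mapping from N to a file id is available; the field names text, start_index and end_index mirror those mentioned earlier, and their semantics here (character offsets of the marker within the answer) are an assumption.

```python
import re
from dataclasses import dataclass

MARKER_RE = re.compile(r"\(id\.(\d+)\)")


@dataclass
class Annotation:
    text: str         # the marker exactly as it appears in the answer
    start_index: int  # character offset where the marker starts (assumed semantics)
    end_index: int    # character offset just past the marker (assumed semantics)
    file_id: str      # hypothetical id of the cited file


def annotate(answer: str, files: dict[str, str]) -> list[Annotation]:
    """Build one annotation per marker whose number maps to a known file."""
    annotations = []
    for match in MARKER_RE.finditer(answer):
        file_id = files.get(match.group(1))
        if file_id is None:
            continue  # skip markers that point at no known file
        annotations.append(
            Annotation(match.group(0), match.start(), match.end(), file_id)
        )
    return annotations
```

With these semantics, slicing the answer with start_index and end_index yields the marker text again, so a client can replace each marker with a footnote or link.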
file search
search pipeline
https://platform.openai.com/docs/assistants/tools/file-search/how-it-works
online search sources
https://platform.openai.com/docs/assistants/tools/file-search/vector-stores
vector store source:
tool_resources on assistant object -> vector_store_id
tool_resources on thread object -> vector_store_id
attachments on user message -> file_id -> create a new VS or insert into the VS of this thread?
tool choices
Does it always trigger file search if any vector store is configured? It seems it no longer does.
Read users' complaints after V2 was released.
I guess the internal agent decides whether it is necessary to call file search.
Another discussion about how file search tool works: https://community.openai.com/t/how-knowledge-base-files-are-handled-assistants-api/601721/14
data expiration
https://platform.openai.com/docs/assistants/tools/file-search/managing-costs-with-expiration-policies
data deletion