langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
92.04k stars 14.65k forks source link

Issue: RetrievalQA response incomplete #7177

Closed Lin-jun-xiang closed 1 year ago

Lin-jun-xiang commented 1 year ago

Issue you'd like to raise.

Problem descriptions

I've used retrievalQA.from_chain_type() with refine type to design a chatPDF.

But the response often incomplete, see the following result, the Answer is not complete which will let the json.loads not work.

Futhermore, I've used get_openai_callback to check if the token exceeds the limit.

Howerver, the callback show the total token is 3432 which didn't exceeds the limit.

Questions

  1. Why the response incomplete?
  2. How can I let the response complete then I can do json.loads

Code

from datetime import datetime
from typing import List

import langchain
from langchain.callbacks import get_openai_callback
from langchain.chains import RetrievalQA
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.output_parsers import PydanticOutputParser
from langchain.prompts import PromptTemplate
from langchain.vectorstores import Chroma
from langchain.document_loaders import PyMuPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from pydantic import BaseModel

docs = PyMuPDFLoader('file.pdf').load()
splitter = RecursiveCharacterTextSplitter(docs, chunk_size=1000, chunk_overlap=300)
docs = splitter.split_documents(docs)

class Specification(BaseModel):
    product_name: str
    manufactured_date: datetime
    size_inch: str
    resolution: str
    contrast: str
    operation_temperature: str
    power_supply: str
    sunlight_readable: bool
    antiglare: bool
    low_power_consumption: bool
    high_brightness: bool
    wide_temperature: bool
    fast_response: bool
    screen_features: List[str]

parser = PydanticOutputParser(pydantic_object=Specification)

prompt_template = """
Use the following pieces of context to answer the question, if you don't know the answer, leave it blank('') don't try to make up an answer.

{context_str}

Question: {question}
{format_instructions}
"""

prompt = PromptTemplate(
    template=prompt_template,
    input_variables=['context_str', 'question'],
    partial_variables={'format_instructions': parser.get_format_instructions()}
)

chain_type_kwargs = {
    'question_prompt': prompt,
    'verbose': True
}

llm = OpenAI(temperature=0)

embeddings = OpenAIEmbeddings()

db = Chroma.from_documents(
    documents=docs,
    embedding= embeddings
)
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type = 'refine',
    retriever= db.as_retriever(),
    chain_type_kwargs=chain_type_kwargs
)

query = f"""What's the display specifications?"""

with get_openai_callback() as cb:
    res = qa_chain.run(query)
    print(cb, '\n')
    print(res)

Process of Chain

> Entering new  chain...

> Entering new  chain...
Prompt after formatting:

Use the following pieces of context to answer the question, if you don't know the answer, leave it blank('') don't try to make up an answer.
Dimensions (W × H × D)
Touch screen: 7.0" × 5.1" × 0.75" (178 × 130 × 19 mm)
Dock: 4.2" × 2.3" × 3.1" (106 × 57 × 78 mm)
Weight
Touch screen: 0.7 lbs. (0.32 kg)
Dock: 0.8 lbs (0.36 kg)
Planning the installation
• Make sure the dock can be located near a power outlet and a strong WiFi signal.
• For the most reliable connection, we recommend running Ethernet to the dock.
• Make sure the touch screen's WiFi or Ethernet connection is on the same network as your controller and that the signal is strong.
• Communication agent is required for intercom use.
• Charge the touch screen for at least six hours before use.
For more product information
Visit ctrl4.co/t3series
Control4® T3-7 7" Tabletop Touch Screen
DOC-00148-C 
2015-10-09 MSC
Copyright ©2015, Control4 Corporation. All rights reserved. Control4, the Control4 logo, the 4-ball logo, 4Store, 
4Sight, Control My Home, Everyday Easy, and Mockupancy are registered trademarks or trademarks of Control4
Question: What's the display specifications?
The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.

Here is the output schema:
{"properties": {"product_name": {"title": "Product Name", "type": "string"}, "manufactured_date": {"title": "Manufactured Date", "type": "string", "format": "date-time"}, "size_inch": {"title": "Size Inch", "type": "string"}, "resolution": {"title": "Resolution", "type": "string"}, "contrast": {"title": "Contrast", "type": "string"}, "operation_temperature": {"title": "Operation Temperature", "type": "string"}, "power_supply": {"title": "Power Supply", "type": "string"}, "sunlight_readable": {"title": "Sunlight Readable", "type": "boolean"}, "antiglare": {"title": "Antiglare", "type": "boolean"}, "low_power_consumption": {"title": "Low Power Consumption", "type": "boolean"}, "high_brightness": {"title": "High Brightness", "type": "boolean"}, "wide_temperature": {"title": "Wide Temperature", "type": "boolean"}, "fast_response": {"title": "Fast Response", "type": "boolean"}, "screen_features": {"title": "Screen Features", "type": "array", "items": {"type": "string"}}}, "required": ["product_name", "manufactured_date", "size_inch", "resolution", "contrast", "operation_temperature", "power_supply", "sunlight_readable", "antiglare", "low_power_consumption", "high_brightness", "wide_temperature", "fast_response", "screen_features"]}


> Finished chain.

> Entering new  chain...
Prompt after formatting:
The original question is as follows: What's the display specifications?
We have provided an existing answer: 
Answer: {
  "product_name": "Control4® T3-7 7\" Tabletop Touch Screen",
  "manufactured_date": "2015-10-09",
  "size_inch": "7.0\" × 5.1\" × 0.75\"",
  "resolution": "N/A",
  "contrast": "N/A",
  "operation_temperature": "N/A",
  "power_supply": "N/A",
  "sunlight_readable": "N/A",
  "antiglare": "N/A",
  "low_power_consumption": "N/A",
  "high_brightness": "N/A",
  "wide_temperature": "N/A",
  "fast_response": "N/A",
  "screen_features": []
}
We have the opportunity to refine the existing answer(only if needed) with some more context below.
------------
Model numbers
C4-TT7-1-BL, C4-TT7-1-WH, C4-TT7-1-RD
Features
Screen
Resolution: 1280 × 800
Capacitive touch
Camera: 720p
Network
Ethernet or WiFi (802.11 g/n [2.4 GHz])
Notes: 
(1) 802.11b is not supported for Video Intercom. 
(2) Wireless-N is recommended for Video Intercom. The more devices that Video Intercom is broadcast to, the 
more response time and images are degraded.
Battery
3100 mAh Li-ion
Power supply
PoE (IEEE802.3af)
100VAC ~ 240VAC, 50-60 Hz
International power supply adapters included
Dock connections
• 
Ethernet
• 
PoE
• 
DC power
Mounting
Tabletop or portable
Environmental
Operating temperature
32 ~ 104˚F (0˚ ~ 40˚C)
Storage temperature
32 ~ 104˚F (0˚ ~ 40˚C)
Dimensions (W × H × D)
Touch screen: 7.0" × 5.1" × 0.75" (178 × 130 × 19 mm)
Dock: 4.2" × 2.3" × 3.1" (106 × 57 × 78 mm)
Weight
Touch screen: 0.7 lbs. (0.32 kg)
Dock: 0.8 lbs (0.36 kg)
Planning the installation
• Make sure the dock can be located near a power outlet and a strong WiFi signal.
------------
Given the new context, refine the original answer to better answer the question. If the context isn't useful, return the original answer.

> Finished chain.

> Entering new  chain...
Prompt after formatting:
The original question is as follows: What's the display specifications?
We have provided an existing answer: 

Answer: {
  "product_name": "Control4® T3-7 7\" Tabletop Touch Screen",
  "manufactured_date": "2015-10-09",
  "model_numbers": ["C4-TT7-1-BL", "C4-TT7-1-WH", "C4-TT7-1-RD"],
  "size_inch": "7.0\" × 5.1\" × 0.75\"",
  "resolution": "1280 × 800",
  "contrast": "N/A",
  "operation_temperature": "32 ~ 104˚F (0˚ ~ 40˚C)",
  "storage_temperature": "32 ~ 104˚F (0˚ ~ 40˚C)",
  "power_supply": "PoE (IEEE802.3af) 100VAC ~ 240VAC, 50-60 Hz",
  "sunlight_readable": "N/A",
  "antiglare": "N/A",
  "low_power_consumption": "N/A",
  "high_bright
We have the opportunity to refine the existing answer(only if needed) with some more context below.
------------
Control4® T3-7 7" Tabletop Touch Screen
DOC-00148-C 
2015-10-09 MSC
Copyright ©2015, Control4 Corporation. All rights reserved. Control4, the Control4 logo, the 4-ball logo, 4Store, 
4Sight, Control My Home, Everyday Easy, and Mockupancy are registered trademarks or trademarks of Control4 
Corporation in the United States and/or other countries. All other names and brands may be claimed as the 
property of their respective owners. All specifications subject to change without notice.
------------
Given the new context, refine the original answer to better answer the question. If the context isn't useful, return the original answer.

> Finished chain.

> Entering new  chain...
Prompt after formatting:
The original question is as follows: What's the display specifications?
We have provided an existing answer: 

Answer: {
  "product_name": "Control4® T3-7 7\" Tabletop Touch Screen",
  "manufactured_date": "2015-10-09",
  "model_numbers": ["C4-TT7-1-BL", "C4-TT7-1-WH", "C4-TT7-1-RD"],
  "size_inch": "7.0\" × 5.1\" × 0.75\"",
  "resolution": "1280 × 800",
  "contrast": "N/A",
  "operation_temperature": "32 ~ 104˚F (0˚ ~ 40˚C)",
  "storage_temperature": "32 ~ 104˚F (0˚ ~ 40˚C)",
  "power_supply": "PoE (IEEE802.3af) 100VAC ~ 240VAC, 50-60 Hz",
  "sunlight_readable": "N/A",
  "antiglare": "N/A",
  "low_power_consumption": "N/A",
  "high_bright
We have the opportunity to refine the existing answer(only if needed) with some more context below.
------------
Control4® T3 Series 7" Tabletop Touch Screen
The Control4® T3 Series Tabletop Touch Screen delivers always-on, dedicated, and mobile control over all the technology in your home 
or business. Featuring a gorgeous new tablet design and stunning high-resolution graphics, this portable screen looks beautiful whether 
on a kitchen countertop or in the theater on your lap. This model includes HD video intercom and crystal-clear audio intercom for 
convenient communications from room to room or with visitors at the door.
• Available in a 7" model, the T3 Series Portable Touch Screen provides dedicated, elegant, and mobile control of your home.
• HD camera, combined with speakers and microphone, provides the best video intercom experience yet.
• Crisp picture with two and a half times the resolution of previous models.
• Extremely fast and responsive—up to 16 times faster than our previous touch screens.
------------
Given the new context, refine the original answer to better answer the question. If the context isn't useful, return the original answer.

> Finished chain.

> Finished chain.
Tokens Used: 3432
    Prompt Tokens: 2465
    Completion Tokens: 967
Successful Requests: 4
Total Cost (USD): $0.06864

Result

Tokens Used: 3432
    Prompt Tokens: 2465
    Completion Tokens: 967
Successful Requests: 4
Total Cost (USD): $0.06864

Answer: {
  "product_name": "Control4® T3 Series 7\" Tabletop Touch Screen",
  "manufactured_date": "2015-10-09",
  "model_numbers": ["C4-TT7-1-BL", "C4-TT7-1-WH", "C4-TT7-1-RD"],
  "size_inch": "7.0\" × 5.1\" × 0.75\"",
  "resolution": "1280 × 800",
  "contrast": "N/A",
  "operation_temperature": "32 ~ 104˚F (0˚ ~ 40˚C)",
  "storage_temperature": "32 ~ 104˚F (0˚ ~ 40˚C)",
  "power_supply": "PoE (IEEE802.3af) 100VAC ~ 240VAC, 50-60 Hz",
  "sunlight_readable": "N/A",
  "antiglare": "N/A",
  "low_power_consumption": "N/A",
  "high_brightness

Suggestion:

No response

Lin-jun-xiang commented 1 year ago

llm=OpenAI(max_token=-1) can solved this problem.