run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai

Response content was truncated #660

Closed: willhamlam closed this 1 year ago

willhamlam commented 1 year ago

I followed the starter guide but found that a Chinese query gets a truncated response, while an English query gets a full response.

Here is my code:

from langchain.llms import OpenAI
from pathlib import Path
from llama_index import GPTSimpleVectorIndex, LLMPredictor, PromptHelper, download_loader
import os

# Set your OpenAI key
os.environ["OPENAI_API_KEY"] = ""

def construct_index():
  # Initialize the PDF loader and load the PDF file as documents
  PDFReader = download_loader("PDFReader")
  loader = PDFReader()
  documents = loader.load_data(file=Path('./duan.pdf'))

  # define prompt helper
  # set maximum input size
  max_input_size = 4096
  # set number of output tokens
  num_outputs = 1000
  # set maximum chunk overlap
  max_chunk_overlap = 20

  prompt_helper = PromptHelper(max_input_size, num_outputs, max_chunk_overlap)

  # define LLM
  llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003", max_tokens=num_outputs))

  # Build the index from the documents using OpenAI
  # and save it to disk as index.json
  index = GPTSimpleVectorIndex(
    documents, llm_predictor=llm_predictor, prompt_helper=prompt_helper
  )
  index.save_to_disk('index.json')
  print("create index.json success")

def ask_bot():
  # Build the index first if it doesn't exist yet, then query it
  if not os.path.exists('index.json'):
    construct_index()
  index = GPTSimpleVectorIndex.load_from_disk('index.json')
  # Query in Chinese: "First, summarize this paper"
  response = index.query("首先总结一下这篇论文", response_mode="compact")
  print(response)

ask_bot()

[Screenshot: the response to the Chinese query is cut off mid-sentence]

Should I set the max tokens or other settings?

willhamlam commented 1 year ago

I already changed max_tokens from 200 to 1000, but the response output seems unchanged.

ashutoshsinha25 commented 1 year ago

I'm facing the same issue with English text responses.

jerryjliu commented 1 year ago

hi @willhamlam, if you're saving and loading from disk, you have to respecify llm_predictor during load-time or during the query call e.g.

index = GPTSimpleVectorIndex.load_from_disk('index.json', llm_predictor=llm_predictor)
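
For reference, a minimal sketch of both options against the original script; the query-time keyword follows the comment above and is an assumption, not verified against a specific release:

# Re-create the predictor with the desired max_tokens before loading
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003", max_tokens=1000))

# Option 1: pass the predictor at load time
index = GPTSimpleVectorIndex.load_from_disk('index.json', llm_predictor=llm_predictor)

# Option 2: pass the predictor with the query call (per the comment above)
response = index.query("首先总结一下这篇论文", response_mode="compact", llm_predictor=llm_predictor)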

Minweiwangaaaa commented 1 year ago

Same issue here.

ashutoshsinha25 commented 1 year ago

@jerryjliu thanks that helped

Disiok commented 1 year ago

Resolved

karottc commented 1 year ago

hi @willhamlam, if you're saving and loading from disk, you have to respecify llm_predictor during load-time or during the query call e.g.

index = GPTSimpleVectorIndex.load_from_disk('index.json', llm_predictor=llm_predictor)

Another error occurs here; the message is below. Has the library's API changed? I haven't found the new usage yet.

index = GPTSimpleVectorIndex.load_from_disk('index.json', llm_predictor=llm_predictor)
  File "/xxxx/python3.8/site-packages/llama_index/indices/base.py", line 352, in load_from_disk
    return cls.load_from_string(file_contents, **kwargs)
  File "/xxx/python3.8/site-packages/llama_index/indices/base.py", line 328, in load_from_string
    return cls.load_from_dict(result_dict, **kwargs)
  File "/xxx/python3.8/site-packages/llama_index/indices/vector_store/base.py", line 260, in load_from_dict
    return super().load_from_dict(result_dict, config_dict, **kwargs)
  File "/xxx/python3.8/site-packages/llama_index/indices/base.py", line 305, in load_from_dict
    return cls(index_struct=index_struct, docstore=docstore, **kwargs)
  File "/xxx/python3.8/site-packages/llama_index/indices/vector_store/vector_indices.py", line 94, in __init__
    super().__init__(
  File "/xx/lib/python3.8/site-packages/llama_index/indices/vector_store/base.py", line 58, in __init__
    super().__init__(
TypeError: __init__() got an unexpected keyword argument 'llm_predictor'

karottc commented 1 year ago


Already resolved, please check this post: https://github.com/jerryjliu/llama_index/issues/1033
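
For later versions where load_from_disk no longer accepts llm_predictor, the linked issue points to wrapping the predictor in a ServiceContext; a minimal sketch, assuming the 0.5.x-era signature:

from langchain.llms import OpenAI
from llama_index import GPTSimpleVectorIndex, LLMPredictor, ServiceContext

llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003", max_tokens=1000))
# The ServiceContext carries the predictor instead of a direct keyword argument
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)
index = GPTSimpleVectorIndex.load_from_disk('index.json', service_context=service_context)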

medinamaria90 commented 1 year ago

hi @willhamlam, if you're saving and loading from disk, you have to respecify llm_predictor during load-time or during the query call e.g.

index = GPTSimpleVectorIndex.load_from_disk('index.json', llm_predictor=llm_predictor)

Can you clarify? This gives me the following error: TypeError: BaseGPTIndex.__init__() got an unexpected keyword argument 'llm_predictor'

This is my code:

model = "gpt-3.5-turbo"

def construct_index(directory_path, model=model):
    max_input_size = 1000
    num_outputs = 250
    chunk_size_limit = 600
    max_chunk_overlap = 20
    if model == "text-davinci-003":
        llm_predictor = LLMPredictor(llm=OpenAI(temperature=0.4, model_name=model, max_tokens=num_outputs))
    else:
        llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0.4, model_name=model, max_tokens=num_outputs))
    prompt_helper = PromptHelper(max_input_size, num_outputs, max_chunk_overlap, chunk_size_limit=chunk_size_limit)
    documents = SimpleDirectoryReader(directory_path).load_data()
    service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, prompt_helper=prompt_helper)
    index = GPTSimpleVectorIndex.from_documents(documents, service_context=service_context)
    index.save_to_disk(f'static/docs/{model}.json')
    return index

def ask_bot(input_index, question, llm_predictor):
    index = GPTSimpleVectorIndex.load_from_disk(input_index, llm_predictor=llm_predictor)
    response = index.query(question, response_mode="compact")
    bot_answer = "\nChatBot: \n\n" + response.response + "\n\n\n"
    return bot_answer

txgaritano commented 1 year ago

Before, I was able to get long responses with index.query, but it seems there was an update today, and now when I run query_engine.query the responses get truncated. Is there something that needs to be changed in query_engine.query to get longer responses?
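
For what it's worth, a sketch of the same fix under the newer query-engine API, assuming the 0.6.x-era storage-context loading path rather than load_from_disk; names like ./storage are placeholders:

from langchain.llms import OpenAI
from llama_index import LLMPredictor, ServiceContext, StorageContext, load_index_from_storage

# Give the LLM enough room for long answers, then attach it at load time
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003", max_tokens=1000))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)

storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context, service_context=service_context)
query_engine = index.as_query_engine(response_mode="compact")
response = query_engine.query("Summarize this paper")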

Jafo232 commented 1 year ago

index = GPTSimpleVectorIndex.load_from_disk('index.json')

Try:

llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="gpt-3.5-turbo", max_tokens=250))

index = GPTSimpleVectorIndex.load_from_disk('index.json', llm_predictor=llm_predictor)

ethayu commented 1 year ago

Hello, I did the same, but I'm getting the following error:

ValueError: llm must be an instance of langchain.llms.base.LLM

My code:

llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0.7, model_name="gpt-3.5-turbo", max_tokens=256))
index = GPTSimpleVectorIndex.load_from_disk('index.json', llm_predictor=llm_predictor)
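
That ValueError suggests this version of LLMPredictor type-checks for a langchain LLM subclass, and ChatOpenAI is a chat model rather than an LLM. One possible workaround, an assumption rather than a confirmed fix: langchain's OpenAIChat wrapper, which does subclass LLM. Upgrading llama_index may also help, since newer releases accept chat models directly.

from langchain.llms import OpenAIChat
from llama_index import GPTSimpleVectorIndex, LLMPredictor

# OpenAIChat exposes the chat endpoint through the plain-LLM interface,
# so it passes the isinstance check that ChatOpenAI fails here
llm_predictor = LLMPredictor(llm=OpenAIChat(temperature=0.7, model_name="gpt-3.5-turbo", max_tokens=256))
index = GPTSimpleVectorIndex.load_from_disk('index.json', llm_predictor=llm_predictor)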