langchain-ai / langchain

πŸ¦œπŸ”— Build context-aware reasoning applications
https://python.langchain.com
MIT License
94.28k stars 15.24k forks

JSONDecodeError on Cookbook/Semi_structured_multi_modal_RAG_LLaMA2.ipynb #12180

Closed MrLinWang closed 1 year ago

MrLinWang commented 1 year ago

System Info

Python Version: 3.10.4
LangChain Version: 0.0.320
OS: Ubuntu 18.04

Who can help?

@hwchase17 @agola11

Reproduction

When running the official cookbook code...

from pydantic import BaseModel
from typing import Any, Optional
from unstructured.partition.pdf import partition_pdf

# Path to save images
path = "./Papers/LLaVA/"

# Get elements
raw_pdf_elements = partition_pdf(filename=path+"LLaVA.pdf",
                                 # Using pdf format to find embedded image blocks
                                 extract_images_in_pdf=True,
                                 # Use layout model (YOLOX) to get bounding boxes (for tables) and find titles
                                 # Titles are any sub-section of the document 
                                 infer_table_structure=True, 
                                 # Post processing to aggregate text once we have the title 
                                 chunking_strategy="by_title",
                                 # Chunking params to aggregate text blocks
                                 # Attempt to create a new chunk 3800 chars
                                 # Attempt to keep chunks > 2000 chars 
                                 # Hard max on chunks
                                 max_characters=4000, 
                                 new_after_n_chars=3800, 
                                 combine_text_under_n_chars=2000,
                                 image_output_dir_path=path
                                 )
# Create a dictionary to store counts of each type
category_counts = {}

for element in raw_pdf_elements:
    category = str(type(element))
    if category in category_counts:
        category_counts[category] += 1
    else:
        category_counts[category] = 1

# Unique_categories will have unique elements
# TableChunk if Table > max chars set above
unique_categories = set(category_counts.keys())
category_counts

class Element(BaseModel):
    type: str
    text: Any

# Categorize by type
categorized_elements = []
for element in raw_pdf_elements:
    if "unstructured.documents.elements.Table" in str(type(element)):
        categorized_elements.append(Element(type="table", text=str(element)))
    elif "unstructured.documents.elements.CompositeElement" in str(type(element)):
        categorized_elements.append(Element(type="text", text=str(element)))

# Tables
table_elements = [e for e in categorized_elements if e.type == "table"]
print(len(table_elements))

# Text
text_elements = [e for e in categorized_elements if e.type == "text"]
print(len(text_elements))

from langchain.chat_models import ChatOllama
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser

# Prompt 
prompt_text="""You are an assistant tasked with summarizing tables and text. \ 
Give a concise summary of the table or text. Table or text chunk: {element} """
prompt = ChatPromptTemplate.from_template(prompt_text) 

# Summary chain 
model = ChatOllama(model="llama2:13b-chat")
summarize_chain = {"element": lambda x:x} | prompt | model | StrOutputParser()

# Apply to text
texts = [i.text for i in text_elements if i.text != ""]
text_summaries = summarize_chain.batch(texts, {"max_concurrency": 5})

The following error was returned:

---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
File ~/anaconda3/envs/langchain/lib/python3.8/site-packages/requests/models.py:971, in Response.json(self, **kwargs)
    970 try:
--> 971     return complexjson.loads(self.text, **kwargs)
    972 except JSONDecodeError as e:
    973     # Catch JSON-related errors and raise as requests.JSONDecodeError
    974     # This aliases json.JSONDecodeError and simplejson.JSONDecodeError

File ~/anaconda3/envs/langchain/lib/python3.8/json/__init__.py:357, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    354 if (cls is None and object_hook is None and
    355         parse_int is None and parse_float is None and
    356         parse_constant is None and object_pairs_hook is None and not kw):
--> 357     return _default_decoder.decode(s)
    358 if cls is None:

File ~/anaconda3/envs/langchain/lib/python3.8/json/decoder.py:337, in JSONDecoder.decode(self, s, _w)
    333 """Return the Python representation of ``s`` (a ``str`` instance
    334 containing a JSON document).
    335 
    336 """
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338 end = _w(s, end).end()

File ~/anaconda3/envs/langchain/lib/python3.8/json/decoder.py:355, in JSONDecoder.raw_decode(self, s, idx)
    354 except StopIteration as err:
--> 355     raise JSONDecodeError("Expecting value", s, err.value) from None
    356 return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

JSONDecodeError                           Traceback (most recent call last)
Cell In[6], line 3
      1 # Apply to text
      2 texts = [i.text for i in text_elements if i.text != ""]
----> 3 text_summaries = summarize_chain.batch(texts, {"max_concurrency": 5})

File ~/chatpdf-langchain/langchain/libs/langchain/langchain/schema/runnable/base.py:1287, in RunnableSequence.batch(self, inputs, config, return_exceptions, **kwargs)
   1285     else:
   1286         for i, step in enumerate(self.steps):
-> 1287             inputs = step.batch(
   1288                 inputs,
   1289                 [
   1290                     # each step a child run of the corresponding root run
   1291                     patch_config(
   1292                         config, callbacks=rm.get_child(f"seq:step:{i+1}")
   1293                     )
   1294                     for rm, config in zip(run_managers, configs)
   1295                 ],
   1296             )
   1298 # finish the root runs
   1299 except BaseException as e:

File ~/chatpdf-langchain/langchain/libs/langchain/langchain/schema/runnable/base.py:323, in Runnable.batch(self, inputs, config, return_exceptions, **kwargs)
    320     return cast(List[Output], [invoke(inputs[0], configs[0])])
    322 with get_executor_for_config(configs[0]) as executor:
--> 323     return cast(List[Output], list(executor.map(invoke, inputs, configs)))

File ~/anaconda3/envs/langchain/lib/python3.8/concurrent/futures/_base.py:619, in Executor.map.<locals>.result_iterator()
    616 while fs:
    617     # Careful not to keep a reference to the popped future
    618     if timeout is None:
--> 619         yield fs.pop().result()
    620     else:
    621         yield fs.pop().result(end_time - time.monotonic())

File ~/anaconda3/envs/langchain/lib/python3.8/concurrent/futures/_base.py:444, in Future.result(self, timeout)
    442     raise CancelledError()
    443 elif self._state == FINISHED:
--> 444     return self.__get_result()
    445 else:
    446     raise TimeoutError()

File ~/anaconda3/envs/langchain/lib/python3.8/concurrent/futures/_base.py:389, in Future.__get_result(self)
    387 if self._exception:
    388     try:
--> 389         raise self._exception
    390     finally:
    391         # Break a reference cycle with the exception in self._exception
    392         self = None

File ~/anaconda3/envs/langchain/lib/python3.8/concurrent/futures/thread.py:57, in _WorkItem.run(self)
     54     return
     56 try:
---> 57     result = self.fn(*self.args, **self.kwargs)
     58 except BaseException as exc:
     59     self.future.set_exception(exc)

File ~/chatpdf-langchain/langchain/libs/langchain/langchain/schema/runnable/base.py:316, in Runnable.batch.<locals>.invoke(input, config)
    314         return e
    315 else:
--> 316     return self.invoke(input, config, **kwargs)

File ~/chatpdf-langchain/langchain/libs/langchain/langchain/chat_models/base.py:142, in BaseChatModel.invoke(self, input, config, stop, **kwargs)
    131 def invoke(
    132     self,
    133     input: LanguageModelInput,
   (...)
    137     **kwargs: Any,
    138 ) -> BaseMessage:
    139     config = config or {}
    140     return cast(
    141         ChatGeneration,
--> 142         self.generate_prompt(
    143             [self._convert_input(input)],
    144             stop=stop,
    145             callbacks=config.get("callbacks"),
    146             tags=config.get("tags"),
    147             metadata=config.get("metadata"),
    148             run_name=config.get("run_name"),
    149             **kwargs,
    150         ).generations[0][0],
    151     ).message

File ~/chatpdf-langchain/langchain/libs/langchain/langchain/chat_models/base.py:459, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, **kwargs)
    451 def generate_prompt(
    452     self,
    453     prompts: List[PromptValue],
   (...)
    456     **kwargs: Any,
    457 ) -> LLMResult:
    458     prompt_messages = [p.to_messages() for p in prompts]
--> 459     return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)

File ~/chatpdf-langchain/langchain/libs/langchain/langchain/chat_models/base.py:349, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, **kwargs)
    347         if run_managers:
    348             run_managers[i].on_llm_error(e)
--> 349         raise e
    350 flattened_outputs = [
    351     LLMResult(generations=[res.generations], llm_output=res.llm_output)
    352     for res in results
    353 ]
    354 llm_output = self._combine_llm_outputs([res.llm_output for res in results])

File ~/chatpdf-langchain/langchain/libs/langchain/langchain/chat_models/base.py:339, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, **kwargs)
    336 for i, m in enumerate(messages):
    337     try:
    338         results.append(
--> 339             self._generate_with_cache(
    340                 m,
    341                 stop=stop,
    342                 run_manager=run_managers[i] if run_managers else None,
    343                 **kwargs,
    344             )
    345         )
    346     except BaseException as e:
    347         if run_managers:

File ~/chatpdf-langchain/langchain/libs/langchain/langchain/chat_models/base.py:492, in BaseChatModel._generate_with_cache(self, messages, stop, run_manager, **kwargs)
    488     raise ValueError(
    489         "Asked to cache, but no cache found at `langchain.cache`."
    490     )
    491 if new_arg_supported:
--> 492     return self._generate(
    493         messages, stop=stop, run_manager=run_manager, **kwargs
    494     )
    495 else:
    496     return self._generate(messages, stop=stop, **kwargs)

File ~/chatpdf-langchain/langchain/libs/langchain/langchain/chat_models/ollama.py:98, in ChatOllama._generate(self, messages, stop, run_manager, **kwargs)
     80 """Call out to Ollama's generate endpoint.
     81 
     82 Args:
   (...)
     94         ])
     95 """
     97 prompt = self._format_messages_as_text(messages)
---> 98 final_chunk = super()._stream_with_aggregation(
     99     prompt, stop=stop, run_manager=run_manager, verbose=self.verbose, **kwargs
    100 )
    101 chat_generation = ChatGeneration(
    102     message=AIMessage(content=final_chunk.text),
    103     generation_info=final_chunk.generation_info,
    104 )
    105 return ChatResult(generations=[chat_generation])

File ~/chatpdf-langchain/langchain/libs/langchain/langchain/llms/ollama.py:156, in _OllamaCommon._stream_with_aggregation(self, prompt, stop, run_manager, verbose, **kwargs)
    147 def _stream_with_aggregation(
    148     self,
    149     prompt: str,
   (...)
    153     **kwargs: Any,
    154 ) -> GenerationChunk:
    155     final_chunk: Optional[GenerationChunk] = None
--> 156     for stream_resp in self._create_stream(prompt, stop, **kwargs):
    157         if stream_resp:
    158             chunk = _stream_response_to_generation_chunk(stream_resp)

File ~/chatpdf-langchain/langchain/libs/langchain/langchain/llms/ollama.py:140, in _OllamaCommon._create_stream(self, prompt, stop, **kwargs)
    138 response.encoding = "utf-8"
    139 if response.status_code != 200:
--> 140     optional_detail = response.json().get("error")
    141     raise ValueError(
    142         f"Ollama call failed with status code {response.status_code}."
    143         f" Details: {optional_detail}"
    144     )
    145 return response.iter_lines(decode_unicode=True)

File ~/anaconda3/envs/langchain/lib/python3.8/site-packages/requests/models.py:975, in Response.json(self, **kwargs)
    971     return complexjson.loads(self.text, **kwargs)
    972 except JSONDecodeError as e:
    973     # Catch JSON-related errors and raise as requests.JSONDecodeError
    974     # This aliases json.JSONDecodeError and simplejson.JSONDecodeError
--> 975     raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

After attempting:

# Summary chain 
model = ChatOllama(model="llama2:13b-chat")
summarize_chain = {"element": lambda x:x} | prompt | model
texts = [i.text for i in text_elements if i.text != ""]
text_summaries = summarize_chain.batch(texts)

The same error as before was returned.

It seems that the error is occurring in the ChatOllama model.
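
One way to narrow this down is to call the Ollama server directly and look at the raw response body, since the traceback shows the failure happens when response.json() is called on a non-200 reply inside _create_stream. A minimal sketch (assuming Ollama is running on its default http://localhost:11434 and the llama2:13b-chat tag from the code above):

import requests

# Assumption: Ollama is listening on the default port; adjust if yours differs.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama2:13b-chat", "prompt": "Say hello", "stream": False},
)

# Print the status and the raw body before trying to decode it as JSON.
# If the model tag is missing or the server errors out, the body may not be
# JSON at all, which is exactly what makes response.json() raise JSONDecodeError.
print(resp.status_code)
print(resp.text[:500])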

Expected behavior

I am attempting to reproduce the results shown in Cookbook/Semi_structured_multi_modal_RAG_LLaMA2.ipynb.

dosubot[bot] commented 1 year ago

πŸ€–

Hello,

Thank you for reaching out and providing detailed information about the issue you're facing. The JSONDecodeError you're encountering typically occurs when there's an issue with the JSON data being parsed. In this case, it seems to be happening when the ChatOllama model is trying to process the response from the Ollama's generate endpoint.

Given that you're using LangChain version 0.0.320, I would first recommend updating to the latest version, which is 0.0.321. There might have been bug fixes or improvements that could potentially resolve the issue you're facing.

You can update LangChain by running the following command:

pip install --upgrade langchain

After updating, please try running your code again to see if the issue persists.

If the error still occurs, it would be helpful to have a bit more information to diagnose the issue. Could you please provide the full traceback of the error? This would give us more context about where exactly the error is happening.

Additionally, it would be helpful to know the exact input you're passing to the ChatOllama model. If there's an issue with the input data, it could potentially cause a JSONDecodeError.
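
For example, you could invoke the chain on one element at a time instead of batch(), so the failing input (if any) is easy to spot. A quick sketch using the summarize_chain and texts from your snippet:

# Try inputs one by one so the element that triggers the error is obvious.
for i, t in enumerate(texts[:5]):
    try:
        print(i, summarize_chain.invoke(t)[:80])
    except Exception as exc:
        print(f"element {i} failed: {exc!r}")
        print(repr(t[:200]))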

Lastly, please ensure that your Python environment has all the necessary dependencies installed and up-to-date. The ChatOllama model has several dependencies, as mentioned in the context above.

Please let me know if updating LangChain and checking the input data and dependencies help resolve the issue. If not, the additional information requested will help us further investigate the problem.

Best regards.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a πŸ‘ if this is helpful and πŸ‘Ž if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

MrLinWang commented 1 year ago

@dosu-bot, the issue still persists after upgrading langchain from 0.0.320 to the latest version, 0.0.321.

dosubot[bot] commented 1 year ago

πŸ€–

Hello,

Thank you for reaching out and providing detailed information about the issue you're facing. The JSONDecodeError you're encountering typically occurs when there's an issue with the JSON data that's being parsed. In this case, it seems like the error is occurring within the ChatOllama model.

Here are some steps you can take to debug and potentially resolve the issue:

  1. Check the output variable in the format_response_payload method: This method is responsible for formatting the response from the model. You can add a print statement before the json.loads(output) line to see what the output variable contains. If it's not a valid JSON string, this could be the source of the error.
def format_response_payload(self, output: bytes) -> str:
    """Formats response"""
    print(output)  # Add this line to debug
    return json.loads(output)["output"]
  2. Check the response_payload variable in the _call method: This method is responsible for making the call to the model and receiving the response. You can add a print statement before the self.content_formatter.format_response_payload(response_payload) line to see what the response_payload variable contains.
def _call(
    self,
    messages: List[BaseMessage],
    stop: Optional[List[str]] = None,
    run_manager: Optional[CallbackManagerForLLMRun] = None,
    **kwargs: Any,
) -> str:
    """Call out to an AzureML Managed Online endpoint."""
    _model_kwargs = self.model_kwargs or {}

    request_payload = self.content_formatter._format_request_payload(
        messages, _model_kwargs
    )
    response_payload = self.http_client.call(request_payload, **kwargs)
    print(response_payload)  # Add this line to debug
    generated_text = self.content_formatter.format_response_payload(
        response_payload
    )
    return generated_text
  3. Check the structure of the JSON data: The ChatOllama model is expected to return JSON data with the following structure:
{
    "response": "<response text>",
    "done": <boolean>
}

If the JSON data returned by the model doesn't match this structure, it could be causing the error. You can check this by printing out the keys of the JSON object in the format_response_payload method.

def format_response_payload(self, output: bytes) -> str:
    """Formats response"""
    json_obj = json.loads(output)
    print(json_obj.keys())  # Add this line to debug
    return json_obj["output"]

If the "output" key does not exist in the JSON object, you'll need to modify the code to handle this case.

Please try these debugging steps and let me know what you find. If the issue persists, we can explore further solutions.

Best, Dosu


dosubot[bot] commented 1 year ago

πŸ€–

Hello,

Thank you for reaching out and providing detailed information about the issue you're facing. The JSONDecodeError you're encountering typically occurs when there's an issue with the JSON data that's being parsed. In this case, it seems like the error could be occurring in the ChatOllama model, specifically in the format_response_payload method of the LlamaContentFormatter class or in the _call method of the AzureMLChatOnlineEndpoint class.

Here are some steps to debug and resolve the issue:

  1. Check the output variable in the format_response_payload method. Make sure it is a valid JSON string. You can print it out before the json.loads(output) line to see what it contains.
def format_response_payload(self, output: bytes) -> str:
    """Formats response"""
    print(output)  # Add this line to debug
    return json.loads(output)["output"]
  2. Check the response_payload variable in the _call method. Make sure it is a valid JSON string. You can print it out before the self.content_formatter.format_response_payload(response_payload) line to see what it contains.
def _call(
    self,
    messages: List[BaseMessage],
    stop: Optional[List[str]] = None,
    run_manager: Optional[CallbackManagerForLLMRun] = None,
    **kwargs: Any,
) -> str:
    """Call out to an AzureML Managed Online endpoint.
    Args:
        messages: The messages in the conversation with the chat model.
        stop: Optional list of stop words to use when generating.
    Returns:
        The string generated by the model.
    Example:
        .. code-block:: python
            response = azureml_model("Tell me a joke.")
    """
    _model_kwargs = self.model_kwargs or {}

    request_payload = self.content_formatter._format_request_payload(
        messages, _model_kwargs
    )
    response_payload = self.http_client.call(request_payload, **kwargs)
    print(response_payload)  # Add this line to debug
    generated_text = self.content_formatter.format_response_payload(
        response_payload
    )
    return generated_text
  3. If the JSON string is not valid, you need to check where it is coming from and fix the issue there. It could be that the AzureML Managed Online endpoint is returning an invalid JSON string, or there could be an issue with how the request_payload is being created in the _format_request_payload method.

  4. If the JSON string is valid, but the json.loads(output)["output"] line is still throwing an error, it could be that the JSON object does not have an "output" key. You can check this by printing out the keys of the JSON object.

def format_response_payload(self, output: bytes) -> str:
    """Formats response"""
    json_obj = json.loads(output)
    print(json_obj.keys())  # Add this line to debug
    return json_obj["output"]

If the "output" key does not exist in the JSON object, you need to modify the code to handle this case.

Please try these debugging steps and let me know if you're able to resolve the issue or if you need further assistance.


MrLinWang commented 1 year ago

I tried using the ChatOpenAI API, and it worked fine. It seems like there might be a bug in the ChatOllama API. Here is the code I used with the ChatOpenAI API:

from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser
prompt_text="""You are an assistant tasked with summarizing tables and text. \ 
Give a concise summary of the table or text. Table or text chunk: {element} """
prompt = ChatPromptTemplate.from_template(prompt_text) 

model = ChatOpenAI()
summarize_chain = {"element": lambda x:x} | prompt | model | StrOutputParser()

# Apply to text
texts = [i.text for i in text_elements if i.text != ""]
text_summaries = summarize_chain.batch(texts, {"max_concurrency": 5})

bearbee123 commented 9 months ago

Hello, I am using code similar to yours, but it keeps failing to download model files from Hugging Face at runtime, and even when it succeeds the execution time is very long. How can I solve this problem?

urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /microsoft/table-transformer-structure-recognition/resolve/main/config.json (Caused by SSLError(SSLZeroReturnError(6, 'TLS/SSL connection has been closed (EOF) (_ssl.c:1131)')))
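
A possible workaround (a sketch, assuming the failure is the one-time download of the table-transformer weights that infer_table_structure=True triggers) is to pre-fetch the model on a machine/network that can reach huggingface.co, so later runs only hit the local cache:

from huggingface_hub import snapshot_download

# Assumption: this is the checkpoint unstructured pulls for infer_table_structure=True,
# as suggested by the URL in the error message above.
snapshot_download("microsoft/table-transformer-structure-recognition")

# Once the files are cached you can also force offline mode on later runs:
# export HF_HUB_OFFLINE=1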

essenciary commented 9 months ago

In case people are struggling with JSONDecodeError: Expecting value: line 1 column 1 (char 0): in my case the issue was that the Ollama API was returning a response body that started with a space before the opening {. All I had to do was call .strip() on the response before attempting to JSON-decode it.
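
In code, the idea is roughly this (a hypothetical helper, not part of LangChain), which also surfaces the raw body when decoding still fails:

import json

def decode_body(raw: str) -> dict:
    """Strip stray whitespace before decoding, and show the raw body on failure."""
    try:
        return json.loads(raw.strip())
    except json.JSONDecodeError:
        # The first characters usually reveal what the server actually sent
        # (an HTML error page, a plain-text message, an empty body, ...).
        raise ValueError(f"Non-JSON response from Ollama: {raw[:200]!r}")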