langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License
94.19k stars 15.23k forks source link

deepcopy is extremely slow. Can we define a copy function? #21001

Closed Huarong closed 5 months ago

Huarong commented 6 months ago

Checked other resources

Example Code

No particular code.

Error Message and Stack Trace (if applicable)

No response

Description

copy.deepcopy is known to be extremely slow. When we create complex agents in langgraph, it slows things down a lot.

A solution is to define a function to copy only what necessary.

https://github.com/langchain-ai/langchain/blob/804390ba4bcc306b90cb6d75b7f01a4231ab6463/libs/core/langchain_core/tracers/log_stream.py#L105

https://github.com/langchain-ai/langchain/blob/804390ba4bcc306b90cb6d75b7f01a4231ab6463/libs/core/langchain_core/tracers/log_stream.py#L590

System Info

System Information

OS: Darwin OS Version: Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:34 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8103 Python Version: 3.11.2 (main, Feb 21 2024, 12:24:36) [Clang 15.0.0 (clang-1500.1.0.2.5)]

Package Information

langchain_core: 0.1.46 langchain: 0.1.16 langchain_community: 0.0.34 langsmith: 0.1.22 langchain_openai: 0.0.6 langchain_text_splitters: 0.0.1 langgraph: 0.0.39

Packages not installed (Not Necessarily a Problem)

The following packages were not found:

langserve

eyurtsev commented 6 months ago

@Huarong could you provide benchmarking results that provide that this is an actual bottleneck? Deepcopy may be slow, but it's a bit surprising that this would be a bottleneck at all given the latency of LLMs.

Are you using this with astream_log or astream_events API?

Huarong commented 6 months ago

@Huarong could you provide benchmarking results that provide that this is an actual bottleneck? Deepcopy may be slow, but it's a bit surprising that this would be a bottleneck at all given the latency of LLMs.

Are you using this with astream_log or astream_events API?

@eyurtsev I'm using with astream_events API. If I understand correctly, there are so many chunks in astream_events that deepcopy maybe called multiple times.

I'm using it in langgraph whose chunk will contain a state dict which maybe a litter big.

So, multiple times of calling deepcopy and big chunk with state maybe result in slow the whole run.

I will provide a benchmark code later.

eyurtsev commented 6 months ago

@Huarong great! We're likely going to re-work how the data for astream events is piped through to remove the dependency on json patch entirely. That should resolve your issue if you're relying on the astream_events API.

Nonetheless, if you are able to benchmark and show that it takes a substantial portion of the run time, let us know -- this will be good for us to know

Huarong commented 5 months ago

@eyurtsev The code to benchmark.

import asyncio
from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph import StateGraph
from langgraph.graph.message import add_messages
from langchain_openai import ChatOpenAI

class State(TypedDict):
    # Messages have the type "list". The `add_messages` function
    # in the annotation defines how this state key should be updated
    # (in this case, it appends messages to the list, rather than overwriting them)
    messages: Annotated[list, add_messages]

graph_builder = StateGraph(State)

llm = ChatOpenAI()

async def chatbot(state: State):
    return {"messages": [await llm.ainvoke(state["messages"])]}

def post_process(state: State):
    return state

def pre_process(state: State):
    return state

def start(state: State):
    return state

def end(state: State):
    return state

# The first argument is the unique node name
# The second argument is the function or object that will be called whenever
# the node is used.
graph_builder.add_node("Start", start)
graph_builder.add_node("pre_process", pre_process)
graph_builder.add_node("chatbot", chatbot)
graph_builder.add_node("post_process", post_process)
graph_builder.add_node("End", end)

graph_builder.add_edge("Start", "pre_process")
graph_builder.add_edge("pre_process", "chatbot")
graph_builder.add_edge("chatbot", "post_process")
graph_builder.add_edge("post_process", "End")

graph_builder.set_entry_point("Start")
graph_builder.set_finish_point("End")
graph = graph_builder.compile()

sun_story = """
Once upon a time, in a universe far, far away, the sun was the center of a bustling solar system. At its center was the star, a magnificent entity that shone with an almost blinding radiance. The star was surrounded by a family of celestial bodies: nine planets, each with its own unique features and characteristics, and countless asteroids, comets, and other smaller bodies.

The sun was a force of nature, a colossal fusion reactor, generating energy by converting the mass of hydrogen into helium. This process created the light and heat that bathed the solar system in a warm, golden glow. The planets revolved around the sun in their own orbits, each one experiencing unique variations in temperature and atmosphere due to their distance from the radiant center.

The sun was also a nurturing force. It gave life to the planets, providing the warmth and energy necessary for life to flourish. On a particular planet, Earth, the sun's rays played a crucial role in the delicate balance of nature. They enabled photosynthesis, the process by which plants convert sunlight into energy. This energy was then passed on to other living creatures through the food chain, sustaining ecosystems and allowing life to thrive.

However, the sun was not invincible. Over billions of years, it gradually grew larger and more luminous, a process known as stellar evolution. This change brought both benefits and challenges to the solar system. The increased light and warmth from the sun allowed new forms of life to emerge on Earth, while the growing size of the star threatened to engulf the inner planets in the distant future.

As the sun continued to age, it would eventually exhaust its supply of hydrogen and begin to evolve into a more massive and luminous star. It would eventually expand into a red giant, engulfing the inner planets and potentially scattering life throughout the solar system. This process was a reminder of the sun's immense power and the delicate balance of life that it helped maintain.

And so, the sun continued its journey through the cosmos, providing light and warmth to the celestial bodies that orbited around it. Its story was one of cosmic power and the delicate balance of life, a testament to the beauty and fragility of existence in the vast and mysterious universe.
"""
moon_story = """
Once upon a time, in a far-off land, there was a young boy named Luna who lived in a small village nestled beneath the light of the full moon. Luna was an orphan, having lost both of his parents at a very young age. Despite his difficult circumstances, Luna was an optimistic and kind-hearted boy who was always eager to help others.

One day, while tending to his family's small farm, Luna noticed that the moon seemed to be following him wherever he went. At first, he thought it was just his imagination, but the moon continued to shine brightly over him, even when he was inside his home. Luna was fascinated by this strange phenomenon and began to spend his evenings gazing up at the moon, wondering what it could possibly mean.

As the days passed, Luna noticed that the moon seemed to be growing larger and more luminous. He began to hear whispers and stories from the other villagers about the moon and the mystical creatures that lived within its light. Luna was skeptical at first, but the more he observed the moon, the more he felt a connection to it.

One night, while gazing up at the full moon, Luna felt a strange sensation in his chest. Suddenly, he found himself floating above the village, hovering in the air next to the moon. Luna was amazed and terrified at the same time, but as he looked closer at the moon, he saw a small door opening in its surface. Without hesitation, Luna stepped through the door and found himself in a beautiful, mystical world.

The moon was alive and filled with magical creatures and mystical beings. Luna explored this wondrous realm, meeting new friends and learning about the power of the moon and its connection to the natural world. Luna spent many days and nights in this enchanted world, learning from the creatures who dwelled there.

Eventually, Luna returned to his village on Earth, but he was forever changed by his experience. He became known as the "Moon Boy" and shared his stories with others, hoping to inspire them to look up and see the beauty of the moon and its connection to the world around us. Luna spent the rest of his days helping others and protecting the natural world, guided by the power of the moon.
"""

messages = [
    ("user", "hi"),
    ("assistant", "Hello! How are you today?"),
    ("user", "tell me a story about the sun"),
    ("assistant", sun_story),
    ("user", "another story about the moon"),
    ("assistant", moon_story),
    ("user", "what about the story of the star?"),
]

async def main():
    async for event in graph.astream_events({"messages": messages}, version="v1"):
        pass
        # print(event)

asyncio.run(main())

Profiling by pyinstrument:

pyinstrument test_basic_chatbot.py

profile report: image

System Information
------------------
> OS:  Darwin
> OS Version:  Darwin Kernel Version 23.2.0: Wed Nov 15 21:53:34 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T8103
> Python Version:  3.11.2 (main, Feb 21 2024, 12:24:36) [Clang 15.0.0 (clang-1500.1.0.2.5)]

Package Information
-------------------
> langchain_core: 0.1.52
> langchain: 0.1.17
> langchain_community: 0.0.37
> langsmith: 0.1.54
> langchain_openai: 0.1.6
> langchain_text_splitters: 0.0.1
> langchainhub: 0.1.15
> langgraph: 0.0.45

Packages not installed (Not Necessarily a Problem)
--------------------------------------------------
The following packages were not found:

> langserve

deepcopy spent about 2 seconds which is pretty slow. When I add more nodes to langgraph, define more complex State or longer history messages, it would speed much more time.

eyurtsev commented 5 months ago

We now have a stream events V2 which no longer relies on jsonpatch. Do you want to take it for a spin?

Huarong commented 5 months ago

We now have a stream events V2 which no longer relies on jsonpatch. Do you want to take it for a spin?

@EricLiclair It works great! Thank you.