langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

When I call the 'loader()' function of AsyncHtmlLoader, I receive an 'Event loop is closed' error after it completes execution. #10086

Closed jxyer closed 7 months ago

jxyer commented 10 months ago

System Info

LangChain: 0.0.278, Python: 3.10, OS: Windows 10

Reproduction

I wrote the following code with reference to this link:

from langchain.document_loaders import AsyncHtmlLoader

urls = ['https://python.langchain.com/docs/use_cases/web_scraping#asynchtmlloader']
loader = AsyncHtmlLoader(urls)
doc = loader.load()
print(doc)

Running it produces the following error:

Exception ignored in: <function _ProactorBasePipeTransport.__del__ at 0x0000023EFFD45900>
Traceback (most recent call last):
  File "C:\Users\97994\AppData\Local\Programs\Python\Python310\lib\asyncio\proactor_events.py", line 116, in __del__
    self.close()
  File "C:\Users\97994\AppData\Local\Programs\Python\Python310\lib\asyncio\proactor_events.py", line 108, in close
    self._loop.call_soon(self._call_connection_lost, None)
  File "C:\Users\97994\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 750, in call_soon
    self._check_closed()
  File "C:\Users\97994\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 515, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed

Expected behavior

None

dosubot[bot] commented 10 months ago

🤖

Hello,

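The 'Event loop is closed' error you're encountering is likely due to the way the asyncio event loop is handled in Python 3.10 on Windows. Since Python 3.8, the default event loop policy on Windows is WindowsProactorEventLoopPolicy; before that it was WindowsSelectorEventLoopPolicy. With the proactor loop, a transport that is garbage-collected after its loop has been closed calls close() from __del__, and that call raises 'Event loop is closed'. This matches the _ProactorBasePipeTransport.__del__ frame in your traceback. Because the message starts with 'Exception ignored in', the exception is reported during cleanup and is not propagated to your code, which is why print(doc) still runs.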

In the load method of the AsyncHtmlLoader class, the asyncio.run function is used to run the fetch_all coroutine. This function creates a new event loop, runs the given coroutine, then closes the loop. Each call gets a fresh loop, so calling load several times in the same program is not a problem in itself. The error appears when something created during the run (here, a proactor transport) is finalized only after that loop has already been closed.
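The behaviour of asyncio.run described above can be checked with a small standalone sketch. It does not use LangChain at all; the coroutine name current_loop is made up for illustration.

```python
import asyncio


async def current_loop() -> asyncio.AbstractEventLoop:
    # Return the loop that asyncio.run() created for this particular call.
    return asyncio.get_running_loop()


# Each asyncio.run() call creates its own loop and closes it on exit.
first = asyncio.run(current_loop())
second = asyncio.run(current_loop())

print(first is second)     # False: the two calls used different loop objects
print(first.is_closed())   # True: the first loop was closed when run() returned
print(second.is_closed())  # True: so was the second
```

Both calls succeed, which suggests that repeated calls are not what triggers the error; it is objects that outlive a closed loop.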

One thing you can try is to create and manage the event loop manually. Here's how the load method could be modified to do this:

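def load(self) -> List[Document]:
    """Load text from the url(s) in web_path."""
    # Create a dedicated event loop instead of relying on asyncio.run().
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        # Run fetch_all to completion on this loop.
        results = loop.run_until_complete(self.fetch_all(self.web_paths))
    finally:
        # Close the loop even if fetching raised an exception.
        loop.close()

    # Wrap each page's text in a Document, recording its source URL.
    docs = []
    for i, text in enumerate(results):
        metadata = {"source": self.web_paths[i]}
        docs.append(Document(page_content=text, metadata=metadata))

    return docs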

In this modified version, a new event loop is created with asyncio.new_event_loop and set as the current event loop with asyncio.set_event_loop. The loop.run_until_complete method then runs the fetch_all coroutine. Unlike asyncio.run, it does not close the loop itself, so the loop is closed explicitly with loop.close in a finally block, which runs even if an error occurs. Note that the loop is still closed before load returns, so a transport that is finalized afterwards can still trigger the same message. Treat this as something to try, not as a guaranteed fix.
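A different workaround that is often tried for this Windows-specific message is to switch back to the selector event loop policy before calling loader.load(). This is only a sketch under assumptions: the helper name use_selector_loop_on_windows is hypothetical, nothing in this thread confirms that it resolves the issue, and the policy API used here is deprecated in recent Python releases.

```python
import asyncio
import sys


def use_selector_loop_on_windows() -> bool:
    """Switch to the selector event loop policy on Windows; do nothing elsewhere.

    Hypothetical helper for illustration. Returns True if the policy was changed.
    """
    if sys.platform.startswith("win"):
        # WindowsSelectorEventLoopPolicy only exists on Windows builds of Python,
        # so it is referenced only inside this platform check.
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
        return True
    return False


# Call this once at program start, before loader.load() creates an event loop.
switched = use_selector_loop_on_windows()
```

The selector loop does not use proactor transports, so the _ProactorBasePipeTransport.__del__ path from the traceback is not involved. The trade-off is that the selector loop on Windows does not support subprocesses.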

Please let me know if this resolves your issue or if you need further assistance.

Best, Dosu


dosubot[bot] commented 7 months ago

Hi, @jxyer! I'm Dosu, and I'm here to help the LangChain team manage our backlog. I wanted to let you know that we are marking this issue as stale.

Based on my understanding, the issue you reported is about the 'loader()' function of AsyncHtmlLoader throwing an 'Event loop is closed' error after completing execution. There hasn't been any response or resolution to the issue yet. However, I have provided a detailed response suggesting a fix for the issue. I recommend manually creating and managing the event loop in the 'load' method of the AsyncHtmlLoader class.

Now, we would like to know if this issue is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself. If we don't receive any response within 7 days, the issue will be automatically closed.

Thank you for your understanding and contribution to the LangChain project. If you have any further questions or concerns, please don't hesitate to reach out.