clemlesne / scrape-it-now

Web scraper made for AI and simplicity in mind. It runs as a CLI that can be parallelized and outputs high-quality markdown content.
Apache License 2.0

Asyncio error `Task was destroyed but it is pending!` #15

Open clemlesne opened 2 months ago

clemlesne commented 2 months ago

Roughly once per minute (with 5 workers), this error shows up in the logs.

Impact:

None observed so far. Needs investigation.

Short logs:

ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-86588' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>

Related:

Full logs:

ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-86588' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-86623' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-86711' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-86727' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-86739' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-86801' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-86845' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-86874' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-86880' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-86945' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-87002' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-87017' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-87024' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-87096' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-87126' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-87182' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-87197' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-87206' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-87254' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-87356' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task pending name='Task-87365' coro=<Page._on_route() running at playwright/_impl/_page.py:282> wait_for=<Future pending cb=[Task.task_wakeup()]> cb=[AsyncIOEventEmitter._emit_run.<locals>.callback() at pyee/asyncio.py:69]>

d-balaskas commented 2 months ago

We faced the same error as well this week. Do you have any clue why this happens?

clemlesne commented 1 month ago

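Although it is logged at ERROR level, this is effectively a warning: asyncio emits it when a task object is destroyed (garbage-collected) while it is still pending, meaning the task never ran to completion and was never cancelled.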

This seems to occur when the page or browser is closed while a load is still in progress (background calls, Ajax, API requests, deferred loading, etc.).

Since the application filters every network call (ad blocking and network monitoring) through an asynchronous route handler, which is the `Page._on_route()` coroutine shown in the logs, these handler tasks can be cut off abruptly.
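
To illustrate the mechanism, here is a minimal sketch in plain asyncio, without Playwright. The names `wait_forever` and `reproduce` are hypothetical, and the sketch only mimics a handler stuck on a future that never resolves. It is an assumption that the Playwright case follows the same path. A custom exception handler collects the message that asyncio would otherwise log.

```python
import asyncio
import gc


async def wait_forever() -> None:
    # Hypothetical stand-in for a route handler awaiting a reply that never comes.
    await asyncio.get_running_loop().create_future()


def reproduce() -> list[str]:
    messages: list[str] = []
    loop = asyncio.new_event_loop()
    # Collect what asyncio would otherwise log at ERROR level.
    loop.set_exception_handler(lambda _loop, ctx: messages.append(ctx["message"]))
    try:
        task = loop.create_task(wait_forever())
        # Run one short loop cycle so the task starts and blocks on its future.
        loop.run_until_complete(asyncio.sleep(0))
        # Drop the only strong reference; the task and its future now form
        # an unreachable cycle that the garbage collector can reclaim.
        del task
        gc.collect()
    finally:
        loop.close()
    return messages


if __name__ == "__main__":
    print(reproduce())
```

The event loop only keeps weak references to tasks, so a pending task that nothing else references can be collected at any time, and the message is reported at that point.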

I don’t want to wrap this in a broad try/except block, because the same error could originate from other parts of the code. Swallowing it everywhere would make debugging harder and hide the underlying issues.
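
One narrower option, sketched here as an assumption rather than something the project does today, would be a logging filter on the `asyncio` logger that drops only records mentioning both this message and `Page._on_route`. The class name `PendingRouteTaskFilter` is hypothetical. This hides the log noise only. It does not address the pending tasks themselves, and the same message coming from any other coroutine still gets through.

```python
import logging


class PendingRouteTaskFilter(logging.Filter):
    """Drop only 'destroyed but pending' records that mention Page._on_route."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        is_route_noise = (
            "Task was destroyed but it is pending!" in message
            and "Page._on_route" in message
        )
        # Returning False drops the record; everything else passes through.
        return not is_route_noise


# asyncio logs through the logger named "asyncio", as seen in "ERROR:asyncio:".
logging.getLogger("asyncio").addFilter(PendingRouteTaskFilter())
```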

If you have any suggestions, feel free to share!