requests-cache / aiohttp-client-cache

An async persistent cache for aiohttp requests
MIT License
122 stars 22 forks source link

Close SQLite connection if session is deleted and thread is still running #189

Closed JWCook closed 1 year ago

JWCook commented 1 year ago

Here is one possible solution for #187. This works based on the logic of aiosqlite.Connection.run() (implementation of Thread.run()). It can be forced to exit without blocking or scheduling any tasks by setting an internal flag and emptying its task queue. The underlying sqlite3.Connection object will not be closed (which is a blocking operation), but in my experience this doesn't cause any problems.

This will only be used in the case where the SQLiteCache object is being garbage-collected and the connection has not yet been closed implicitly (via contextmanager exit) or explicitly (via close()).

This does have the potential for data loss, if somehow the session or cache objects are deleted while async tasks have been created but not awaited. This seems unlikely to happen in practice, and I think it's a reasonable tradeoff, but it does leave some room for possible future improvement.

codecov[bot] commented 1 year ago

Codecov Report

All modified lines are covered by tests :white_check_mark:

Comparison is base (d72bafa) 97.22% compared to head (52f3442) 97.23%. Report is 1 commits behind head on main.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #189 +/- ## ========================================== + Coverage 97.22% 97.23% +0.01% ========================================== Files 10 10 Lines 937 942 +5 Branches 172 173 +1 ========================================== + Hits 911 916 +5 Misses 15 15 Partials 11 11 ``` | [Files](https://app.codecov.io/gh/requests-cache/aiohttp-client-cache/pull/189?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=requests-cache) | Coverage Δ | | |---|---|---| | [aiohttp\_client\_cache/backends/sqlite.py](https://app.codecov.io/gh/requests-cache/aiohttp-client-cache/pull/189?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=requests-cache#diff-YWlvaHR0cF9jbGllbnRfY2FjaGUvYmFja2VuZHMvc3FsaXRlLnB5) | `99.13% <100.00%> (+0.03%)` | :arrow_up: |

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

aaraney commented 1 year ago

Hey @JWCook! I think this solves most of #187. The thread can still hang if the process receives a SigInt (ctrl + c), but I am not sure how much can be done to avoid that.

I pulled the branch and ran the sqlite integration tests locally and pytest showing a RuntimeWarning being thrown in test_concurrent_bulk_commit. This looks like it has to do with the interaction between the async context manager, self.init_cache(), and bulk_commit_items coroutine which captures the context manager. See below:

pytest test/integration/test_sqlite.py

test/integration/test_sqlite.py::TestSQLiteCache::test_concurrent_bulk_commit
  /home/test/aiohttp-client-cache/sqlite-nocontext-autoclose/aiohttp_client_cache/backends/sqlite.py:116: RuntimeWarning: coroutine 'AsyncMockMixin._execute_mock_call' was never awaited
    self._connection._tx.queue.clear()
  Enable tracemalloc to get traceback where the object was allocated.
  See https://docs.pytest.org/en/stable/how-to/capture-warnings.html#resource-warnings for more info.

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

At least on my machine, wrapping self.init_cache() and bulk_commit_items in a async context manager resolved the issue.

@@ -53,17 +53,24 @@ class TestSQLiteCache(BaseStorageTest):
         mock_connection = AsyncMock()
         mock_sqlite.connect = AsyncMock(return_value=mock_connection)

-        async with self.init_cache() as cache:
+        from contextlib import asynccontextmanager
+
+        @asynccontextmanager
+        async def bulk_commit_ctx():
+            async with self.init_cache() as cache:
+
+                async def bulk_commit_items(n_items):
+                    async with cache.bulk_commit():
+                        for i in range(n_items):
+                            await cache.write(f'key_{n_items}_{i}', f'value_{i}')
+                yield bulk_commit_items

-            async def bulk_commit_items(n_items):
-                async with cache.bulk_commit():
-                    for i in range(n_items):
-                        await cache.write(f'key_{n_items}_{i}', f'value_{i}')
+        async with bulk_commit_ctx() as bulk_commit_items:

-        assert mock_connection.commit.call_count == 1
-        tasks = [asyncio.create_task(bulk_commit_items(n)) for n in [10, 100, 1000, 10000]]
-        await asyncio.gather(*tasks)
-        assert mock_connection.commit.call_count == 5
+            assert mock_connection.commit.call_count == 1
+            tasks = [asyncio.create_task(bulk_commit_items(n)) for n in [10, 100, 1000, 10000]]
+            await asyncio.gather(*tasks)
+            assert mock_connection.commit.call_count == 5

While resolving the above warning, I ended up running pytest with the -s flag to print stdout and noticed in the log output two instances of ERROR:asyncio:Unclosed client session. I figured running pytest with tracemalloc enabled will show where this is occurring. I will look into this later this afternoon and report back.

aaraney commented 1 year ago

While resolving the above warning, I ended up running pytest with the -s flag to print stdout and noticed in the log output two instances of ERROR:asyncio:Unclosed client session. I figured running pytest with tracemalloc enabled will show where this is occurring. I will look into this later this afternoon and report back.

I should have read more closely when reviewing the tests, I now see the comment in the test_without_contextmanager test:

    async def test_without_contextmanager(self):
        """Test that the cache backend can be safely used without the CachedSession contextmanager.
        An "unclosed ClientSession" warning is expected here, however.
JWCook commented 1 year ago

The thread can still hang if the process receives a SigInt (ctrl + c), but I am not sure how much can be done to avoid that.

I was thinking about that as well. Options would include handling KeyboardInterrupt in CachedSession.__aenter__(), or adding a custom handler for SIGINT. I would prefer to avoid adding that until that specific case becomes an issue for you (or someone else).

pytest showing a RuntimeWarning being thrown in test_concurrent_bulk_commit

I wasn't able to reproduce this locally, but this makes sense, so I applied your patch. Thanks for the suggestion!