aio-libs / aiocache

Asyncio cache manager for redis, memcached and memory
http://aiocache.readthedocs.io
BSD 3-Clause "New" or "Revised" License
1.13k stars 154 forks source link

Flakiness: TestRedLock.test_locking_dogpile[memcached_cache] #587

Open Dreamsorcerer opened 1 year ago

Dreamsorcerer commented 1 year ago

Often getting this test failure in the CI (only with the memcached backend):

______________ TestRedLock.test_locking_dogpile[memcached_cache] _______________

self = <tests.acceptance.test_lock.TestRedLock object at 0x7fcf1b4a76d0>
mocker = <pytest_mock.plugin.MockerFixture object at 0x7fcf1a492490>
cache = MemcachedCache (127.0.0.1:11211)

    async def test_locking_dogpile(self, mocker, cache):
        mocker.spy(cache, "get")
        mocker.spy(cache, "set")
        mocker.spy(cache, "_add")

        async def dummy():
            res = await cache.get(Keys.KEY)
            if res is not None:
                return res

            async with RedLock(cache, Keys.KEY, lease=5):
                res = await cache.get(Keys.KEY)
                if res is not None:
                    return res
                await asyncio.sleep(0.1)
                await cache.set(Keys.KEY, "value")

        await asyncio.gather(dummy(), dummy(), dummy(), dummy())
        assert cache._add.call_count == 4
        assert cache.get.call_count == 8
>       assert cache.set.call_count == 1, cache.set.call_args_list
E       AssertionError: [call(<Keys.KEY: 'key'>, 'value'), call(<Keys.KEY: 'key'>, 'value')]
E       assert 2 == 1
E        +  where 2 = <function set at 0x7fcf1a4af700>.call_count
E        +    where <function set at 0x7fcf1a4af700> = MemcachedCache (127.0.0.1:11211).set

tests/acceptance/test_lock.py:49: AssertionError

It seems like the most likely cause to me is that on some race condition 2 calls to client.add() succeed, resulting in 2 coroutines entering the lock at the same time. In theory, only one call should be possible to succeed, the other should error because the key already exists. If this is caused by the memcached backend, then maybe we need to do something to work around it and provide a better guarantee that the lock works as expected.

Dreamsorcerer commented 1 year ago

test_cached_stampede[memcached] is also flaky. I suspect it's the same issue.

antont commented 1 year ago

Do you think this is a problem with memcached only, or would happen with redis as well? Am considering using the lib with redis.

Dreamsorcerer commented 1 year ago

redis tests aren't failing randomly, only memcached is causing issues.

antont commented 1 year ago

redis tests aren't failing randomly, only memcached is causing issues.

Ok thanks, I now see that you did mention "only with the memcached backend" already right at start, but maybe it didn't hurt to confirm.

I made a branch yesterday where copied the dogpile solution from this test to our cache decorator, am planning to put it to production soon, after some more testing.