element-hq / synapse

Synapse: Matrix homeserver written in Python/Twisted.
https://element-hq.github.io/synapse
GNU Affero General Public License v3.0
1.53k stars 186 forks source link

@cachedList can insert None instead of absence of a key #17726

Open kegsay opened 1 month ago

kegsay commented 1 month ago

The docs for this state:

Note that any missing values are cached as None.

This is not semantically the same as absence of a key, and creates very subtle failure modes. I just spent an hour or two trying to figure out how a function could possibly set None and well, it can't. It turns out the cache is doing it because some rooms don't have entries in the DB. The function in question is https://github.com/element-hq/synapse/blob/51dd4df0a317330d0679e48d7a6dcd5abb054ec7/synapse/storage/databases/main/stream.py#L1495

Unfortunately, the caller of this function never did a None check causing runtime failures: see https://github.com/element-hq/element-x-ios/issues/3300#issuecomment-2358222041

It is absolutely not clear that adding a cache decorator would change the semantics like this. I suggest we double check every caller of these functions to make sure they None check before usage.

synapse/storage/databases/main/end_to_end_keys.py:855:    @cachedList(
synapse/storage/databases/main/end_to_end_keys.py-856-        cached_method_name="_get_bare_e2e_cross_signing_keys",
--
synapse/storage/databases/main/receipts.py:441:    @cachedList(
synapse/storage/databases/main/receipts.py-442-        cached_method_name="_get_linearized_receipts_for_room",
--
synapse/storage/databases/main/roommember.py:199:    @cachedList(
synapse/storage/databases/main/roommember.py-200-        cached_method_name="get_user_in_room_with_profile", list_name="user_ids"
--
synapse/storage/databases/main/roommember.py:734:    @cachedList(
synapse/storage/databases/main/roommember.py-735-        cached_method_name="get_rooms_for_user",
--
synapse/storage/databases/main/roommember.py:793:    @cachedList(
synapse/storage/databases/main/roommember.py-794-        cached_method_name="does_pair_of_users_share_a_room", list_name="other_user_ids"
--
synapse/storage/databases/main/roommember.py:942:    @cachedList(
synapse/storage/databases/main/roommember.py-943-        cached_method_name="_get_user_id_from_membership_event_id",
--
synapse/storage/databases/main/roommember.py:1284:    @cachedList(
synapse/storage/databases/main/roommember.py-1285-        cached_method_name="_get_membership_from_event_id", list_name="member_event_ids"
--
synapse/storage/databases/main/transactions.py:211:    @cachedList(
synapse/storage/databases/main/transactions.py-212-        cached_method_name="get_destination_retry_timings", list_name="destinations"
--
synapse/storage/databases/main/relations.py:477:    @cachedList(cached_method_name="get_references_for_event", list_name="event_ids")
synapse/storage/databases/main/relations.py-478-    async def get_references_for_events(
--
synapse/storage/databases/main/relations.py:532:    @cachedList(cached_method_name="get_applicable_edit", list_name="event_ids")  # type: ignore[synapse-@cached-mutable]
synapse/storage/databases/main/relations.py-533-    async def get_applicable_edits(
--
synapse/storage/databases/main/relations.py:619:    @cachedList(cached_method_name="get_thread_summary", list_name="event_ids")  # type: ignore[synapse-@cached-mutable]
synapse/storage/databases/main/relations.py-620-    async def get_thread_summaries(
--
synapse/storage/databases/main/relations.py:793:    @cachedList(cached_method_name="get_thread_participated", list_name="event_ids")
synapse/storage/databases/main/relations.py-794-    async def get_threads_participated(
--
synapse/storage/databases/main/keys.py:137:    @cachedList(
synapse/storage/databases/main/keys.py-138-        cached_method_name="_get_server_keys_json", list_name="server_name_and_key_ids"
--
synapse/storage/databases/main/keys.py:207:    @cachedList(
synapse/storage/databases/main/keys.py-208-        cached_method_name="get_server_key_json_for_remote", list_name="key_ids"
--
synapse/storage/databases/main/presence.py:253:    @cachedList(
synapse/storage/databases/main/presence.py-254-        cached_method_name="_get_presence_for_user",
--
synapse/storage/databases/main/events_worker.py:1624:        # @cachedList chomps lots of memory if you call it with a big list, so
synapse/storage/databases/main/events_worker.py-1625-        # we break it down. However, each batch requires its own index scan, so we make
--
synapse/storage/databases/main/events_worker.py:1639:    @cachedList(cached_method_name="have_seen_event", list_name="event_ids")
synapse/storage/databases/main/events_worker.py-1640-    async def _have_seen_events_dict(
--
synapse/storage/databases/main/events_worker.py:2329:    @cachedList(cached_method_name="is_partial_state_event", list_name="event_ids")
synapse/storage/databases/main/events_worker.py-2330-    async def get_partial_state_events(
--
synapse/storage/databases/main/events_worker.py:2352:        # convert the result to a dict, to make @cachedList work
synapse/storage/databases/main/events_worker.py-2353-        partial = {r[0] for r in result}
--
synapse/storage/databases/main/stream.py:1495:    @cachedList(cached_method_name="_get_max_event_pos", list_name="room_ids")
synapse/storage/databases/main/stream.py-1496-    async def _bulk_get_max_event_pos(
--
synapse/storage/databases/main/signatures.py:41:    @cachedList(
synapse/storage/databases/main/signatures.py-42-        cached_method_name="get_event_reference_hash", list_name="event_ids", num_args=1
--
synapse/storage/databases/main/push_rule.py:262:    @cachedList(cached_method_name="get_push_rules_for_user", list_name="user_ids")
synapse/storage/databases/main/push_rule.py-263-    async def bulk_get_push_rules(
--
synapse/storage/databases/main/devices.py:1076:    @cachedList(
synapse/storage/databases/main/devices.py-1077-        cached_method_name="get_device_list_last_stream_id_for_remote",
--
synapse/storage/databases/main/room.py:1361:    @cachedList(cached_method_name="is_partial_state_room", list_name="room_ids")
synapse/storage/databases/main/room.py-1362-    async def is_partial_state_room_batched(
--
synapse/storage/databases/main/state.py:314:    @cachedList(cached_method_name="get_room_type", list_name="room_ids")
synapse/storage/databases/main/state.py-315-    async def bulk_get_room_type(
--
synapse/storage/databases/main/state.py:393:    @cachedList(cached_method_name="get_room_encryption", list_name="room_ids")
synapse/storage/databases/main/state.py-394-    async def bulk_get_room_encryption(
--
synapse/storage/databases/main/state.py:605:    @cachedList(
synapse/storage/databases/main/state.py-606-        cached_method_name="_get_state_group_for_event",
--
synapse/storage/databases/main/user_erasure_store.py:48:    @cachedList(cached_method_name="is_user_erased", list_name="user_ids")
synapse/storage/databases/main/user_erasure_store.py-49-    async def are_users_erased(self, user_ids: Iterable[str]) -> Mapping[str, bool]:
clokep commented 1 month ago

It could probably use a sentinel instead of None to know if it uncached?

clokep commented 1 month ago

Maybe the mypy plugin is missing this coercion?

clokep commented 1 month ago

https://github.com/element-hq/synapse/blob/51dd4df0a317330d0679e48d7a6dcd5abb054ec7/scripts-dev/mypy_synapse_plugin.py#L51-L57C20

That should probably handle cachedList too?

MadLittleMods commented 1 month ago

It could probably use a sentinel instead of None to know if it uncached?

That's what I had to do for bulk_get_room_type and bulk_get_room_encryption in https://github.com/element-hq/synapse/pull/17450

Related discussion: https://github.com/element-hq/synapse/pull/17450#discussion_r1678786409

erikjohnston commented 1 month ago

I think the historical context for this is that it was very much expected that a value was returned for every key, i.e. the bulk fetch function should return the exact same thing as just calling the non-bulk function in a loop. The common case for store functions that couldn't find a row is to return None, and so that is likely why we treated missing values as None.

However, there are now quite a few functions that don't actually have a non-bulk equivalent, and so it could make sense for those functions not to return everything.

Agreed this is a footgun, and we probably should make @cachedList error if the function doesn't return an entry for each item. I'm not super keen on adding sentinel values in there by default, as it breaks the idea that @cachedList should behave like calling @cached function in a loop