Traewelling / traewelling

Free check-in service to log your public transit journeys
https://traewelling.de
GNU Affero General Public License v3.0
224 stars 43 forks source link

[SlowQuery] Deletion of notification #2586

Open MrKrisKrisu opened 1 month ago

MrKrisKrisu commented 1 month ago

Describe the bug

The selective deletion of notifications from the database is very slow, as we work with JSON search parameters here. Another solution must be found here.

Steps to reproduce

# User@Host: xxx[xxx] @ localhost []
# Thread_id: 54900  Schema: traewelling  QC_hit: No
# Query_time: 0.475209  Lock_time: 0.000010  Rows_sent: 0  Rows_examined: 437294
# Rows_affected: 0  Bytes_sent: 0
#
# explain: id   select_type table   type    possible_keys   key key_len ref rows    r_rows  filtered    r_filtered  Extra
# explain: 1    SIMPLE  notifications   ALL NULL    NULL    NULL    NULL    452297  437294.00   100.00  0.00    Using where
#
SET timestamp=xxx;
delete from `notifications` where `type` = 'App\\Notifications\\UserJoinedConnection' and json_unquote(json_extract(`data`, '$."status"."id"')) = xxx;
# User@Host: xxx[xxx] @ localhost []
# Thread_id: 54932  Schema: traewelling  QC_hit: No
# Query_time: 1.955730  Lock_time: 0.000015  Rows_sent: 0  Rows_examined: 437294
# Rows_affected: 0  Bytes_sent: 0
#
# explain: id   select_type table   type    possible_keys   key key_len ref rows    r_rows  filtered    r_filtered  Extra
# explain: 1    SIMPLE  notifications   ALL NULL    NULL    NULL    NULL    452297  437294.00   100.00  0.00    Using where
#
SET timestamp=xxx;
delete from `notifications` where `type` = 'App\\Notifications\\StatusLiked' and json_unquote(json_extract(`data`, '$."status"."id"')) = xxx;

https://github.com/Traewelling/traewelling/blob/eca479b3e1b0975cd0f650b65b1dc092204742ee/app/Observers/FollowObserver.php#L14-L17

HerrLevin commented 1 month ago

What if we just keep the notifications?

MrKrisKrisu commented 1 month ago

What if we just keep the notifications?

This would not be GDPR compliant, as we would also delete notifications from users who delete their account.

HerrLevin commented 1 month ago

Ah. Right. I forgot about that.

We have a few options to improve this slow query. (Just spitting ideas:)

what do you think?

jeyemwey commented 1 month ago

Reduce cost of statement by removing json-queries. We could just slap a LIKE "%statusID/userID% in there

We could even merge both of those where clauses with an AND, but I'm not sure if LIKE wouldn't just also do a full scan which would lock and create similar issues.