google / exposure-notifications-server

Exposure Notification Reference Server | Covid-19 Exposure Notifications
https://www.google.com/covid19/exposurenotifications/
Apache License 2.0
2.39k stars 306 forks source link

Possible dead lock in the cache package? #1614

Closed jlory closed 1 year ago

jlory commented 1 year ago

Looking at the cache implementation I'm wondering if the code could dead lock in certain conditions.

We aquire a read lock there: https://github.com/google/exposure-notifications-server/blob/90c1deec6cad8694da073e18fa7705c82bad9068/pkg/cache/cache.go#L177

Then we call the purge expired in a goroutine, the purge expired aquires a rw lock : https://github.com/google/exposure-notifications-server/blob/90c1deec6cad8694da073e18fa7705c82bad9068/pkg/cache/cache.go#L204

The code currently work because the runtime schedules purgeExpired goroutine after the defer Unlock from Lookup , but if the goroutine would run before the defer the rw lock would dead lock.

Does it makes sense or am I missing something? is it the reason that it's running in a goroutine actually? so that the we don't dead lock and so the defer unlock executes?

Edit: after some thoughts I think that's the reason why we have a goroutine for purgeExpired, it waits until the defer is executed. Without the goroutine it would be a complete dead lock. Guess you can close that issue hehe.

sethvargo commented 1 year ago

Hi @jlory

I don't believe there's a deadlock here. The expiration happens in a goroutine because we don't want to block the main process from executing while an older item is purged from the cache.

If it wasn't in a goroutine, it would deadlock, but the goroutine doesn't exist to prevent a deadlock (it exists to optimize the way this particular cache is used).