patsonluk / airline

Apache License 2.0
109 stars 71 forks source link

Airport loading deadlock #548

Closed patsonluk closed 1 year ago

patsonluk commented 1 year ago

Due to lazy loading of income and campaigns which airports can have cyclic references:

Found one Java-level deadlock:

"application-akka.actor.default-dispatcher-2": waiting to lock monitor 0x00007fd7ea60e700 (object 0x00000006c36ac070, a com.patson.model.Airport), which is held by "application-akka.actor.default-dispatcher-16" "application-akka.actor.default-dispatcher-16": waiting to lock monitor 0x00007fd79c4a1400 (object 0x00000006c3663278, a com.patson.model.Airport), which is held by "application-akka.actor.default-dispatcher-2"

Java stack information for the threads listed above:

"application-akka.actor.default-dispatcher-2": at com.patson.model.Airport.income$lzycompute(Airport.scala:61)

Found one Java-level deadlock:

"application-akka.actor.default-dispatcher-3": waiting to lock monitor 0x00007fd7ac8e2700 (object 0x00000006d23a5158, a com.patson.model.Airport), which is held by "application-akka.actor.default-dispatcher-18" "application-akka.actor.default-dispatcher-18": waiting to lock monitor 0x00007fd7982a8700 (object 0x00000006cdffecb8, a com.patson.model.Airport), which is held by "application-akka.actor.default-dispatcher-3"

Java stack information for the threads listed above:

"application-akka.actor.default-dispatcher-3": at com.patson.model.Airport.income$lzycompute(Airport.scala:61)

Found 2 deadlocks.

patsonluk commented 1 year ago

The current workaround https://github.com/patsonluk/airline/commit/402575db44112913412e8fd256d0b9139fe70350

A more complete fix might be either breaking the cyclic relationship (airport -> campaign in range -> principle airport) or changing how lazy loading works for Airport - perhaps introduce a new mechanism that lazy load w/o locking the whole class and only on the lazy loaded variable. This does NOT avoid all deadlocks by loading though

Lazy loading cause deadlock is a complicated issue, perhaps we will engineer such solution if we have more deadlocks due to lazy loading. For now the workaround is probably good enough