derekkraan / horde

Horde is a distributed Supervisor and Registry backed by DeltaCrdt
MIT License

Fix crashes related to missing child ids in lookup table #266

Closed arjan closed 9 months ago

arjan commented 1 year ago

I noticed that when a cluster is restarting, Horde tends to crash because some of its internal bookkeeping is incorrect. The child lookup in the linked ETS table can fail, but this was not handled everywhere in the Dynamic Supervisor implementation.

This fix addresses three places that crashed because a nil result from the get_item / pop_item calls was not handled.

Note that there were several other places where this was already addressed.
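
A minimal sketch of the kind of nil handling described above. This is not the actual diff: the module name, the table layout, and the signatures of get_item / pop_item are assumptions made for illustration, and only the `:ets` calls are real OTP functions.

```elixir
defmodule ChildLookupSketch do
  @moduledoc false
  # Hypothetical stand-in for a supervisor's child lookup table.
  # None of this is Horde's real code; names and shapes are assumed.

  # Create an anonymous-access ETS table keyed by child id.
  def new_table, do: :ets.new(:children_sketch, [:set, :public])

  # Store a {pid, child_spec} tuple under the child id.
  def put_item(table, child_id, child), do: :ets.insert(table, {child_id, child})

  # Return the stored child, or nil when the id is not in the table.
  def get_item(table, child_id) do
    case :ets.lookup(table, child_id) do
      [{^child_id, child}] -> child
      [] -> nil
    end
  end

  # Remove and return the stored child, or nil when the id is missing.
  def pop_item(table, child_id) do
    case :ets.take(table, child_id) do
      [{^child_id, child}] -> child
      [] -> nil
    end
  end

  # The caller matches on nil explicitly. A bare `{pid, _spec} = pop_item(...)`
  # would raise a MatchError when the id is missing, for example while the
  # cluster is restarting.
  def terminate_child(table, child_id) do
    case pop_item(table, child_id) do
      nil -> {:error, :not_found}
      {pid, _spec} -> {:ok, pid}
    end
  end
end
```

With this shape, a second `terminate_child/2` call for the same id returns `{:error, :not_found}` instead of crashing the calling process.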

kinson commented 10 months ago

@arjan thanks for creating this PR; we're running into a similar issue.

@derekkraan is there something I can do to help get this PR to a point where you feel comfortable merging it?

derekkraan commented 9 months ago

@kinson I need to find time to look at this PR; there is nothing you can do at the moment to speed things up.

arjan commented 9 months ago

Fwiw I have been running this in production for quite some time now without noticeable issues...

https://github.com/botsquad/horde/tree/integration

derekkraan commented 9 months ago

Thanks @arjan for this and for the other PR. As usual, apologies for taking so long to get around to merging them. I suppose a 0.9 release is in order?

arjan commented 9 months ago

Not sure; technically these are bugfixes, I guess?

derekkraan commented 9 months ago

These are, but there are breaking changes in master that necessitate a 0.9.

kinson commented 9 months ago

> As usual, apologies for taking so long to get around to merging them.

@derekkraan no problem, thanks for merging and releasing this 🚀