rabbitmq / ra

A Raft implementation for Erlang and Elixir that strives to be efficient and make it easier to use multiple Raft clusters in a single system.

function_clause in terminate when ra_server_proc async init exits #398

Closed gomoripeti closed 2 months ago

gomoripeti commented 9 months ago

Describe the bug

When init is performed asynchronously (https://github.com/rabbitmq/ra/blob/main/src/ra_server_proc.erl#L265), the gen_statem data holds a Config map instead of a #state{} record. terminate/3 does not expect this, so it fails when post_init/do_init throws an exception. We've seen the crash below with ra 2.6.3:

errorContext: child_terminated
reason: {function_clause,
         [{ra_server_proc,terminate,
           [{corrupt_log,gap_between_snapshot_and_first_index,
             {3555151,3555327}},
            post_init,
            #{await_condition_timeout => 30000,broadcast_time => 100,
...

Not sure if this is just a cosmetic issue (the server_proc would terminate anyway) or if some cleanup is missed because terminate does not run to completion.

Reproduction steps

Not sure how to manually trigger an exception (corrupt_log or other). This was seen in production.

Expected behavior

No function_clause error: terminate should handle exceptions raised in the async init scenario.
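
To make the expected behavior concrete, here is a minimal sketch of a gen_statem that defers its initialization and still terminates cleanly when that initialization exits. It is not ra's code and not necessarily how the fix was made: the module name `async_init_demo`, the `fail` and `name` config keys, and the `ready` state are all hypothetical and exist only for illustration.

```erlang
-module(async_init_demo).
-behaviour(gen_statem).

-export([start/1]).
-export([init/1, callback_mode/0, terminate/3]).
-export([post_init/3, ready/3]).

%% Hypothetical state record, standing in for ra_server_proc's #state{}.
-record(state, {name :: atom()}).

%% Start an unlinked process so a failing init does not take the caller down.
start(Config) when is_map(Config) ->
    gen_statem:start(?MODULE, Config, []).

callback_mode() -> state_functions.

%% Async init: return quickly and keep the raw Config map as the data.
init(Config) ->
    {ok, post_init, Config, [{next_event, internal, do_init}]}.

%% The real initialization. Until it succeeds, the data is still a map.
post_init(internal, do_init, #{fail := Reason}) ->
    exit(Reason);
post_init(internal, do_init, #{name := Name}) ->
    {next_state, ready, #state{name = Name}}.

ready(_EventType, _Event, _State) ->
    keep_state_and_data.

%% Normal case: initialization finished, so the data is a #state{} record.
terminate(_Reason, _StateName, #state{}) ->
    ok;
%% Async init failed: the data is still the Config map. Without this
%% clause, terminate/3 itself fails with function_clause.
terminate(_Reason, post_init, Config) when is_map(Config) ->
    ok.
```

Calling `async_init_demo:start(#{fail => some_reason})` should return `{ok, Pid}` and the process should then exit with the original reason. Removing the second `terminate/3` clause should reproduce the kind of function_clause report shown above.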

kjnilsson commented 2 months ago

I think this is fixed here: https://github.com/rabbitmq/ra/blob/main/src/ra_server_proc.erl#L1004

kjnilsson commented 2 months ago

@gomoripeti if not please reopen.

gomoripeti commented 2 months ago

Great, thanks for the heads up

Karl Nilsson wrote (Wed, Apr 24, 2024, 17:22):

@gomoripeti if not please reopen.