Final Release Note
Description
When the replication source server is started with an external filter M program and that filter terminates prematurely, the source server can go into a spin loop in which it repeatedly invokes the read() system call in repl_filter_recv_line(). In this case, read() returns a length of 0 (indicating EOF on the pipe, since the filter program that writes to the pipe is gone), but that return value is not currently handled correctly, so we loop back and call read() again. In some cases (if we have sent a full transaction), we start a timer for $ydb_repl_filter_timeout seconds (default of 64 seconds), and that fortunately terminates the loop. But if we have not yet sent a full transaction (e.g. the pipe is full and the filter needs to read some data before we can send more), no timer is running and the source server loops forever.
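The following is a minimal sketch of the missing check, not the actual repl_filter_recv_line() code. The function name recv_line_sketch and the recv_status codes are hypothetical and exist only for illustration. It shows a read() return value of 0 being treated as "the writer is gone" and reported to the caller, while only EINTR is retried.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical status codes for this sketch; these are not YottaDB names. */
enum recv_status {
    RECV_OK = 0,
    RECV_FILTER_GONE = -1,   /* read() returned 0: EOF, the writer closed the pipe */
    RECV_IO_ERROR = -2,      /* read() failed with something other than EINTR */
    RECV_LINE_TOO_LONG = -3  /* buffer filled up before a newline was seen */
};

/* Read one newline-terminated line from fd into buf (capacity cap).
 * On RECV_OK, *len is the number of bytes stored, including the '\n'. */
enum recv_status recv_line_sketch(int fd, char *buf, size_t cap, size_t *len)
{
    assert(buf != NULL && len != NULL && cap > 0);
    *len = 0;
    while (*len < cap) {
        /* One byte per call keeps the sketch short; real code would buffer. */
        ssize_t n = read(fd, buf + *len, 1);
        if (n > 0) {
            *len += 1;
            if (buf[*len - 1] == '\n')
                return RECV_OK;
            continue;
        }
        if (n == 0)
            return RECV_FILTER_GONE;  /* EOF: retrying would spin forever */
        if (errno == EINTR)
            continue;                 /* interrupted by a signal: retry is safe */
        return RECV_IO_ERROR;
    }
    return RECV_LINE_TOO_LONG;
}
```

In this sketch the caller decides what to do with RECV_FILTER_GONE. In the source server, that would be the point at which an error such as FILTERNOTALIVE is issued instead of looping back to read().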
Draft Release Note
The replication source server correctly issues a FILTERNOTALIVE error when it was started with an external filter (using the -FILTER qualifier) and that filter terminates abnormally. Previously, it was possible for the source server to spin-loop waiting for a response from the long-dead filter program.