Closed fruch closed 10 months ago
I don't know cqlsh and it's not easy to review. Anyway, one question, later in code after
start_processes
self.inmsg
is used inexport_records
, won't it break it?
Those are pipes that have sockets at both ends of it, we close only the sides which are gonna be used by the subprocess, and they are duplicated when the subprocess starts, so it's safe to close them.
I didn't know much about that area either, but after applying this all cqlsh copy dtests we have are passing, including the one that was getting hung from time to time
CopyTask is is using ReceivingChannels and SendingChannels which are a list of pipes created by CopyTask and those pipes are being send to a list of subprocesses so the main task can communicate with them.
in one dtest that delibratily make once of those child process to break and exit, from time to time we see it getting stuck forever.
the reason is the CopyTask process is hanging on
recv
call on one of those pipes, since pipes are copy into the child processes there's one fd open on CopyTask and one fd open on child process when the child process closes the fd,recv()
doesn't raise EOF since there an open fd that might still send in data.so we need to close the local pipes on CopyTask after all child processes are started
Fixes: scylladb/scylla-cqlsh#37