scylladb / scylla-cqlsh

A fork of the cqlsh code
Apache License 2.0
11 stars 29 forks source link

copyutil: closing the local end of pipes after processes starts #49

Closed fruch closed 10 months ago

fruch commented 10 months ago

CopyTask is is using ReceivingChannels and SendingChannels which are a list of pipes created by CopyTask and those pipes are being send to a list of subprocesses so the main task can communicate with them.

in one dtest that delibratily make once of those child process to break and exit, from time to time we see it getting stuck forever.

the reason is the CopyTask process is hanging on recv call on one of those pipes, since pipes are copy into the child processes there's one fd open on CopyTask and one fd open on child process when the child process closes the fd, recv() doesn't raise EOF since there an open fd that might still send in data.

so we need to close the local pipes on CopyTask after all child processes are started

Fixes: scylladb/scylla-cqlsh#37

fruch commented 10 months ago

I don't know cqlsh and it's not easy to review. Anyway, one question, later in code after start_processes self.inmsg is used in export_records, won't it break it?

Those are pipes that have sockets at both ends of it, we close only the sides which are gonna be used by the subprocess, and they are duplicated when the subprocess starts, so it's safe to close them.

I didn't know much about that area either, but after applying this all cqlsh copy dtests we have are passing, including the one that was getting hung from time to time