Open i30817 opened 5 years ago
Can you provide a minimum example that someone else can run that demonstrates the problem you're talking about? (I for one am very confused by your description and suggested solution.)
Additionally, could you show how your problem is mitigated in some other environment? (e.g., via argparse?)
OK, this problem normally arises from using named pipes.
```python
from contextlib import contextmanager
import os
import shutil
import tempfile

@contextmanager
def named_pipes(n=1):
    dirname = tempfile.mkdtemp()
    try:
        paths = [os.path.join(dirname, 'romhack_pipe' + str(i)) for i in range(n)]
        for path in paths:
            os.mkfifo(path)
        yield paths
    finally:
        shutil.rmtree(dirname)
```
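For illustration, here is how the context manager above can be driven (a minimal, POSIX-only sketch, since `os.mkfifo` does not exist on Windows; the context manager is repeated so the snippet runs standalone):

```python
import os
import shutil
import stat
import tempfile
from contextlib import contextmanager

@contextmanager
def named_pipes(n=1):
    # same context manager as above, repeated so this snippet is self-contained
    dirname = tempfile.mkdtemp()
    try:
        paths = [os.path.join(dirname, 'romhack_pipe' + str(i)) for i in range(n)]
        for path in paths:
            os.mkfifo(path)
        yield paths
    finally:
        shutil.rmtree(dirname)

with named_pipes(2) as pipes:
    # inside the block the paths exist and are real FIFOs
    assert all(stat.S_ISFIFO(os.stat(p).st_mode) for p in pipes)
# after the block the whole temporary directory is gone
assert not os.path.exists(os.path.dirname(pipes[0]))
```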
```python
import hashlib
import zlib

def get_checksums():
    hash_md5 = hashlib.md5()
    hash_sha1 = hashlib.sha1()
    hash_crc32 = 0
    size = 0
    buf = yield
    while len(buf) > 0:
        size += len(buf)
        hash_md5.update(buf)
        hash_sha1.update(buf)
        hash_crc32 = zlib.crc32(buf, hash_crc32)
        buf = yield
    crc = '{:08x}'.format(hash_crc32 & 0xffffffff)
    md5 = hash_md5.hexdigest()
    sha1 = hash_sha1.hexdigest()
    yield (size, crc, md5, sha1)
```
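As a quick illustration of the coroutine protocol this relies on: prime with `next()`, feed chunks with `send()`, and a final empty `send()` ends the loop and returns the result tuple (the coroutine is repeated so the snippet runs standalone):

```python
import hashlib
import zlib

def get_checksums():
    # same coroutine as above, repeated so this snippet is self-contained
    hash_md5 = hashlib.md5()
    hash_sha1 = hashlib.sha1()
    hash_crc32 = 0
    size = 0
    buf = yield
    while len(buf) > 0:
        size += len(buf)
        hash_md5.update(buf)
        hash_sha1.update(buf)
        hash_crc32 = zlib.crc32(buf, hash_crc32)
        buf = yield
    crc = '{:08x}'.format(hash_crc32 & 0xffffffff)
    yield (size, crc, hash_md5.hexdigest(), hash_sha1.hexdigest())

g = get_checksums()
next(g)                    # prime: run up to the first `yield`
g.send(b'hello ')          # feed chunks; each send returns None
g.send(b'world')
size, crc, md5, sha1 = g.send(b'')  # empty chunk ends the loop, yields the result
```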
```python
import subprocess

def producer(arguments, generator_function):
    '''Appends an output fifo to the end of the argument list before
    applying the generator function to that fifo. Make sure the command
    is set up so the last argument is treated as an output file.
    '''
    BLOCKSIZE = 2 ** 20
    next(generator_function)
    with named_pipes() as pipes:
        pipe = pipes[0]
        arguments.append(pipe)
        with subprocess.Popen(arguments,
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL) as process:
            with open(pipe, 'rb') as fifo:
                byt = fifo.read(BLOCKSIZE)
                while len(byt) > 0:
                    generator_function.send(byt)
                    byt = fifo.read(BLOCKSIZE)
    # if an error occurred, avoid writing bogus checksums
    # (NonFatalError is this project's own exception class)
    if process.returncode != 0:
        raise NonFatalError('error during patching')
    return generator_function.send([])
```
In the code above, the `with open(pipe, 'rb') as fifo:` line can hang if the subprocess quits with an error without ever having opened the output file; i.e., it never took part in the FIFO protocol, so the caller waits forever. The subprocess just wrote an error to stderr (usually) and went away without touching the 'output file' it was given.
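This failure mode can be seen in isolation (a POSIX-only sketch; it uses `O_NONBLOCK` so we can observe the situation instead of reproducing the hang, and `false` as a stand-in for a patcher that exits with an error without opening its output argument):

```python
import os
import subprocess
import tempfile

with tempfile.TemporaryDirectory() as d:
    pipe = os.path.join(d, 'out')
    os.mkfifo(pipe)
    # 'false' ignores its arguments and exits with status 1 immediately,
    # never opening the FIFO it was handed
    proc = subprocess.Popen(['false', pipe])
    proc.wait()
    # A plain open(pipe, 'rb') would now block forever waiting for a writer.
    # O_NONBLOCK lets a reader attach without one, so we can see the problem:
    fd = os.open(pipe, os.O_RDONLY | os.O_NONBLOCK)
    data = os.read(fd, 4096)  # no writer ever connected: immediate EOF
    os.close(fd)

assert proc.returncode != 0 and data == b''
```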
This can be avoided with a 'trick' using the system's anonymous pipes, which happen to have a 'fake but real' filesystem interface on Linux, and it also gives much better code:
```python
import subprocess

# get_checksums() is unchanged from the first version above

def producer(arguments, generator_function):
    '''Appends an output fifo to the end of the argument list before
    applying the generator function to that fifo. Make sure the command
    is set up so the last argument is treated as an output file.
    '''
    # we read from, and the patchers write to, the error stream, so if
    # there is an error the pipe still gets notified
    arguments.append('/dev/stderr')  # not portable to Windows (maybe with CON, but i'd have to test it)
    next(generator_function)
    with subprocess.Popen(arguments, stdout=subprocess.DEVNULL, stderr=subprocess.PIPE) as process:
        for byt in process.stderr:
            generator_function.send(byt)
    # if an error occurred, avoid writing bogus checksums
    if process.returncode != 0:
        raise NonFatalError('error during patching')
    return generator_function.send([])
```
Notice that the named-pipe context manager vanishes and the producer becomes much simpler.

In this version there is no chance of that other kind of deadlock, because the code makes sure the subprocess's output 'file' is its own stderr. An actual error will corrupt the stream, but we don't want the output in that case anyway; and since stderr is something every process opens, it is closed when the process exits, which signals the reader to stop waiting and inspect the return value.
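A minimal sketch of why the stderr approach cannot hang: the child's stderr pipe reaches EOF when the child exits, even if it never wrote a single byte (using `sh -c 'exit 3'` as a stand-in for a patcher that dies without producing output):

```python
import subprocess

# a child that exits with an error without producing any output at all
proc = subprocess.Popen(['sh', '-c', 'exit 3'],
                        stdout=subprocess.DEVNULL,
                        stderr=subprocess.PIPE)
# the OS closes the pipe's write end when the child exits, so this loop
# reaches EOF instead of blocking forever like the FIFO open() did
chunks = [byt for byt in proc.stderr]
proc.wait()

assert chunks == [] and proc.returncode == 3
```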
Anyway, I was thinking that encouraging people who write CLI apps to receive an already-opened file handle, like argparse's `parser.add_argument('-o', metavar=('output-file'), type=FileType('a+', encoding='utf-8'))` appears to do, would increase the number of apps that can work with named FIFOs, by making the easy path be having an open file already, one that gets closed on process exit.
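A sketch of that argparse pattern: `FileType` opens the file during `parse_args()`, so the app's logic receives an already-open handle and never has to remember to open (and eventually close) the output path itself, which is exactly the behavior a FIFO writer needs (a regular temp file stands in for the FIFO here):

```python
import argparse
import os
import tempfile

parser = argparse.ArgumentParser()
parser.add_argument('-o', metavar='output-file',
                    type=argparse.FileType('a+', encoding='utf-8'))

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, 'out.txt')
    args = parser.parse_args(['-o', path])
    # the file is already open by the time the app's own code runs
    assert not args.o.closed
    args.o.write('done\n')
    args.o.close()
    with open(path, encoding='utf-8') as f:
        content = f.read()
```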
I don't know what happens in the case where the 'destination' is overwritten by a move at the end (this is common for programs that try to support using the same file as both input and output). Probably a failure of the 'stream', but not a hang.
One thing I thought was a problem but actually wasn't: I thought mkfifo was portable (it isn't, so one might as well use /dev/stderr). There isn't actually a portable way to do this kind of 'almost subprocess-code-agnostic' interprocess communication on both Windows and Linux/Unix, so it's kind of discouraging.
I can't count the number of times I've had this sequence.
I was thinking the WG could 'encourage' this problem not to happen by providing an argparse equivalent where the easiest way to code a CLI app ends with output files 'already open when you get them', and thus closed automatically on process exit. Does this make sense?