pyMBE-dev / pyMBE

pyMBE provides tools to facilitate building up molecules with complex architectures in the Molecular Dynamics software ESPResSo. For an up-to-date API documention please check our website:
https://pymbe-dev.github.io/pyMBE/pyMBE.html
GNU General Public License v3.0
6 stars 8 forks source link

Avoid directly calling the shell from Python scripts #16

Closed jngrad closed 4 months ago

jngrad commented 5 months ago

I would strongly recommend moving away from os.system() (which directly exposes the shell), in favor of safer and more interoperable alternatives, namely subprocess.check_output() (shell=False is default), tempfile.TemporaryDirectory(), os.makedirs(), shutil.move(), glob.glob().

Rules for escaping characters with special meaning in paths (e.g. whitespace, slashes) are cumbersome and easy to forget; subprocess does all that for us. When interrupting a Python script that calls os.system() in a loop (e.g. the testsuite), the interrupt signal is forwarded to the subprocess without affecting the main process; hence the subprocess is interrupted but the main process continues iterating the loop. With subprocess.check_output(), the signal interrupts the subprocess and the main process (via a KeyboardInterrupt that can be caught and gracefully handled by a try ... except clause). When a shell command fails, os.system() silently ignores the issue: it returns an OS-specific error code, and Python continues executing the code; here is a somewhat contrived example:

import os
import subprocess
print("os.system()")
os.system("ls important_file.bak") # is there a backup file?
print("rm important_file")         # oops!
print("subprocess.check_output()")
subprocess.check_output(["ls", "important_file.bak"])
print("rm important_file")         # will never execute without a backup file

Output:

ls: cannot access 'important_file.bak': No such file or directory
rm important_file
ls: cannot access 'important_file.bak': No such file or directory
Traceback (most recent call last):
  File "/home/user/mwe.py", line 5, in <module>
    subprocess.check_output(["ls", "important_file.bak"])
  File "/usr/lib/python3.10/subprocess.py", line 421, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ls', 'important_file.bak']' returned non-zero exit status 2.

Note that subprocess.run() won't raise an exception: it returns an object that wraps the exception and provides more info, such as the error code and the command line arguments, to help the user diagnose the issue. Also, some of the rm path/* commands can probably be replaced with a temporary folder context manager, which is automatically cleaned up when leaving the context manager.

pm-blanco commented 4 months ago

@jngrad I have checked the current main and I think that we have got rid of all calls to os.system in our library during the last PRs. I will inforce the use of subprocess from now on in new developments. If you agree, I would just close this issue.

pm-blanco commented 4 months ago

@jngrad I moved your explanation to the newly funded pyMBE wiki to avoid this knowledge getting lost in the future wiki