Parsl / parsl

Parsl - a Python parallel scripting library
http://parsl-project.org
Apache License 2.0

serialised globus exception does not deserialise correctly: `__init__() missing 1 required positional argument: 'exc'` #785

Open benclifford opened 5 years ago

benclifford commented 5 years ago

With commit 74df7017a09a18d05bb5a39fdf57ae76303691af on master:

2019-02-23 00:25:41 parsl.data_provider.globus:149 [WARNING]  Non-critical Globus Transfer error event for globus://005fe85c-36da-11e9-9838-0262a1f2f698/home/benc/parsl/workdir2/sorted.txt: "file not found" at 2019-02-23 00:25:04+00:00. Retrying...
2019-02-23 00:25:41 parsl.data_provider.globus:149 [WARNING]  Non-critical Globus Transfer error event for globus://005fe85c-36da-11e9-9838-0262a1f2f698/home/benc/parsl/workdir2/sorted.txt: "file not found" at 2019-02-23 00:25:04+00:00. Retrying...
2019-02-23 00:25:41 parsl.data_provider.globus:150 [DEBUG]  Globus Transfer error details: {
  "context": [
    {
      "operation": "Directory List / File Scan",
      "path": "/home/benc/parsl/workdir2/sorted.txt"
    }
  ],
  "error": {
    "body": "550-GlobusError: v=1 c=PATH_NOT_FOUND\\r\\n550-GridFTP-Errno: 2\\r\\n550-GridFTP-Reason: System error in stat\\r\\n550-GridFTP-Error-String: No such file or directory\\r\\n550 End.\\r\\n",
    "code": 550,
    "endpoint": "my laptop 2nd work location (005fe85c-36da-11e9-9838-0262a1f2f698)",
    "server": "gsiftp://172.20.44.170:37028",
    "type": "FTPServerError"
  }
}
2019-02-23 08:54:02 parsl.dataflow.dflow:261 [ERROR]  Task 2 failed
Traceback (most recent call last):
  File "/home/benc/parsl/src/parsl/parsl/dataflow/dflow.py", line 258, in handle_exec_update
    res.reraise()
  File "/home/benc/parsl/src/parsl/parsl/app/errors.py", line 149, in reraise
    reraise(dill.loads(self.e_type), dill.loads(self.e_value), self.e_traceback.as_traceback())
  File "/home/benc/parsl/virtualenv/lib/python3.5/site-packages/dill/_dill.py", line 317, in loads
    return load(file, ignore)
  File "/home/benc/parsl/virtualenv/lib/python3.5/site-packages/dill/_dill.py", line 305, in load
    obj = pik.load()
TypeError: __init__() missing 1 required positional argument: 'exc'
2019-02-23 08:54:02 parsl.dataflow.dflow:285 [INFO]  Task 2 failed after 0 retry attempts
benclifford commented 5 years ago

This looks like the same problem as in #548, this time for the globus SDK class globus_sdk.exc.GlobusTimeoutError, which at some point calls a superclass __init__ in a way that breaks unpickling: see the description of #548 and https://bugs.python.org/issue1692335#msg52351

The solution for #548 was to remove the offending call.

The solution for this might be the same, or not - depending on whether globus_sdk people believe that their exceptions should support pickling.
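
A minimal sketch of the failure mode, assuming it matches what happens here: unpickling an exception rebuilds it as cls(*args), and calling the superclass __init__ with fewer arguments than the subclass requires overwrites args with the shorter tuple. BrokenError, FixedError and rebuild are hypothetical names for illustration, not globus_sdk or parsl code, and rebuild only imitates the load step through __reduce__ rather than going through dill.

```python
class BrokenError(Exception):
    """Hypothetical exception: forwards only `msg` to the superclass."""

    def __init__(self, msg, exc):
        super().__init__(msg)  # overwrites self.args with (msg,)
        self.exc = exc


class FixedError(Exception):
    """Same signature, but without the superclass __init__ call."""

    def __init__(self, msg, exc):
        # BaseException.__new__ has already stored args == (msg, exc)
        self.exc = exc


def rebuild(err):
    """Imitate unpickling: call cls(*args), then restore __dict__ if present."""
    cls, args, *rest = err.__reduce__()
    clone = cls(*args)
    if rest and rest[0]:
        clone.__dict__.update(rest[0])
    return clone


inner = ValueError("inner")

try:
    rebuild(BrokenError("timed out", inner))
    broken_result = "rebuilt without error"
except TypeError as e:
    # Same kind of TypeError as in the traceback above
    broken_result = str(e)

fixed = rebuild(FixedError("timed out", inner))

print(broken_result)
print(fixed.args)
```

Removing the superclass call, as in FixedError, leaves args matching the constructor signature, which is the shape of the fix described for #548.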

sirosen commented 2 years ago

> The solution for this might be the same, or not - depending on whether globus_sdk people believe that their exceptions should support pickling.

I've only just learned about this issue, but I'm afraid that pickling the SDK errors and responses is tricky business. In each case, there may be an underlying requests object, which we want to preserve if possible. requests.Response does not pickle cleanly, and other objects may also cause trouble.

I've done some work in this space in the distant past, and I'm not sure we're prepared to fully guarantee that an unpickled response will act the same as the original. But we can take steps to try to make sure that the object can make it through -- even if we need to do things like setting self._response = None. I'll think about this more.
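
As a rough sketch of the "set the response to None" idea, assuming a made-up exception class rather than the real SDK ones: HypotheticalAPIError and its _response attribute are illustrative only, and a threading.Lock stands in for an object that does not pickle.

```python
import pickle
import threading


class HypotheticalAPIError(Exception):
    """Illustrative only; not a globus_sdk class."""

    def __init__(self, message, response=None):
        super().__init__(message)
        self.message = message
        self._response = response  # may hold something unpicklable

    def __reduce__(self):
        # Rebuild from the message alone; the response is deliberately dropped.
        return (type(self), (self.message, None))


# A lock cannot be pickled, so it plays the role of the underlying response.
original = HypotheticalAPIError("file not found", response=threading.Lock())

restored = pickle.loads(pickle.dumps(original))

print(str(restored))
print(restored._response)
```

The restored object carries the message but not the response, so it makes it through without behaving exactly like the original.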