Open cademirch opened 1 year ago
@cademirch Thanks for the report. I don't know exactly what the problem is, but I can describe what the "post-task chown" is meant to do, to give more context that may help narrow it down:
Processes running inside Docker containers very often run as root (uid=0), and as a result, any output files they leave behind on the host filesystem will be owned by root. This is annoying in the common case that the user isn't otherwise routinely operating as root, because they're left with output files that they can't rename or delete unless they sudo. To avoid this, miniwdl makes each task container chown all its output files to be owned by the invoking user id, as a postprocessing step in task execution.
The error indicates that the OS rejected this attempt to chown the container's output files to the invoking user id. So, probably there's something in the configuration of the OS or (shared?) filesystem that prevents chowning files between users (even when running as root?). Does that seem plausible?
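To help test that hypothesis, here is a minimal Python sketch of the general idea behind a post-task chown. It is not miniwdl's actual implementation. The function name `chown_outputs` and its return format are hypothetical, chosen only for illustration. It walks a directory tree, tries to change ownership of each entry, and collects the paths the OS refuses.

```python
import os


def chown_outputs(root, uid, gid):
    """Recursively chown everything under root to uid:gid.

    Hypothetical helper for illustration only; not miniwdl's code.
    Returns a list of (path, error message) pairs for entries the OS rejected.
    """
    failures = []
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk does not descend into symlinked directories, so handle
        # those links here alongside the regular files.
        links = [d for d in dirnames if os.path.islink(os.path.join(dirpath, d))]
        targets = [dirpath] + [os.path.join(dirpath, n) for n in filenames + links]
        for path in targets:
            try:
                # lchown changes the link itself rather than its target.
                os.lchown(path, uid, gid)
            except OSError as exc:
                failures.append((path, exc.strerror))
    return failures
```

Running something like this as root against a scratch directory on the same filesystem as the workflow's run directory may show whether the filesystem itself refuses ownership changes: an empty list means every chown succeeded, while a non-empty list names the rejected paths and the OS error for each. Note that an unprivileged user is normally not allowed to give files away to another uid, so a failure in that case says nothing about the filesystem.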
Singularity and udocker often conform more naturally to the kinds of constraints commonly imposed in HPC environments.
Hi @mlin, thanks for your detailed reply. I'll try contacting our sysadmin... I'm not sure what it could be. If the chown is happening in the container then I'm not sure what permission issues could be blocking that.
Seems like a duplicate of #404
I am trying to run the viral workflow from the docs.
However, the workflow is failing with this error:
post-task chown failed: {'Error': None, 'StatusCode': 123}
Here is the whole error.json from _LAST:
I am new to WDL and miniwdl, so I'm not quite sure how to debug this. To me this seems like a permissions issue, but I'm not sure where it's coming from, as miniwdl doesn't seem to have problems reading and writing files in this directory.
Edit: More info
miniwdl run_self_test
2nd Edit: Ran on another server successfully: I was able to run the workflow on a fresh cloud instance (Ubuntu 22.04) without this error. This further suggests to me that the problem is a permissions issue on the troublesome server, as that is a university-maintained machine. I would appreciate advice on how to solve or debug this!