galaxyproject / pulsar

Distributed job execution application built for Galaxy
https://pulsar.readthedocs.io
Apache License 2.0

Incorrect `html.files_path` with docker/pulsar job #296

Closed (neoformit closed 1 year ago)

neoformit commented 2 years ago

On some (but not all) Pulsar deployments, we found a bug in how the html.files_path directory is created when running AlphaFold jobs. Pulsar returns an incorrect value for html.files_path that includes a second working directory level, i.e. working/working.


Context

The issue only surfaces when the HTML output requests static assets (PDB files) from html.files_path. The page requests these files with a relative URL (e.g. ranked_0.pdb), but the job directory structure is not what the server expects, so it fails to locate the file and returns a 404.


Details

I believe that our pulsar job directory should look something like:

tool_script.sh
command.sh
working
├── input.fasta
├── output
│   └── <alphafold outputs>
└── html_files
    └── <html assets>

But html.files_path is created inside an additional working dir:

tool_script.sh
command.sh
working
├── input.fasta
├── output
│   └── <alphafold outputs>
└── working
    └── html_files
        └── <html assets>

Suggestions

When the html.files_path is created, perhaps the interpreter is assumed to be running in the job directory when it is in fact running in the working directory, so rendering this path creates a second, nested working directory.
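
The suspected mechanism can be reproduced in isolation with a small shell sketch. It assumes that the path is rendered as the relative string working/html_files and that it is created while the current directory is already working. The temporary directory is only a stand-in for a job directory; none of this is Pulsar's actual code.

```shell
#!/bin/sh
# Hypothetical reproduction: plain relative-path resolution, not Pulsar code.
set -eu

job_dir=$(mktemp -d)                 # stand-in for a Pulsar job directory
trap 'rm -rf "$job_dir"' EXIT        # clean up the stand-in on exit
mkdir "$job_dir/working"

# Path as it might be rendered relative to the job directory (assumption).
files_path="working/html_files"

# Case 1: resolved from the job directory, the layout is as expected.
(cd "$job_dir" && mkdir -p "$files_path")

# Case 2: resolved from inside working/, a nested working/ appears.
(cd "$job_dir/working" && mkdir -p "$files_path")

# List every directory that now exists, relative to the job directory.
(cd "$job_dir" && find . -type d | sort)
```

After both cases have run, the listing contains working/html_files as well as working/working/html_files, which matches the second tree above.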

Our hacky fix is to add the following as the final line of the tool's <command> section, which corrects the html.files_path location if the bug has occurred:

[ -d working ] && cp -r working/* .
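
A slightly more defensive form of the same workaround is sketched below. It assumes the command runs with the job's working directory as the current directory, and it wraps the logic in a hypothetical helper named flatten_nested_working that is not part of Pulsar or Galaxy. Using if instead of && keeps the exit status at zero when no nested directory exists, and working/. also picks up hidden files that working/* would skip.

```shell
# Hypothetical helper, not part of Pulsar or Galaxy.
# Run from inside the job's working directory.
flatten_nested_working() {
    # Only act when the nested working/ directory was actually created.
    if [ -d working ]; then
        # "working/." copies the directory's contents, hidden files included.
        cp -R working/. .
    fi
    # With no nested directory, the if statement returns 0, whereas
    # `[ -d working ] && ...` would return a non-zero status.
}

flatten_nested_working
```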

mvdbeek commented 1 year ago

Can you test this with outputs_to_working_directory: false on 23.0 and the latest stable version of Pulsar? I think the tests would now be failing if this were still a problem. Feel free to re-open if it isn't fixed.