Open code4dna opened 4 years ago
I have seen this in the past, for instance with the alevin tool from @bgruening, I think. In the end we had to take out the resources definition. It would be good to nail this down so we can get resources selection back!
I have seen this with versions as old as 19.05 or even older possibly.
The toolshed minimap2 works fine in our 20.05 installation with job resource params, so I'll be able to compare the two installations. This may be related to #10267. I've tried each of the following values in galaxy.yml for our latest install, but they all produce the same error:
#job_resource_params_file: job_resource_params_conf.xml
job_resource_params_file: job_resource_params_conf.xml
job_resource_params_file: config/job_resource_params_conf.xml
I checked out the commit for today's usegalaxy.org but the error still occurs.
commit 04882252b3c87ea31f2979d1cc64ebe0f429c15a
Date: Mon Sep 28 19:27:21 2020 +0200
I am using these same three files that work for 20.05:
lib/galaxy/jobs/rules/map_resources.py
config/job_conf.xml
config/job_resource_params_conf.xml
I notice in galaxy.log that map_resources.py isn't run, since there are no log messages from it. It seems that some tools cannot find config/job_resource_params_conf.xml, since the error message is similar to the one in #10267.
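To make the "no log messages" check concrete, here is a minimal sketch of what a rule in lib/galaxy/jobs/rules/map_resources.py might look like. It is not the actual file from this installation. The function name `dynamic_cores_time` comes from the job_conf.xml shown later in this thread; the destination id `cluster_custom` and the routing logic are assumptions made for illustration. The sketch logs on entry, so if that line never shows up in galaxy.log the failure happens before the rule is reached.

```python
import logging

log = logging.getLogger(__name__)

# Hypothetical destination ids: replace with ids defined in your job_conf.xml.
DEFAULT_DESTINATION = "local"
CLUSTER_DESTINATION = "cluster_custom"


def dynamic_cores_time(resource_params=None):
    """Pick a destination id from the user-selected resource fields."""
    params = resource_params or {}
    # Log on entry so galaxy.log shows whether the rule was reached at all.
    log.info("dynamic_cores_time called with resource_params=%s", params)

    # Use .get() so a tool without resource fields does not raise KeyError.
    cores = params.get("cores")
    time = params.get("time")
    if cores is None and time is None:
        return DEFAULT_DESTINATION
    return CLUSTER_DESTINATION
```

A real rule would usually build the scheduler options from `cores` and `time` rather than just choosing between two ids.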
This would appear to be the same problem as https://github.com/galaxyproject/galaxy/issues/9599. No clue what is causing it though. Could it be an issue with datatypes and implicit converters?
It looks related to #9599. It was fixed for a while, or at least it works in our 20.05 install, but it now fails in 20.09.rc1. The tools that are failing (bwa, bowtie2 and minimap2) all have an option to create a reference genome index from a fasta file in the history. If the reference genome sequence file is compressed (fasta.gz), all three fail; if it is not compressed, they align successfully without errors, even when the sample fastq files are compressed.
In 20.05 and 20.09.rc1, the Edit attributes: Convert datatype from compressed file to uncompressed fails with a short message:
__sq__cores__sq__
If I switch the field order of the group in job_conf.xml from cores,time to time,cores:
<resources default="cores_time">
<group id="cores_time">time,cores</group>
</resources>
Then the error message is:
__sq__time__sq__
This happens even though the alignment tools work with fasta.gz reference genomes for indexing in 20.05.
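The two short messages look like sanitized text. Assuming Galaxy's usual character mapping, in which `__sq__` stands for a single quote, they decode to `'cores'` and `'time'`, which is how Python prints a KeyError for a missing key. That would fit the message following whichever field is listed first in the group, but it is a guess from the message alone, not something confirmed from a traceback. The helper name `unsanitize` is made up for this sketch.

```python
# Assumption: Galaxy's sanitizer encodes ' as __sq__ and " as __dq__.
# Only these two tokens are decoded here.
SANITIZED = {"__sq__": "'", "__dq__": '"'}


def unsanitize(message: str) -> str:
    """Replace sanitized quote tokens with the characters they stand for."""
    for token, char in SANITIZED.items():
        message = message.replace(token, char)
    return message


decoded = unsanitize("__sq__cores__sq__")
print(decoded)
# Compare with the text Python produces for a missing dictionary key.
print(decoded == str(KeyError("cores")))
```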
It seems that the format converters are looking for the job_resource_params_conf.xml file but can't find it: the error message is similar to the one in #10267, which appears when job_resource_params_conf.xml is not found. It also seems that the converters do not need job_resource_params_conf.xml at all, except maybe when the conversion has to run on a remotely configured system during a job. If the converters' lookup of job_resource_params_conf.xml can be disabled, that may fix this issue, at least for the cores, time and memory resources.
I'm looking into #10267 at the moment; this may or may not be related to a config issue (job_resource_params_file shouldn't have to be prefixed with config/ in 20.09).
I can accurately define this issue now. We have job_conf.xml configured to enable 'Job Resource Parameters' on every tool:
<resources default="cores_time">
<group id="cores_time">cores,time</group>
</resources>
This way we don't have to add a line to job_conf.xml for each newly installed tool that should use resource params. However, this approach also makes the converter tools look for job_resource_params_conf.xml, which they can't locate, and that is what fails. As a test, I changed our config to allow 'Job Resource Parameters' on just the bwa tool. With that change, bwa runs successfully using job resource params when the genome fasta file is compressed, and converting a file datatype from compressed to uncompressed also works:
<destination id="dest_cores_time" runner="dynamic">
<param id="type">python</param>
<param id="function">dynamic_cores_time</param>
</destination>
<tools>
<tool id="bwa" destination="dest_cores_time" resources="cores_time"/>
</tools>
<resources default="default">
<group id="default"></group>
<group id="cores_time">cores,time</group>
</resources>
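The difference between the two configurations can be sketched in a few lines of Python. This mirrors the behaviour described in this thread (a tool without its own `resources` attribute falls back to the `default` group); it is not Galaxy's actual parsing code. The `<job_conf>` wrapper element and the helper name `resource_fields` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Every tool inherits cores,time through the default group.
GLOBAL_CONF = """
<job_conf>
  <resources default="cores_time">
    <group id="cores_time">cores,time</group>
  </resources>
</job_conf>
"""

# Only bwa opts in; the default group is empty.
PER_TOOL_CONF = """
<job_conf>
  <tools>
    <tool id="bwa" destination="dest_cores_time" resources="cores_time"/>
  </tools>
  <resources default="default">
    <group id="default"></group>
    <group id="cores_time">cores,time</group>
  </resources>
</job_conf>
"""


def resource_fields(job_conf_xml: str, tool_id: str) -> list:
    """Return the resource fields a tool would be offered (illustrative only)."""
    root = ET.fromstring(job_conf_xml)
    resources = root.find("resources")
    groups = {
        g.get("id"): [f.strip() for f in (g.text or "").split(",") if f.strip()]
        for g in resources.findall("group")
    }
    # Start from the default group, then let an explicit <tool> entry override it.
    group_id = resources.get("default")
    for tool in root.iter("tool"):
        if tool.get("id") == tool_id and tool.get("resources"):
            group_id = tool.get("resources")
    return groups.get(group_id, [])


for conf_name, conf in (("global", GLOBAL_CONF), ("per-tool", PER_TOOL_CONF)):
    for tool_id in ("bwa", "CONVERTER_gz_to_uncompressed"):
        print(conf_name, tool_id, resource_fields(conf, tool_id))
```

With the global default, the converter is handed the same cores,time group as bwa; with the per-tool config it gets no resource fields, which matches the observation that conversion works again.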
In a separate test, I set the lib/galaxy/datatypes/converters/gz_to_uncompressed.xml tool to use the local destination instead of the default resource params, but it still fails:
<tool id="CONVERTER_gz_to_uncompressed" destination="local"/>
There may be other converters/tools used when an input file needs to be uncompressed.
The tools that are failing, bwa, bowtie2 and minimap2, all have an option to create a reference genome index from a fasta file in the history.
The alevin tool that was failing for us also had that. But then I think we found other tools failing in a similar way that were not mappers.
I renamed this issue because it concerns the converters (or possibly other metadata tools) being unable to locate the job_resource_params_conf.xml file, not the toolshed tools' configs. I think the converters do not need to use job_resource_params_conf.xml, at least for the cores, time and memory resources.
We have the following recently installed version:
The job resource selector (cores and time) is working perfectly for default tools and these toolshed tools: Trimmomatic, SPAdes, FastQC and 'Compute Quality Statistics'. The toolshed tools bwa, bwa-mem, bowtie2 and minimap2 fail immediately after clicking Execute, without being scheduled to run, with the following pop-up error message.
Here is the galaxy.log