ukhsa-collaboration / PneumoCaT

Pneumococcal Capsular Typing tool for NGS data
GNU General Public License v3.0

PneumoCaT in Galaxy not running #24

Open quacksawbones opened 3 years ago

quacksawbones commented 3 years ago

Hi Folks,

I have recently installed PneumoCaT onto our Galaxy instance (which is connected to our HPC cluster using PBS).

Unfortunately, when I try to run it, I get the following output.

STDOUT

running bowtie index
There was an error in the function 'remove_secondary_mapping_bit'
____________________________________________________________
____________________________________________________________
There was an error in the function 'mapping'
____________________________________________________________
____________________________________________________________

STDERR

Traceback (most recent call last):
  File "/hpc/software/installed/galaxy/20.01/database/dependencies/_conda/envs/__pneumocat@1.2.1/bin/modules/utility_functions.py", line 59, in try_and_except
    return function(*parameters, **named_parameters)
  File "/hpc/software/installed/galaxy/20.01/database/dependencies/_conda/envs/__pneumocat@1.2.1/bin/modules/Serotype_determiner_functions.py", line 159, in remove_secondary_mapping_bit
    for line in lines:
  File "/hpc/software/installed/galaxy/20.01/database/dependencies/_conda/envs/__pneumocat@1.2.1/lib/python2.7/fileinput.py", line 237, in next
    line = self._readline()
  File "/hpc/software/installed/galaxy/20.01/database/dependencies/_conda/envs/__pneumocat@1.2.1/lib/python2.7/fileinput.py", line 339, in _readline
    self._file = open(self._filename, self._mode)
IOError: [Errno 2] No such file or directory: 'outputs/tmp/PHESPV0253_R1.sam'
Traceback (most recent call last):
  File "/hpc/software/installed/galaxy/20.01/database/dependencies/_conda/envs/__pneumocat@1.2.1/bin/modules/utility_functions.py", line 59, in try_and_except
    return function(*parameters, **named_parameters)
  File "/hpc/software/installed/galaxy/20.01/database/dependencies/_conda/envs/__pneumocat@1.2.1/bin/modules/Serotype_determiner_functions.py", line 123, in mapping
    try_and_except(input_directory + "/logs/strep_pneumo_serotyping.stderr", remove_secondary_mapping_bit, sam, sam_parsed)
  File "/hpc/software/installed/galaxy/20.01/database/dependencies/_conda/envs/__pneumocat@1.2.1/bin/modules/utility_functions.py", line 77, in try_and_except
    error_file = open(error_filepath, "a")
IOError: [Errno 2] No such file or directory: '/logs/strep_pneumo_serotyping.stderr'

I'm pretty sure the issue is that when PneumoCaT is installed into Galaxy, its prerequisites (namely Python 2.7.5, bowtie2, samtools and the various Python modules) aren't installed at the same time. Because we run Galaxy on a compute cluster and these aren't automatically present on the compute nodes, the job fails.
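One way to test this suspicion is to submit a small script as a job on a compute node and see which tools are visible there. The following is a minimal sketch, not part of PneumoCaT: the function name `find_missing_tools` is made up for illustration, the tool list is taken from the prerequisites named above, and it assumes a Python 3 interpreter is available on the node (`shutil.which` does not exist in Python 2.7).

```python
import shutil

# Executables named as prerequisites in this thread; adjust for your site.
REQUIRED_TOOLS = ["bowtie2", "samtools"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

If this reports nothing missing when run inside the same environment the Galaxy job uses, the prerequisites are probably not the cause and the failure lies elsewhere.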

Are you able to update the PneumoCaT Galaxy code to include the prerequisite software and modules (so Conda rolls it all into one environment) or otherwise provide some advice on how I can get this working?

Thanks folks,

CarmenSheppard commented 2 years ago

Hi,

Sorry for the very delayed reply; you may have fixed this since. I am afraid I do not have any experience with Galaxy and have never tried to run PneumoCaT within it. We are working on a new version of PneumoCaT, which has unfortunately been delayed by COVID, so we will not be making any more updates to this version.

I asked one of our core bioinformaticians, who suggested it is due to the output locations and the place the log file is being written to. You probably need to adapt the code so that it writes the log file to a custom location. That location is probably hardcoded to be relative to the FASTQ input files, which won't work in Galaxy.
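The second traceback is consistent with this: the log path is built as `input_directory + "/logs/strep_pneumo_serotyping.stderr"`, and the reported path `/logs/strep_pneumo_serotyping.stderr` is what that expression gives when `input_directory` is an empty string. Below is a minimal sketch of a more defensive way to build the path. It is not the PneumoCaT code: `resolve_log_path` and its `log_root` override are hypothetical names used only for illustration.

```python
import os


def resolve_log_path(fastq_path, log_root=None):
    """Return the stderr log path, creating its directory if needed.

    log_root is a hypothetical override; when it is omitted, the log
    directory is placed next to the (absolute) input file.
    """
    # os.path.dirname("reads.fastq") is "", so plain string concatenation
    # with "/logs/..." would point at the filesystem root. Taking the
    # absolute path first avoids that.
    base = log_root or os.path.dirname(os.path.abspath(fastq_path))
    log_dir = os.path.join(base, "logs")
    if not os.path.isdir(log_dir):
        os.makedirs(log_dir)
    return os.path.join(log_dir, "strep_pneumo_serotyping.stderr")
```

Passing a writable job directory as `log_root` would keep the log inside the location Galaxy controls, rather than wherever the input files happen to live.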

CarmenSheppard commented 2 years ago

I actually ran into this issue recently when trying to run PneumoCaT from conda (it's been a long while since I last tried this). It seems that the version of BioPython is at fault. I found a thread here about the same shared-library error that was showing in the stderr file. Unfortunately, the error reported by PneumoCaT is misleading and not helpful at all. The fix for me, as stated in the thread linked above, was to downgrade tbb: conda install -c bioconda tbb=2020.2

Not sure if this is possible for you with Galaxy. Sorry I was not very helpful earlier!
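Since the error PneumoCaT reports hides the underlying problem, it can help to run the external tool directly in the same conda environment and read its own stderr. This is a rough sketch under assumptions: `run_and_capture` is a made-up helper, and it assumes `bowtie2` is the tool affected by the shared-library error, which the thread does not state outright.

```python
import subprocess


def run_and_capture(command):
    """Run a command and return (exit_code, stdout, stderr) as text."""
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    stdout, stderr = process.communicate()
    return process.returncode, stdout, stderr


if __name__ == "__main__":
    try:
        # Assumption: bowtie2 is the tool that fails to load its libraries.
        code, out, err = run_and_capture(["bowtie2", "--version"])
    except OSError as exc:
        print("Could not start bowtie2: %s" % exc)
    else:
        print("exit code: %d" % code)
        print(err or out)
```

A non-zero exit code with a message about a missing shared library would point to the dependency problem described above, rather than to the missing .sam file that the traceback mentions.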

quacksawbones commented 2 years ago

Not a problem, I appreciate the suggestion. I'll have a go as soon as I get an opportunity and let you know how it goes.