icbi-lab / nextNEOpi

nextNEOpi: a comprehensive pipeline for computational neoantigen prediction

yara_mapper doesn't run properly #40

Closed · mantczakaus closed this 1 year ago

mantczakaus commented 1 year ago

Hi,

I've been having some issues with yara_mapper. They started just recently (not sure why; possibly Singularity was updated on my HPC and now requires different options?). I'd be extremely grateful if you could have a look and provide a workaround or ideas for further investigation.

The pre_map_hla step, which uses yara_mapper, is reported as passed; however, the resulting dna_mapped_1.bam and dna_mapped_2.bam files are not valid. When they are then passed to OptiType, the following error is generated:

Traceback (most recent call last):
    File "/opt/conda/bin/OptiTypePipeline.py", line 309, in <module>
      pos, read_details = ht.pysam_to_hdf(bam_paths[0])
    File "/opt/conda/bin/hlatyper.py", line 186, in pysam_to_hdf
      sam = pysam.AlignmentFile(samfile, sam_or_bam)
    File "pysam/libcalignmentfile.pyx", line 742, in pysam.libcalignmentfile.AlignmentFile.__cinit__
    File "pysam/libcalignmentfile.pyx", line 991, in pysam.libcalignmentfile.AlignmentFile._open
  ValueError: file has no sequences defined (mode='rb') - is it SAM/BAM format? Consider opening with check_sq=False.
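For reference, the header can be inspected directly (a quick check; the file names are the ones produced by pre_map_hla):

    # a valid BAM prints its reference sequences as @SQ lines; the broken files print none
    samtools view -H dna_mapped_1.bam | grep '^@SQ'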

samtools quickcheck run on the BAM files gives the following message:

    dna_mapped_1.bam had no targets in header.

The logs from yara_mapper actually contain an error (it looks like it is related to how the image is run?):

    Couldn't create temporary file /scratch/temp/5427141/SQNseNL7c. (No such file or directory)
    /home/mi/dadi/workspace/development/seqan/include/seqan/file/string_mmap.h:635 FAILED! (Memory Mapped String couldn't open temporary file)

I launched the same command manually and it worked. I entered the image environment the following way:

set +u; env - PATH="$PATH" ${TMP:+SINGULARITYENV_TMP="$TMP"} ${TMPDIR:+SINGULARITYENV_TMPDIR="$TMPDIR"} SINGULARITYENV_NXF_DEBUG=${NXF_DEBUG:=0}  \
singularity shell  \
-B /scratch/project_mnt/S0091/mantczak  \
-B /QRISdata/Q5952/data/nextNEOpi_1.3_resources/references/yara --no-home --containall -H /scratch/project_mnt/S0091/mantczak/.tmp  \
-B /scratch/project_mnt/S0091/mantczak/pipelines/nextNEOpi/assets  \
-B /scratch/project_mnt/S0091/mantczak/.tmp  \
-B /QRISdata/Q5952/data/nextNEOpi_1.3_resources  \
-B /scratch/project_mnt/S0091/mantczak/soft/hlahd.1.7.0  \
-B /QRISdata/Q5952/data/nextNEOpi_1.3_resources/databases/iedb:/opt/iedb  \
-B /QRISdata/Q5952/data/nextNEOpi_1.3_resources/databases/mhcflurry_data:/opt/mhcflurry_data /scratch/project_mnt/S0091/mantczak/.nextflow/NXF_SINGULARITY_CACHEDIR/apps-01.i-med.ac.at-images-singularity-nextNEOpi_1.3.2_18734d43.sif

Then I executed:

cd /scratch/project_mnt/S0091/mantczak/tests/nextneopi_validation/work/d8/9567afce930be0fd79060e6f7c9ad8
bash .command.sh
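(For completeness: if I understand the Nextflow work-dir layout correctly, the generated wrapper in the same directory should replay the full container launch as well:

    bash .command.run    # re-runs .command.sh inside the container with the same binds
)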

I'm attaching the relevant logs.

Best wishes,

Magda

riederd commented 1 year ago

Hi,

It seems that /scratch/temp is not mounted into the container. When you launch the container manually, as you did, can you cd into /scratch/temp?
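For example, from a shell inside the container:

    cd /scratch/temp && touch probe.$$ && rm probe.$$    # fails if the path is not mounted or not writable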

mantczakaus commented 1 year ago

Hi @riederd, thanks for getting back to me so quickly! I don't have access to /scratch/temp when I'm in the container:

    bash: cd: /scratch/temp: No such file or directory

In the scratch folder I only have project_mnt, which is our cluster's folder structure mounted into the container.

riederd commented 1 year ago

Hi,

I guess the --containall option might cause this problem. If you need to keep it, you might try to set tmpDir = "/scratch/temp" in conf/params.config.
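That is, roughly this line in conf/params.config (a sketch; pick a path that is actually available on your nodes):

    tmpDir = "/scratch/temp"

Standard Nextflow behaviour should also let you override a pipeline parameter per run instead of editing the file, e.g. by adding --tmpDir /scratch/temp to your usual nextflow run command.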

mantczakaus commented 1 year ago

Hi, it turns out I don't have access to /scratch/temp, and I have to use --containall, otherwise NeoFuse fails (see https://github.com/icbi-lab/nextNEOpi/issues/25). Would you have any ideas on how to work around that? For example, is there a way to switch off running NeoFuse? I don't need to consider gene fusions at the moment, but I do need gene expression information, and if I understand correctly that comes from NeoFuse. I will also contact my cluster administrators to see if they have any recommendations.

mantczakaus commented 1 year ago

Hi again, it seems the problem was with how environment variables were being set. I was passing all the variables through my .bashrc, and $TMPDIR was being overwritten with this /scratch/temp path. I have now set all the variables directly in the Slurm script. I will close this issue if this solves it. Thanks for your help!
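For anyone hitting the same thing, this is roughly what I moved from .bashrc into the Slurm script (paths are specific to my cluster; adjust as needed):

    #!/bin/bash
    #SBATCH --job-name=nextNEOpi
    # set temp locations explicitly so nothing inherited from .bashrc overrides them;
    # the Nextflow wrapper forwards TMP/TMPDIR into the container via SINGULARITYENV_*
    export TMPDIR=/scratch/project_mnt/S0091/mantczak/.tmp
    export TMP=$TMPDIR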