TRON-Bioinformatics / EasyFuse

EasyFuse is a pipeline for accurate fusion gene detection from RNA-seq data.
GNU General Public License v3.0

Fastq files not picked up-running through singularity #32

Closed stanleyrc closed 1 year ago

stanleyrc commented 1 year ago

Hi, I am opening a new issue because I was not sure a comment on a closed issue would be seen. As noted in a different issue, I am also having trouble running EasyFuse with Singularity: my FASTQ files are not being identified. I have tried both uncompressed and compressed FASTQ files, and neither is picked up. I have also tried binding a single FASTQ file by its exact path. Below is the output from the Slurm jobs for both cases: a single FASTQ file, and the *fastq wildcard that someone suggested in another issue. Is there a workaround for this? Thank you!

Output from slurm job:
INFO: Using cached SIF image
WARNING: skipping mount of /gpfs/home/src7305/aifantishome/easyfuse/SOFTLINKS/1_FASTQ_TEST/*fastq: stat /gpfs/home/src7305/aifantishome/easyfuse/SOFTLINKS/1_FASTQ_TEST/*fastq: no such file or directory
FATAL: container creation failed: mount /gpfs/home/src7305/aifantishome/easyfuse/SOFTLINKS/1_FASTQ_TEST/*fastq->/data error: while mounting /gpfs/home/src7305/aifantishome/easyfuse/SOFTLINKS/1_FASTQ_TEST/*fastq: mount source /gpfs/home/src7305/aifantishome/easyfuse/SOFTLINKS/1_FASTQ_TEST/*fastq doesn't exist

Output of easyfuse_processing.log:
2022-12-20 12:58:15 DEBUG Folder /output already exists
2022-12-20 12:58:15 INFO Starting easyfuse: CMD - python /code/easyfuse/processing.py -i /data -o /output
2022-12-20 12:58:15 INFO Pipeline Version: 1.3.7
2022-12-20 12:58:15 INFO Reference Genome: hg38, Reference Transcriptome: ensembl
2022-12-20 12:58:15 DEBUG Submitting job: CMD - /code/miniconda3/bin/python /code/easyfuse/summarize_data.py --input /output --model_predictions -c /code/easyfuse/config.ini; PATH - /output; DEPS - []

Fastq files do exist but are not being picked up:
ls /gpfs/home/src7305/aifantishome/easyfuse/SOFTLINKS/1_FASTQ_TEST/*fastq
/gpfs/home/src7305/aifantishome/easyfuse/SOFTLINKS/1_FASTQ_TEST/ETV001_L001_R1.fastq
/gpfs/home/src7305/aifantishome/easyfuse/SOFTLINKS/1_FASTQ_TEST/ETV001_L001_R2.fastq

Command line code:
singularity exec --containall \
  --bind /gpfs/home/src7305/aifantishome/easyfuse/easyfuse_ref:/ref \
  --bind /gpfs/home/src7305/aifantishome/easyfuse/SOFTLINKS/1_FASTQ_TEST/*fastq:/data \
  --bind /gpfs/home/src7305/aifantishome/easyfuse/output:/output \
  docker://tronbioinformatics/easyfuse:latest \
  python /code/easyfuse/processing.py -i /data/ -o /output/

Output from slurm file when I specify one fastq:
INFO: Using cached SIF image

Going to process the following read files...
Running EasyFuse-Summary-1671559684_CMD0
CMD: /code/miniconda3/bin/python /code/easyfuse/summarize_data.py --input /output --model_predictions -c /code/easyfuse/config.ini
Error: Command "['/code/miniconda3/bin/python', '/code/easyfuse/summarize_data.py', '--input', '/output', '--model_predictions', '-c', '/code/easyfuse/config.ini']" returned non-zero exit status b'Traceback (most recent call last):\n File "/code/easyfuse/summarize_data.py", line 133, in <module>\n main()\n File "/code/easyfuse/summarize_data.py", line 130, in main\n stats.run(args.model_predictions)\n File "/code/easyfuse/summarize_data.py", line 90, in run\n print("Found {0} (partially) processed samples in {1}. Data will be collected from {2} samples for which fetchdata has been run.".format(i, self.input_path, count_valid_sample))\nUnboundLocalError: local variable \'i\' referenced before assignment\n'
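
Note on the traceback: an UnboundLocalError for a loop counter such as `i` typically means the loop that assigns it never ran, which would fit a run in which no samples were found. The source of summarize_data.py is not shown in this thread, so the following is only a minimal sketch of that general pattern; the function names `summarize` and `summarize_safe` are hypothetical and not part of EasyFuse.

```python
def summarize(sample_dirs):
    """Hypothetical stand-in: `i` is only bound inside the loop body."""
    for i, _sample in enumerate(sample_dirs, 1):
        pass  # per-sample work would happen here
    # With an empty sample_dirs the loop never runs, so `i` was never assigned
    # and the next line raises UnboundLocalError.
    return "Found {0} (partially) processed samples".format(i)


def summarize_safe(sample_dirs):
    """Same logic, but `i` has a defined value before the loop starts."""
    i = 0
    for i, _sample in enumerate(sample_dirs, 1):
        pass
    return "Found {0} (partially) processed samples".format(i)
```

If this is what happens here, the summary error is a symptom of the empty input rather than a separate problem.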

ibn-salem commented 1 year ago

I think the problem is the wildcard character * in the --bind argument in your singularity command:

--bind /gpfs/home/src7305/aifantishome/easyfuse/SOFTLINKS/1_FASTQ_TEST/*fastq:/data

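The wildcard is not expanded in a bind source, so the path is taken literally and does not exist. You might need to bind a single existing folder, such as the one that contains the FASTQ files, as /data.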
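
To illustrate the suggestion, here is a minimal sketch that turns a FASTQ wildcard into the single directory that contains the matching files and builds the same singularity command around it. The helper names `bind_source_for`, `build_command` and `as_shell` are hypothetical; the image name, container paths and flags are copied from the command in the question. It assumes all matching files live in one directory.

```python
import glob
import os
import shlex


def bind_source_for(pattern):
    """Return the one directory that holds every file matching `pattern`."""
    matches = glob.glob(pattern)
    if not matches:
        raise FileNotFoundError("no files match {!r}".format(pattern))
    parents = {os.path.dirname(os.path.abspath(m)) for m in matches}
    if len(parents) != 1:
        raise ValueError("matches span several directories: {}".format(sorted(parents)))
    return parents.pop()


def build_command(fastq_pattern, ref_dir, out_dir,
                  image="docker://tronbioinformatics/easyfuse:latest"):
    """Build the argument list, binding a directory (not a wildcard) as /data."""
    data_dir = bind_source_for(fastq_pattern)
    return [
        "singularity", "exec", "--containall",
        "--bind", "{}:/ref".format(ref_dir),
        "--bind", "{}:/data".format(data_dir),
        "--bind", "{}:/output".format(out_dir),
        image,
        "python", "/code/easyfuse/processing.py", "-i", "/data/", "-o", "/output/",
    ]


def as_shell(cmd):
    """Render the argument list as a copy-pasteable shell line."""
    return shlex.join(cmd)
```

Calling `as_shell(build_command(".../1_FASTQ_TEST/*fastq", ref_dir, out_dir))` with the paths from the question would yield a command whose second bind is `.../1_FASTQ_TEST:/data`. One further caveat, offered as a guess: the directory name SOFTLINKS suggests the FASTQ files may be symbolic links. If so, their targets also need to be reachable inside the container, because a link pointing outside the bound paths would be dangling there.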