BIMSBbioinfo / pigx_rnaseq

Bulk RNA-seq Data Processing, Quality Control, and Downstream Analysis Pipeline
GNU General Public License v3.0

STAR - unknown parameter genomeType - after guix install pigx-rnaseq #87

Closed smoe closed 3 years ago

smoe commented 3 years ago

Hello, I installed pigx-rnaseq now via guix as in

sudo apt-get install guix # which is the latest 1.2 of guix
guix install pigx-rnaseq # which made a good impression up to the mapping

to help locate the cause of yesterday's "group_by" issue (https://github.com/BIMSBbioinfo/pigx_rnaseq/issues/86), but I don't get as far as with my Debian package of pigx-rnaseq. The previous output directory was removed; everything else was left unchanged. Here is a hiccup in the mapping that is blocking me:

[Wed Feb 24 01:35:32 2021]
Error in rule star_map:
    jobid: 58
    output: /home/moeller/FibrosisArrayExpress/output/mapped_reads/PR_Rep4_Aligned.out.bam
    log: /home/moeller/FibrosisArrayExpress/output/logs/star_map_PR_Rep4.log (check log file(s) for error message)
    shell:
        /gnu/store/zd7gdv173x41rmjdgc7qr3ycasna28hd-star-2.7.3a/bin/STAR  --runThreadN 4 --genomeDir /home/moeller/FibrosisArrayExpress/output/star_index --readFilesIn /home/moeller/FibrosisArrayExpress/output/trimmed_reads/PR_Rep4_R.fastq.gz --readFilesCommand '/gnu/store/378zjf2kgajcfd7mfr98jn5xyc5wa3qv-gzip-1.10/bin/gunzip  -c' --outSAMtype BAM Unsorted --outFileNamePrefix /home/moeller/FibrosisArrayExpress/output/mapped_reads/PR_Rep4_ >> /home/moeller/FibrosisArrayExpress/output/logs/star_map_PR_Rep4.log 2>&1
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job star_map since they might be corrupted:
/home/moeller/FibrosisArrayExpress/output/mapped_reads/PR_Rep4_Aligned.out.bam
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /home/moeller/FibrosisArrayExpress/output/.snakemake/log/2021-02-24T013531.645151.snakemake.log

and the logfile of the failing process is

moeller@mariner2:~/FibrosisArrayExpress$ less /home/moeller/FibrosisArrayExpress/output/logs/star_map_PR_Rep3.log
sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
/gnu/store/mmhimfwmmidf09jw1plw3aw1g1zn2nkh-bash-static-5.0.16/bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
Feb 24 01:35:32 ..... started STAR run
Feb 24 01:35:32 ..... loading genome

EXITING: FATAL INPUT ERROR: unrecognized parameter name "genomeType" in input "genomeParameters.txt"
SOLUTION: use correct parameter name (check the manual)

Feb 24 01:35:32 ...... FATAL ERROR, exiting
/gnu/store/pwcp239kjf7lnj5i4lkdzcfcxwcfyk72-bash-minimal-5.0.16/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)

There could be some confusion since there are two versions of STAR on the system

$ which STAR
/usr/bin/STAR
$ STAR --version
2.7.8a
$ find /gnu/ -name STAR
/gnu/store/zd7gdv173x41rmjdgc7qr3ycasna28hd-star-2.7.3a/bin/STAR
/gnu/store/7bjj53q9lx87aw4ljs9b6wrs7qs4666v-star-2.7.3a/bin/STAR

I already asked to have some information on the binary that is executed in the STAR output (https://github.com/alexdobin/STAR/pull/1157). Is there something I can test/do to harden the process?

Best, Steffen

rekado commented 3 years ago

Hi Steffen,

we only ever execute STAR (or any other tool) with the absolute file name as specified in the tools section of the configuration file. The tools section is generated at configure time, so whichever STAR was available during the configure phase should be used for the star_map rule.
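
To check whether every rule that calls the same program points at the same binary, the tools section could be scanned along the following lines. This is only a sketch: it assumes the tools section is available as a dictionary of entries with an `executable` key, and the `star_map` key, the example entries, and the helper name `find_conflicting_executables` are hypothetical, not part of the pipeline.

```python
import os
from collections import defaultdict


def find_conflicting_executables(tools):
    """Report programs that are referenced through more than one path.

    `tools` is assumed to look like the tools section of the settings
    file: {tool_name: {"executable": "/abs/path", "args": "..."}}.
    """
    paths_by_program = defaultdict(set)
    for entry in tools.values():
        executable = entry.get("executable")
        if executable:
            # Group by program name, e.g. "STAR", regardless of directory.
            paths_by_program[os.path.basename(executable)].add(executable)
    # Keep only programs that resolve to several different paths.
    return {
        program: sorted(paths)
        for program, paths in paths_by_program.items()
        if len(paths) > 1
    }


# Hypothetical tools section mixing a system STAR and a Guix STAR.
tools = {
    "star_index": {"executable": "/usr/bin/STAR"},
    "star_map": {
        "executable": "/gnu/store/zd7gdv173x41rmjdgc7qr3ycasna28hd-star-2.7.3a/bin/STAR"
    },
    "gunzip": {
        "executable": "/gnu/store/378zjf2kgajcfd7mfr98jn5xyc5wa3qv-gzip-1.10/bin/gunzip"
    },
}
conflicts = find_conflicting_executables(tools)
for program, paths in conflicts.items():
    print(f"{program} is referenced through {len(paths)} different paths: {paths}")
```

A non-empty result does not prove a problem, but it flags exactly the kind of mixed setup that is worth double-checking.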

I don't know where "genomeParameters.txt" comes into play. Annoyingly, this is not declared explicitly as an input. I wonder what generates it.

@borauyar, could you enlighten us?

borauyar commented 3 years ago

STAR creates the file genomeParameters.txt in the star_index folder when indexing the genome. The STAR mapping step requires this index folder as input and reads the genome parameters from there. The issue could be that the STAR version that created the index is different from the STAR version that does the mapping.
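
One way to inspect what the index was built with is to read genomeParameters.txt directly. The sketch below assumes the file consists of comment lines starting with `#` plus one whitespace-separated name/value pair per line; the sample content and the `versionGenome` entry are illustrative assumptions, not copied from the index in this report, and the helper name `read_genome_parameters` is hypothetical.

```python
def read_genome_parameters(text):
    """Parse the contents of genomeParameters.txt into a dict.

    Lines starting with '#' are skipped; every other non-empty line is
    split into a parameter name and the remainder of the line.
    """
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 1)
        params[fields[0]] = fields[1].strip() if len(fields) > 1 else ""
    return params


# Illustrative content only (assumed layout, not the reporter's file).
sample = (
    "### STAR   --runMode genomeGenerate   --genomeDir star_index\n"
    "versionGenome\t2.7.4a\n"
    "genomeType\tFull\n"
)
params = read_genome_parameters(sample)

# An index containing "genomeType" cannot be loaded by a STAR release
# that does not know this parameter (2.7.3a, judging by the error above).
print("genomeType present:", "genomeType" in params)
print("versionGenome:", params.get("versionGenome", "not recorded"))
```

In practice the text would come from `open(".../star_index/genomeParameters.txt").read()`. The recorded genome version may describe the index format rather than the exact binary that wrote it, so it is a hint, not proof.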

smoe commented 3 years ago

Yes. That was it. Sorry. I had (see the previous discussion) set the limitGenomeGenerateRAM parameter for the indexing

   star_index:
     args: "--limitGenomeGenerateRAM 33724373088"
     executable: "/usr/bin/STAR"

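and set the executable along with it, so the index was built by /usr/bin/STAR (2.7.8a) rather than by the Guix STAR 2.7.3a that does the mapping. No idea whether STAR will report its version and path any time soon. Since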

$ STAR --version
2.7.8a

runs without any noticeable delay, maybe it would be neat to add this to your pipeline?
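
A minimal sketch of what such a version check could look like, assuming each tool accepts a `--version` flag and prints its version on the first line of output. The helper name `tool_version` is hypothetical, and the running Python interpreter stands in for STAR so that the example works on any machine.

```python
import subprocess
import sys


def tool_version(executable, flag="--version"):
    """Run `executable flag` and return the first line it prints."""
    result = subprocess.run(
        [executable, flag],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # some tools print the version on stderr
        universal_newlines=True,
        check=True,  # raise if the tool exits with a non-zero status
    )
    lines = result.stdout.strip().splitlines()
    return lines[0] if lines else ""


# Stand-in for the real tool: in the pipeline this would be the absolute
# path from the settings file, e.g. the STAR executable.
version = tool_version(sys.executable)
print(f"{sys.executable}: {version}")
```

Logging the path together with the version, once per tool and before the first rule runs, would have shown the 2.7.8a versus 2.7.3a mismatch up front.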

smoe commented 3 years ago

I just saw that you already show the full path in the "outer" logfile. Still, because docker and chroot can be confusing at times, you may nonetheless want to execute the tools just to record their version information.

borauyar commented 3 years ago

That's great. I leave it to @rekado to comment on the question about versions.