jcdaneshmand closed this issue 1 year ago
What ended up working for me was creating a process for unzipping the reads like this:
process UNZIP_READS {
    label 'process_low'

    // Allow a long execution time for large inputs (48 hours).
    // Directives go at the top of the process body, before input/output.
    time '48h'

    conda (params.enable_conda ? "conda-forge::sed=4.7" : null)
    container "${workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/ubuntu:20.04' :
        'ubuntu:20.04'}"

    input:
    tuple val(meta), path(reads)

    output:
    tuple val(meta), path('*uncompressed.fastq'), emit: ch_unzipped_fastq

    script:
    """
    #!/bin/bash
    # Create an array from the space-separated list of input files
    read -r -a files <<< "$reads"

    # Decompress one file, writing <name>.uncompressed.fastq next to it
    unzip_file() {
        inputFile="\$1"
        outputFile="\$(dirname "\${inputFile}")/\$(basename "\${inputFile}").uncompressed.fastq"
        gunzip -c "\${inputFile}" > "\${outputFile}"
    }

    # Iterate over the input files and call the function directly
    # (no command substitution is needed around the call)
    for file in "\${files[@]}"; do
        unzip_file "\$file"
    done
    """
}
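To try the decompression logic outside Nextflow, the loop can be run as a plain bash script. The following is a minimal sketch, not part of the pipeline: the file names (`sample_1.fastq.gz`, `sample_2.fastq.gz`) and the temporary directory are made-up test data, the Nextflow `\$` escaping is dropped, and the function is called directly rather than through command substitution.

```shell
#!/usr/bin/env bash
# Standalone sketch of the decompression loop from UNZIP_READS.
# All file names below are hypothetical test data.
set -euo pipefail

# Decompress one gzipped file next to the original, keeping the original.
unzip_file() {
    local input_file="$1"
    local output_file
    output_file="$(dirname "$input_file")/$(basename "$input_file").uncompressed.fastq"
    gunzip -c "$input_file" > "$output_file"
}

# Build a tiny paired-end test set in a temporary directory.
workdir="$(mktemp -d)"
printf '@read1\nACGT\n+\nIIII\n' | gzip -c > "$workdir/sample_1.fastq.gz"
printf '@read1\nTGCA\n+\nIIII\n' | gzip -c > "$workdir/sample_2.fastq.gz"

# Same shape as in the process: a space-separated list split into an array.
reads="$workdir/sample_1.fastq.gz $workdir/sample_2.fastq.gz"
read -r -a files <<< "$reads"

for file in "${files[@]}"; do
    unzip_file "$file"
done

# List the results: the .gz inputs plus one *.uncompressed.fastq per input.
ls "$workdir"
```

With this naming scheme, `sample_1.fastq.gz` becomes `sample_1.fastq.gz.uncompressed.fastq`, which is what the `*uncompressed.fastq` output glob in the process picks up.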
And then within the ALIGN_STAR subworkflow:
include { UNZIP_READS } from '../../modules/nf-core/modules/gunzip/UNZIP_READS'

workflow ALIGN_STAR {
    take:
    reads               // channel: [ val(meta), [ reads ] ]
    index               // channel: /path/to/star/index/
    gtf                 // channel: /path/to/genome.gtf
    star_ignore_sjdbgtf // value: ignore gtf
    seq_platform        // value: sequencing platform
    seq_center          // value: sequencing centre

    main:
    ch_unzipped_fastq = Channel.empty()
    ch_versions = Channel.empty()

    // Call the UNZIP_READS process
    UNZIP_READS(reads)

    STAR_ALIGN (
        //reads,
        UNZIP_READS.out.ch_unzipped_fastq, // Use the unzipped fastq files as input
        index,
        gtf,
        star_ignore_sjdbgtf,
        seq_platform,
        seq_center
    )
Description of feature
Hello rnavar devs,
This is a feature request as well as a request for help, if possible.
Essentially, I would like to add a small step to the pipeline, right before STAR alignment, in which the fastq.gz files are gunzipped and the resulting uncompressed files are fed to STAR. I need this because I am working on a WSL2 Ubuntu server with NTFS storage, and STAR has an issue with gzipped input on NTFS: it fails with a FIFO error.
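A quick way to check whether a given directory is affected is to try creating a named pipe in it. This is only a sketch under an assumption: my understanding is that STAR streams the output of its `--readFilesCommand` through a FIFO in its temporary directory, so a filesystem that rejects `mkfifo` would produce this kind of error. The `supports_fifo` helper and the probe file name are hypothetical, not part of STAR or the pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic: can the given directory hold a named pipe (FIFO)?
set -u

# Return 0 if a FIFO can be created in the directory, 1 otherwise.
supports_fifo() {
    local dir="$1"
    local probe="$dir/.fifo_probe_$$"
    if mkfifo "$probe" 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# Check the directory given as the first argument, or the current one.
target="${1:-$PWD}"
if supports_fifo "$target"; then
    echo "FIFO supported in $target"
else
    echo "FIFO not supported in $target"
fi
```

If the check fails for the work directory but succeeds for a directory on the Linux filesystem, decompressing the reads up front or moving the work directory are the two obvious workarounds.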
I have tried to do this on my own by adding the following code to the ALIGN_STAR subworkflow:
`
Now when testing, I get the following exception, and I'm in over my head at this point. Nextflow is pretty new to me.
`
I'd really like a way to do this. I am trying to run a variant analysis with this pipeline on many RNA-seq samples. Is there an option I'm missing, or is something wrong with my code? Thank you so much for your time.