nf-core / smrnaseq

A small-RNA sequencing analysis pipeline
https://nf-co.re/smrnaseq
MIT License

Multiple Sample running Issue #117

Closed rakeshponnala closed 2 years ago

rakeshponnala commented 2 years ago

Hello, I'm trying to run multiple microRNA samples and the pipeline dies at the trimming step. Any suggestions?

nextflow run nf-core/smrnaseq -r 1.1.0 --input sample.sheet.csv -profile singularity --protocol qiaseq --fasta genome.fa --mirtrace_species cfa --mirna_gtf cfa.gff3 --bt_index /pathtogenomebowtie/ --hairpin hairpin.fa --mature mature.fa --mirtrace_protocol qiaseq --max_cpus 90 --max_memory 1000GB &&

eg: sample.sheet.csv is below:

sample, fastq_1
01-C04,01-C04_S15_R1_001.fastq.gz
01-C04, 01-C04_S3_R1_001.fastq.gz
01-C07, 01-C07S5_R1_001.fastq.gz
01-C08, 01-C08S9_R1_001.fastq.gz
01-C10, 01-C10S8_R1_001.fastq.gz
01-D01,01-D01S4_R1_001.fastq.gz
01-D02,01-D02S1_R1_001.fastq.gz
01-D02,01-D02S2_R1_001.fastq.gz
01-D03, 01-D03__S2_R1_001.fastq.gz

Error executing process > 'trim_galore (sample.sheet.csv)'

Caused by: Process trim_galore (sample.sheet.csv) terminated with an error exit status (25)

Command executed:

trim_galore --adapter AACTGTAGGCACCATCAAT --length 17 --max_length 40 --gzip sample.sheet.csv --fastqc

Command exit status: 25

Command output: (empty)

Command error:
Multicore support not enabled. Proceeding with single-core trimming.
Path to Cutadapt set as: 'cutadapt' (default)
Cutadapt seems to be working fine (tested command 'cutadapt --version')
Cutadapt version: 3.4
single-core operation.
No quality encoding type selected. Assuming that the data provided uses Sanger encoded Phred scores (default)

Maximum length cutoff set to >> 40 bp <<; sequences longer than this threshold will be removed (only advised for smallRNA-trimming!)

File seems to be in SOLiD colorspace format which is not supported by Trim Galore (sequence is: '01-C04,01-C04_S15_R1_001.fastq.gz ')! Please use Cutadapt on colorspace files separately and check its documentation!
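The log above shows trim_galore being run on sample.sheet.csv itself, and the string it reports as a "sequence" is the second line of that CSV. A minimal sketch of a pre-flight check that would catch this kind of mix-up is given below. It assumes only the standard four-line FASTQ record layout; the function name and the example read are hypothetical, and this is not how Trim Galore or the pipeline performs its own checks.

```python
def first_record_is_fastq(lines):
    """Return True if the first four lines look like one FASTQ record.

    A FASTQ record is: '@' header, sequence, '+' separator, quality string
    of the same length as the sequence.
    """
    record = [line.rstrip("\r\n") for line in lines[:4]]
    if len(record) < 4:
        return False
    header, sequence, separator, quality = record
    return (
        header.startswith("@")
        and separator.startswith("+")
        and len(sequence) > 0
        and len(sequence) == len(quality)
    )


# Hypothetical read, for illustration only.
sequence = "TGAGGTAGTAGGTTGTATAGTT"
fastq_lines = ["@read1", sequence, "+", "I" * len(sequence)]

# First lines of the sample sheet from the post above.
sheet_lines = [
    "sample, fastq_1",
    "01-C04,01-C04_S15_R1_001.fastq.gz",
    "01-C04, 01-C04_S3_R1_001.fastq.gz",
    "01-C07, 01-C07S5_R1_001.fastq.gz",
]

print(first_record_is_fastq(fastq_lines))
print(first_record_is_fastq(sheet_lines))
```

Running a check like this on whatever is passed to --input makes it obvious when a sample sheet has been handed to a step that expects reads.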

lpantano commented 2 years ago

Hi,

That version of the pipeline doesn't use an input sample sheet; it needs the path to where all the FASTQ files are. We have also seen issues when the genome and Bowtie index are supplied as inputs, so it is better to let the pipeline create the indexes. I would also recommend using the devel branch of the pipeline.

We are working on migrating to the new Nextflow DSL 2.0 version. There is a PR to devel that will soon become the latest version, and that one does use the input file: #104. Hopefully it will be available soon.

Thanks
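Since the sample sheet in the original post has stray spaces after some commas and lists two samples (01-C04 and 01-D02) twice, it can help to inspect it before a run. The sketch below groups FASTQ files by sample ID using only the two column names from the post (`sample`, `fastq_1`). The function name is hypothetical, and how the DSL2 pipeline itself treats repeated sample IDs is not stated in this thread.

```python
import csv
import io
from collections import defaultdict


def group_fastqs_by_sample(sheet_text):
    """Map each sample ID to the list of FASTQ files listed for it."""
    # skipinitialspace drops the blank that follows some commas in the sheet.
    reader = csv.reader(io.StringIO(sheet_text), skipinitialspace=True)
    header = [column.strip() for column in next(reader)]
    sample_col = header.index("sample")
    fastq_col = header.index("fastq_1")

    groups = defaultdict(list)
    for row in reader:
        if not any(cell.strip() for cell in row):
            continue  # skip blank lines
        groups[row[sample_col].strip()].append(row[fastq_col].strip())
    return dict(groups)


# A few rows taken from the sample sheet in the original post.
sheet = """sample, fastq_1
01-C04,01-C04_S15_R1_001.fastq.gz
01-C04, 01-C04_S3_R1_001.fastq.gz
01-C07, 01-C07S5_R1_001.fastq.gz
01-D02,01-D02S1_R1_001.fastq.gz
01-D02,01-D02S2_R1_001.fastq.gz
"""

for sample, files in group_fastqs_by_sample(sheet).items():
    print(sample, files)
```

Samples with more than one file show up as a list with several entries, which makes duplicated IDs easy to spot.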

rakeshponnala commented 2 years ago

Thank you.. this is very helpful...

klkeys commented 2 years ago

@rakeshponnala before #104 gets accepted, you can still run the DSL 2.0 version from the dsl2 branch

it should work, but your mileage may vary

rakeshponnala commented 2 years ago

Thanks. Would I just need to change the version from -r 1.1.0 to -r 2.0 to test it? Thanks so much.

klkeys commented 2 years ago

No, I think you need -r dsl2 (the Git branch) or possibly -r ac9051934ec4e5e9726ebbe326968218fbf5d497 (the Git commit hash).

rakeshponnala commented 2 years ago

Thank you! I will test this version and hope for the best...

rakeshponnala commented 2 years ago

Unfortunately neither of them works; I tried both. The error message says: Unknown error accessing project nf-core/smrnaseq -- Repository may be corrupted: command line:

lpantano commented 2 years ago

Is this command working for you?

 nextflow run nf-core/smrnaseq -r dsl2 -profile docker,test

What Nextflow version are you using?

rakeshponnala commented 2 years ago

Thanks for the suggestion. I executed the above command and the test works now. I then ran a set of samples through the pipeline using dsl2; it runs great except that it throws an error at the end, which seems to be related to MultiQC. I'm pasting the error message below. I couldn't figure out what the issue is.

Error executing process > 'NFCORE_SMRNASEQ:SMRNASEQ:MULTIQC'

Caused by: Process NFCORE_SMRNASEQ:SMRNASEQ:MULTIQC terminated with an error exit status (1)

Command executed:

multiqc -f .

cat <<-END_VERSIONS > versions.yml
MULTIQC:
    multiqc: $( multiqc --version | sed -e "s/multiqc, version //g" )
END_VERSIONS

Command exit status: 1

Command output: | searching | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 611/611

Command error:

| multiqc | Search path : /work/02/7fb42af75c84a2814f48b4c99613eb
| custom_content | nf-core-smrnaseq-summary: Found 1 sample (html)
| custom_content | software_versions: Found 1 sample (html)

Please copy this log and report it at
https://github.com/ewels/MultiQC/issues
Please attach a file that triggers the error. The last file found was:
./mirtrace-stats-mirna-complexity.tsv

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/multiqc/plots/bargraph.py", l
    return get_template_mod().bargraph(plotdata, plotsamples, pconfig)
AttributeError: module 'multiqc.templates.default' has no attribute 'bargrap

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/multiqc/multiqc.py", line 624
    output = mod()
  File "/usr/local/lib/python3.9/site-packages/multiqc/modules/mirtrace/mirt
    plot=self.mirtrace_contamination_check(),
  File "/usr/local/lib/python3.9/site-packages/multiqc/modules/mirtrace/mirt
    return bargraph.plot(self.contamination_data, keys, config)
  File "/usr/local/lib/python3.9/site-packages/multiqc/plots/bargraph.py", l
    matplotlib_bargraph(plotdata, plotsamples, pconfig)
  File "/usr/local/lib/python3.9/site-packages/multiqc/plots/bargraph.py", l
    axes.barh(
  File "/usr/local/lib/python3.9/site-packages/matplotlib/axes/_axes.py", li
    patches = self.bar(x=left, height=height, width=width, bottom=y,
  File "/usr/local/lib/python3.9/site-packages/matplotlib/__init__.py", line
    return func(ax, *map(sanitize_sequence, args), **kwargs)
  File "/usr/local/lib/python3.9/site-packages/matplotlib/axes/_axes.py", li
    color = itertools.chain(itertools.cycle(mcolors.to_rgba_array(color)),
  File "/usr/local/lib/python3.9/site-packages/matplotlib/colors.py", line 3
    raise ValueError("Using a string of single character colors as "
ValueError: Using a string of single character colors as a color sequence is