nf-core / smrnaseq

A small-RNA sequencing analysis pipeline
https://nf-co.re/smrnaseq
MIT License

Error in 'NFCORE_SMRNASEQ:MIRTRACE' for paired data #421

Closed ranxm2 closed 2 months ago

ranxm2 commented 2 months ago

Description of the bug

I encountered an error while running the nf-core/smrnaseq pipeline. The process NFCORE_SMRNASEQ:MIRTRACE:MIRTRACE_RUN (1) failed with an error message indicating an invalid path value type: java.util.ArrayList. The error appears when the process receives a list of two FASTQ files (paired reads) where a single path is expected.

Command used and terminal output

# command used
nextflow run nf-core/smrnaseq \
  -profile conda \
  --input test.csv \
  --genome 'GRCh38' \
  --mirtrace_species 'hsa' \
  --protocol 'illumina' \
  --outdir test

# test.csv
sample,fastq_1,fastq_2
T1-1,00_fastq/T1-1_R1_001.fastq.gz,00_fastq/T1-1_R2_001.fastq.gz
T1-2,00_fastq/T1-2_R1_001.fastq.gz,00_fastq/T1-2_R2_001.fastq.gz

# output
-[nf-core/smrnaseq] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_SMRNASEQ:MIRTRACE:MIRTRACE_RUN (1)'

Caused by:
  Not a valid path value type: java.util.ArrayList ([/panfs/compbio/users/xran2/wen/22q/smallRNA/work/56/75a1d860230508a2242f61e37356c4/T1-2_1.fastp.fastq.gz, /panfs/compbio/users/xran2/wen/22q/smallRNA/work/56/75a1d860230508a2242f61e37356c4/T1-2_2.fastp.fastq.gz])

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

 -- Check '.nextflow.log' file for details

Relevant files

No response

System information

Linux

apeltzer commented 2 months ago

Can you please run with -r dev? This should be fixed on dev. If not, let us know.

ranxm2 commented 2 months ago

> Can you please run with -r dev? This should be fixed on dev. If not, let us know.

Hi Apeltzer,

I just ran with '-r dev', but it seems to have a new problem:

$ nextflow run nf-core/smrnaseq \
  -profile singularity \
  -r dev \
  --input test.csv \
  --genome 'GRCh38' \
  --mirtrace_species 'hsa' \
  --protocol 'illumina' \
  --outdir result

N E X T F L O W ~ version 24.04.4

Project nf-core/smrnaseq contains uncommitted changes -- Cannot switch to revision: dev

Could you specify which version I should use?

apeltzer commented 2 months ago

Please pull the pipeline again, e.g. nextflow pull nf-core/smrnaseq, and then run again. If that doesn't help, delete the cache from your home directory with rm -rf ~/.nextflow/assets/nf-core/smrnaseq (or wherever you have your cache) and it should run.

ranxm2 commented 2 months ago

> Please pull the pipeline again, e.g. nextflow pull nf-core/smrnaseq, and then run again. If that doesn't help, delete the cache from your home directory with rm -rf ~/.nextflow/assets/nf-core/smrnaseq (or wherever you have your cache) and it should run.

I tried to rerun the pipeline, but it still does not work. It seems to fail because mirTRACE can't be applied to paired-end FASTQ files.

Can the current version of the pipeline be applied to paired-end FASTQ files?

apeltzer commented 2 months ago

The pipeline will only use the first read; the second doesn't add anything, as the first will already contain the full smRNA species. Please only use fastq_1 👍

I'll open an issue so that this gets done automatically if someone specifies paired-end data.
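
As an illustration of the "only use fastq_1" advice, here is a minimal sketch that reduces the paired-end samplesheet from the report to a single-end one by keeping only the sample and fastq_1 columns. The helper name to_single_end is hypothetical, and the sketch assumes a plain CSV whose fields contain no quoted commas.

```shell
# Hypothetical helper: print a single-end version of a paired-end samplesheet
# by keeping only the first two columns (sample, fastq_1).
# Assumes a plain CSV with no quoted fields containing commas.
to_single_end() {
  cut -d, -f1,2 "$1"
}

# Demo on a scratch copy of the samplesheet from the report.
demo=$(mktemp)
cat > "$demo" <<'EOF'
sample,fastq_1,fastq_2
T1-1,00_fastq/T1-1_R1_001.fastq.gz,00_fastq/T1-1_R2_001.fastq.gz
T1-2,00_fastq/T1-2_R1_001.fastq.gz,00_fastq/T1-2_R2_001.fastq.gz
EOF
to_single_end "$demo"
rm -f "$demo"
```

With the real file, something like to_single_end test.csv > test_single.csv would write the reduced sheet (test_single.csv is an assumed name), which could then be passed to the pipeline via --input.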

lpantano commented 2 months ago

We also have better docs in the dev version now.

ranxm2 commented 2 months ago

> The pipeline will only use the first read; the second doesn't add anything, as the first will already contain the full smRNA species. Please only use fastq_1 👍
>
> I'll open an issue so that this gets done automatically if someone specifies paired-end data.

Got it. I tried the dev version, but I ran into a new problem:

Creating env using conda: bioconda::seqkit=2.8.2 [cache /panfs/compbio/users/xran2/wen/22q/smallRNA/work/conda/env-35f9e542ae4b0d84e99d029550579986]
WARN: Access to undefined parameter `monochromeLogs` -- Initialise it to a default value eg. `params.monochromeLogs = some_value`
ERROR ~ The AWS Access Key Id you provided does not exist in our records. (Service: Amazon S3; Status Code: 403; Error Code: InvalidAccessKeyId; Request ID: TKVK80AF5JQ0TM3F; S3 Extended Request ID: gAOp9Gei4H/cml3+nN8e+LwEphL6Py/Y0PXNRrzEgmsgek+3FZW6fwNiBIfvGyb6x4le6+KI1DzW7MwqA2AAiBsDnDGIny9F; Proxy: null)

 -- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting

 -- Check '.nextflow.log' file for details
-[nf-core/smrnaseq] Pipeline completed with errors-
WARN: Killing running tasks (2)

lpantano commented 2 months ago

Can you share the full .nextflow.log? That seems to be related to accessing the genome information. It happens to me on some computers when I am in the US region. I normally download the genome file and provide it myself instead of using the genome parameters. The full log may tell us more.

ranxm2 commented 2 months ago

Hi, I just tried running the previous version and the problem was solved. Here is my command:

nextflow run nf-core/smrnaseq \
  -r 2.3.0 \
  -profile singularity \
  --input fastq_samples.csv \
  --genome 'GRCh38' \
  --mirtrace_species 'hsa' \
  --protocol 'illumina' \
  --outdir result