nanoporetech / pipeline-transcriptome-de

Pipeline for differential gene expression (DGE) and differential transcript usage (DTU) analysis using long reads

syntax error: Input and output files have to be specified as strings or lists of strings. #33

Open peterthorpe5 opened 2 years ago

peterthorpe5 commented 2 years ago

Dear Nanoporetech,

I am having an issue running this pipeline. I have altered the config.yaml (pasted below). I read in another issue that full paths are required, so I added them, but I have removed identifying names here. (Also, can the pipeline take compressed fq files?) Line 15 of my config is the transcriptome: "/PATH/TO/analysis/GRCh38.primary_assembly.genome.fa" line. I can't see what is wrong with it. Sorry! Can you please help? Pete

I get the following error:

pipeline-transcriptome-de]$ snakemake --use-conda -j 24 all
SyntaxError: Input and output files have to be specified as strings or lists of strings.
  File "/PATH/analysis/pipeline-transcriptome-de/Snakefile", line 15, in
  File "/PATH/analysis/pipeline-transcriptome-de/snakelib/utils.snake", line 15, in

General pipeline parameters:

Name of the pipeline:

pipeline: "pipeline-transcriptome-de_phe"

ABSOLUTE path to directory holding the working directory:

workdir_top: "/PATH/TO/analysis/"

Results directory:

resdir: "results"

Repository URL:

repo: "https://github.com/nanoporetech/pipeline-transcriptome-de"

Pipeline-specific parameters:

Transcriptome fasta

transcriptome: "/PATH/TO/analysis/GRCh38.primary_assembly.genome.fa"

Annotation GFF/GTF

annotation: "/PATH/TO/analysis/gencode.v39.annotation.gff3"

Control samples

controlsamples:
    C1: "/PATH/TO/analysis/R1.fastq.gz"
    C2: "/PATH/TO/analysis/R2.fastq.gz"
    C3: "/PATH/TO/analysis/R3.fastq.gz"

Treated samples

treatedsamples:
    IR1: "/PATH/TO/analysis/R4.fastq.gz"
    IR2: "/PATH/TO/analysis/R5.fastq.gz"
    IR3: "/PATH/TO/analysis/R6.fastq.gz"

Minimap2 indexing options

minimap_index_opts: ""

Minimap2 mapping options

minimap2_opts: ""

Maximum secondary alignments

maximum_secondary: 100

Secondary score ratio (-p for minimap2)

secondary_score_ratio: 1.0

Salmon library type

salmon_libtype: "U"

Count filtering options - customize these according to your experimental design:

Genes expressed in minimum this many samples

min_samps_gene_expr: 3

Transcripts expressed in minimum this many samples

min_samps_feature_expr: 1

Minimum gene counts

min_gene_expr: 10

Minimum transcript counts

min_feature_expr: 3

Threads

threads: 24
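Since the error message complains about inputs that are not strings, one way to rule out the config itself is to check that every file entry parses as a plain string. The following is a minimal sketch, not part of the pipeline: the function name `find_non_string_paths` is hypothetical, the key names are taken from the config above, and the config is assumed to be already loaded into a Python dict (for example with PyYAML's `yaml.safe_load`).

```python
from typing import Any, Dict, List

# Keys from the config above that are expected to hold a single file path.
PATH_KEYS = ("transcriptome", "annotation")
# Keys from the config above that map sample names to file paths.
SAMPLE_KEYS = ("controlsamples", "treatedsamples")


def find_non_string_paths(config: Dict[str, Any]) -> List[str]:
    """Return the names of config entries whose value is not a plain string.

    Hypothetical helper for illustration only; it does not check that the
    files exist, only that each entry is a string.
    """
    problems = []
    for key in PATH_KEYS:
        if not isinstance(config.get(key), str):
            problems.append(key)
    for key in SAMPLE_KEYS:
        samples = config.get(key)
        if not isinstance(samples, dict):
            # The whole block is missing or was parsed as something else.
            problems.append(key)
            continue
        for name, path in samples.items():
            if not isinstance(path, str):
                problems.append(f"{key}.{name}")
    return problems


example = {
    "transcriptome": "/PATH/TO/analysis/GRCh38.primary_assembly.genome.fa",
    "annotation": "/PATH/TO/analysis/gencode.v39.annotation.gff3",
    "controlsamples": {"C1": "/PATH/TO/analysis/R1.fastq.gz"},
    "treatedsamples": {"IR1": "/PATH/TO/analysis/R4.fastq.gz"},
}
print(find_non_string_paths(example))
```

An empty list here would suggest the config values are well-formed strings and that the error originates elsewhere, such as in the Snakefile or the files it includes.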

peterthorpe5 commented 2 years ago

Is it possible for someone to give me some guidance here?

EnJun-Yang commented 2 years ago

Hi,

I've encountered the same issue when running pipeline-transcriptome-de, though I should note that I'm using the paired branch of the pipeline (https://github.com/nanoporetech/pipeline-transcriptome-de/tree/paired_dge_dtu).

In my case, the problem appears to come from a rule defined in snakelib/utils.snake, which prevents the pipeline from recognising the input files.

I've tried commenting out the line in the Snakefile that includes that file (line 15, include: "snakelib/utils.snake"), and so far the pipeline appears to be running (it is now on the mapping step). Happy to discuss more, and I would love to hear from the ONT side about what might need to be updated.

EnJun

PS: I also replied with something similar on the ONT community forums.
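The workaround described in this comment (commenting out the include line in the Snakefile) can be sketched as a small Python helper. This is an illustration under stated assumptions, not an official fix: the function name `comment_out_include` is hypothetical, the include line's spelling is taken from the comment above, and the sample Snakefile text below is invented for the example. Note that anything defined in snakelib/utils.snake is no longer loaded once the line is disabled.

```python
def comment_out_include(
    snakefile_text: str,
    target: str = 'include: "snakelib/utils.snake"',
) -> str:
    """Return the Snakefile text with the matching include line commented out.

    Hypothetical helper: only lines that match `target` exactly (ignoring
    surrounding whitespace) are changed; everything else is left as-is.
    """
    out = []
    for line in snakefile_text.splitlines(keepends=True):
        if line.strip() == target:
            stripped = line.lstrip()
            indent = line[: len(line) - len(stripped)]
            # Prefix with '#' so Snakemake skips the include.
            out.append(f"{indent}# {stripped}")
        else:
            out.append(line)
    return "".join(out)


# Invented Snakefile fragment, for illustration only.
sample = (
    'configfile: "config.yml"\n'
    'include: "snakelib/utils.snake"\n'
    "rule all:\n"
)
print(comment_out_include(sample))
```

In practice the same edit can be made by hand in a text editor; the sketch only makes explicit which line is changed and that nothing else in the Snakefile is touched.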

peterthorpe5 commented 2 years ago

@EnJun-Yang thank you for your reply :)