nanoporetech / pipeline-transcriptome-de

Pipeline for differential gene expression (DGE) and differential transcript usage (DTU) analysis using long reads

Running the pipeline with only two samples #6

Closed drc111 closed 5 years ago

drc111 commented 5 years ago

I am running the pipeline with only two samples (one per condition), but when I reach the R script part of the Snakefile I get the following error (probably because R is missing the additional inputs from the samples I removed):

"Finished job 2.
9 of 12 steps (75%) done
Loading counts, conditions and parameters.
Loading annotation database.
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning message:
In .get_cds_IDX(type, phase) :
  The "phase" metadata column contains non-NA values for features of type stop_codon. This information was ignored.
'select()' returned 1:many mapping between keys and columns
Filtering counts using DRIMSeq.
Building model matrix.
Sum transcript counts into gene counts.
Warning message:
funs() is soft deprecated as of dplyr 0.8.0
please use list() instead

Before:

funs(name = f(.))

After:

list(name = ~ f(.))
This warning is displayed once per session.
Running differential gene expression analysis using edgeR.
Warning message:
In estimateDisp.default(y = y$counts, design = design, group = group, :
  No residual df: setting dispersion to NA
Error in glmFit.default(y, design = design, dispersion = dispersion, offset = offset, :
  dispersion must be numeric
Calls: glmQLFit ... glmQLFit -> glmQLFit.default -> glmFit -> glmFit.default
Execution halted
Error in job de_analysis while creating output files de_analysis/results_dge.tsv, de_analysis/results_dge.pdf, de_analysis/results_dtu_gene.tsv, de_analysis/results_dtu_transcript.tsv, de_analysis/results_dtu_stageR.tsv, merged/all_counts_filtered.tsv, merged/all_gene_counts.tsv.
RuleException:
CalledProcessError in line 137 of /home/nanopore/tests/3/Snakefile:
Command ' /home/nanopore/tests/3/scripts/de_analysis.R ' returned non-zero exit status 1.
File "/home/nanopore/tests/3/Snakefile", line 137, in __rule_de_analysis
File "/home/nanopore/src/miniconda3/envs/pipeline2/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Removing output files of failed job de_analysis since they might be corrupted: merged/all_counts_filtered.tsv, merged/all_gene_counts.tsv
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message"

Is there any way to adjust the R script to take into account that I am running with two samples, one per group (control/treated)?

bsipos commented 5 years ago

Unfortunately, the underlying methods require at least two biological replicates per condition to work (and many more are recommended). So there is no way to run the pipeline with just one replicate per condition.
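
For anyone hitting the same error: the "No residual df" warning in the log above is the key line. The sketch below is only an illustration of the arithmetic, not code from the pipeline. It assumes a simple one-factor design (one model coefficient per condition level), which is my reading of a control/treated comparison rather than something confirmed from `de_analysis.R`. The function names `residual_df` and `check_replicates` are hypothetical, and the default of two replicates per condition is taken from the reply above.

```python
from collections import Counter


def residual_df(conditions):
    """Residual degrees of freedom for an ASSUMED one-factor design.

    Samples minus model coefficients, with one coefficient per
    distinct condition level. Hypothetical helper, not pipeline code.
    """
    n_samples = len(conditions)
    n_coefficients = len(set(conditions))
    return n_samples - n_coefficients


def check_replicates(conditions, min_reps=2):
    """Raise ValueError if any condition has fewer than min_reps samples.

    min_reps=2 mirrors the 'at least two biological replicates'
    requirement stated in the thread. Hypothetical helper.
    """
    counts = Counter(conditions)
    too_few = {cond: n for cond, n in counts.items() if n < min_reps}
    if too_few:
        raise ValueError(
            f"Conditions with fewer than {min_reps} replicates: {too_few}"
        )
    return counts


if __name__ == "__main__":
    # One sample per condition: nothing is left over to estimate dispersion.
    print(residual_df(["control", "treated"]))
    # Three samples per condition leaves residual degrees of freedom.
    print(residual_df(["control"] * 3 + ["treated"] * 3))
```

With one sample per condition, both samples are used up fitting the two condition means, so there is no information left from which to estimate variability between replicates. A check like `check_replicates` could be run on the sample sheet before starting the workflow so the failure shows up early instead of at step 9 of 12.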