I am running the pipeline with only two samples (one per condition), but when I get to the R script part of the Snakefile I get the following error (probably because R is missing the inputs from the samples I removed):
"Finished job 2.
9 of 12 steps (75%) done
Loading counts, conditions and parameters.
Loading annotation database.
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning message:
In .get_cds_IDX(type, phase) :
The "phase" metadata column contains non-NA values for features of type
stop_codon. This information was ignored.
'select()' returned 1:many mapping between keys and columns
Filtering counts using DRIMSeq.
Building model matrix.
Sum transcript counts into gene counts.
Warning message:
funs() is soft deprecated as of dplyr 0.8.0
please use list() instead
Before:
funs(name = f(.))
After:
list(name = ~ f(.))
This warning is displayed once per session.
Running differential gene expression analysis using edgeR.
Warning message:
In estimateDisp.default(y = y$counts, design = design, group = group, :
No residual df: setting dispersion to NA
Error in glmFit.default(y, design = design, dispersion = dispersion, offset = offset, :
dispersion must be numeric
Calls: glmQLFit ... glmQLFit -> glmQLFit.default -> glmFit -> glmFit.default
Execution halted
Error in job de_analysis while creating output files de_analysis/results_dge.tsv, de_analysis/results_dge.pdf, de_analysis/results_dtu_gene.tsv, de_analysis/results_dtu_transcript.tsv, de_analysis/results_dtu_stageR.tsv, merged/all_counts_filtered.tsv, merged/all_gene_counts.tsv.
RuleException:
CalledProcessError in line 137 of /home/nanopore/tests/3/Snakefile:
Command '
/home/nanopore/tests/3/scripts/de_analysis.R
' returned non-zero exit status 1.
File "/home/nanopore/tests/3/Snakefile", line 137, in __rule_de_analysis
File "/home/nanopore/src/miniconda3/envs/pipeline2/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Removing output files of failed job de_analysis since they might be corrupted:
merged/all_counts_filtered.tsv, merged/all_gene_counts.tsv
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message"
Is there any way to adjust the R script to take into account that I am running with two samples, one per group (control/treated)?
Unfortunately, the underlying methods require at least two biological replicates per condition to work (and many more are recommended), so there is no way to run the pipeline with just one replicate per condition.
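To make the reason concrete, here is a minimal sketch of the arithmetic behind the "No residual df" warning in the log. It assumes a simple one-factor design with one coefficient per condition. The function names `residual_df` and `conditions_lacking_replicates` are hypothetical helpers for illustration, not part of the pipeline or of edgeR. With two samples and two conditions there are no degrees of freedom left to estimate dispersion, which is consistent with `estimateDisp` setting the dispersion to NA and `glmQLFit` then failing.

```python
from collections import Counter


def residual_df(conditions):
    """Residual degrees of freedom for a one-factor design.

    Assumes one model coefficient per distinct condition (a full-rank
    one-factor design): residual df = samples - conditions.
    """
    return len(conditions) - len(set(conditions))


def conditions_lacking_replicates(conditions, min_reps=2):
    """Return {condition: sample count} for conditions below min_reps."""
    counts = Counter(conditions)
    return {cond: n for cond, n in counts.items() if n < min_reps}


# The setup from the question: one sample per condition.
one_per_group = ["control", "treated"]
print(residual_df(one_per_group))                    # 0
print(conditions_lacking_replicates(one_per_group))  # {'control': 1, 'treated': 1}

# Hypothetical layout with two replicates per condition.
two_per_group = ["control", "control", "treated", "treated"]
print(residual_df(two_per_group))                    # 2
print(conditions_lacking_replicates(two_per_group))  # {}
```

A check like this could be run on the sample sheet before starting the pipeline, so the run fails early instead of at the differential expression step.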