zhxiaokang / RASflow

RNA-Seq analysis workflow
MIT License

DEA error #28

Closed Ben7124 closed 2 years ago

Ben7124 commented 2 years ago

Hello,

I am getting an error in the DEA step and in the subsequent visualization. The message points to a problem with a data frame, but I am not sure how to fix it. Any help is appreciated. Thank you.

rule DEA:
    input: output/analysis/genome/dea/countGroup/Treatment_1_gene_count.tsv, output/analysis/genome/dea/countGroup/Untreated_gene_count.tsv
    output: output/analysis/genome/dea/countGroup/Untreated_gene_norm.tsv, output/analysis/genome/dea/countGroup/Treatment_1_gene_norm.tsv, output/analysis/genome/dea/DEA/dea_Untreated_Treatment_1.tsv, output/analysis/genome/dea/DEA/deg_Untreated_Treatment_1.tsv
    jobid: 1

Loading required package: limma
Loading required package: S4Vectors
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB

The following object is masked from ‘package:limma’:

plotMA

The following objects are masked from ‘package:stats’:

IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

anyDuplicated, append, as.data.frame, basename, cbind, colMeans,
colnames, colSums, dirname, do.call, duplicated, eval, evalq,
Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply,
lengths, Map, mapply, match, mget, order, paste, pmax, pmax.int,
pmin, pmin.int, Position, rank, rbind, Reduce, rowMeans, rownames,
rowSums, sapply, setdiff, sort, table, tapply, union, unique,
unsplit, which, which.max, which.min

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

expand.grid

Loading required package: IRanges
Loading required package: GenomicRanges
Loading required package: GenomeInfoDb
Loading required package: SummarizedExperiment
Loading required package: Biobase
Welcome to Bioconductor

Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: DelayedArray
Loading required package: matrixStats

Attaching package: ‘matrixStats’

The following objects are masked from ‘package:Biobase’:

anyMissing, rowMedians

Loading required package: BiocParallel

Attaching package: ‘DelayedArray’

The following objects are masked from ‘package:matrixStats’:

colMaxs, colMins, colRanges, rowMaxs, rowMins, rowRanges

The following objects are masked from ‘package:base’:

aperm, apply

Error in $<-.data.frame(*tmp*, "subject", value = integer(0)) :
  replacement has 0 rows, data has 4
Calls: DEA -> $<- -> $<-.data.frame
Execution halted
[Sun Jul 3 02:54:56 2022]
Error in rule DEA:
    jobid: 1
    output: output/analysis/genome/dea/countGroup/Untreated_gene_norm.tsv, output/analysis/genome/dea/countGroup/Treatment_1_gene_norm.tsv, output/analysis/genome/dea/DEA/dea_Untreated_Treatment_1.tsv, output/analysis/genome/dea/DEA/deg_Untreated_Treatment_1.tsv

RuleException:
CalledProcessError in line 38 of /root/RASflow/workflow/dea_genome.rules:
Command ' set -euo pipefail; Rscript scripts/dea_genome.R ' returned non-zero exit status 1.
  File "/root/RASflow/workflow/dea_genome.rules", line 38, in __rule_DEA
  File "/root/miniconda3/envs/rasflow/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Removing output files of failed job DEA since they might be corrupted:
output/analysis/genome/dea/countGroup/Untreated_gene_norm.tsv, output/analysis/genome/dea/countGroup/Treatment_1_gene_norm.tsv
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /root/RASflow/.snakemake/log/2022-07-03T025450.360606.snakemake.log
DEA is done!
Start visualization of DEA results!
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
    count  jobs
    1      end
    1      plot
    2

[Sun Jul 3 02:54:56 2022]
rule plot:
    input: output/analysis/genome/dea/countGroup, output/analysis/genome/dea/DEA
    output: output/analysis/genome/dea/visualization/volcano_plot_Untreated_Treatment_1.pdf, output/analysis/genome/dea/visualization/heatmap_Untreated_Treatment_1.pdf
    jobid: 1

Loading required package: plotscale
hash-3.0.1 provided by Decision Patterns

Loading required package: GenomicFeatures
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

anyDuplicated, append, as.data.frame, basename, cbind, colMeans,
colnames, colSums, dirname, do.call, duplicated, eval, evalq,
Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply,
lengths, Map, mapply, match, mget, order, paste, pmax, pmax.int,
pmin, pmin.int, Position, rank, rbind, Reduce, rowMeans, rownames,
rowSums, sapply, setdiff, sort, table, tapply, union, unique,
unsplit, which, which.max, which.min

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:hash’:

values, values<-

The following object is masked from ‘package:base’:

expand.grid

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor

Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.

Attaching package: ‘AnnotationDbi’

The following objects are masked from ‘package:hash’:

keys, keys<-

Loading required package: ggplot2
Loading required package: ggrepel
Error in file(file, "rt") : cannot open the connection
Calls: plot.volcano.heatmap -> read.table -> file
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'output/analysis/genome/dea/DEA/dea_Untreated_Treatment_1.tsv': No such file or directory
Execution halted
[Sun Jul 3 02:55:01 2022]
Error in rule plot:
    jobid: 1
    output: output/analysis/genome/dea/visualization/volcano_plot_Untreated_Treatment_1.pdf, output/analysis/genome/dea/visualization/heatmap_Untreated_Treatment_1.pdf

RuleException:
CalledProcessError in line 53 of /root/RASflow/workflow/visualize.rules:
Command ' set -euo pipefail; Rscript scripts/visualize.R output/analysis/genome/dea/countGroup output/analysis/genome/dea/DEA output/analysis/genome/dea/visualization ' returned non-zero exit status 1.
  File "/root/RASflow/workflow/visualize.rules", line 53, in __rule_plot
  File "/root/miniconda3/envs/rasflow/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /root/RASflow/.snakemake/log/2022-07-03T025456.705370.snakemake.log
Visualization is done!
RASflow is done!

Ben7124 commented 2 years ago

Hello, I think I solved the issue. The metadata file was missing the subject column. After I re-added it, the DEA step ran and the PDF plots were produced.
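
For anyone who hits the same "replacement has 0 rows, data has 4" message: it is consistent with the script reading a `subject` column that is not in the metadata table, since in R `df$missing_column` evaluates to `NULL`. A quick check of the header before starting the workflow can catch this early. The sketch below is only an illustration, not part of RASflow. The helper name `missing_metadata_columns` is hypothetical, the file is assumed to be tab-separated, and of the column names only `subject` is confirmed by this thread (`sample` and `group` are assumptions).

```python
import csv
import io

# Only "subject" is confirmed by this thread; "sample" and "group" are assumed names.
REQUIRED_COLUMNS = ("sample", "group", "subject")


def missing_metadata_columns(handle, required=REQUIRED_COLUMNS, delimiter="\t"):
    """Return the required column names that are absent from the header row."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader, [])  # empty file -> empty header -> everything is missing
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Hypothetical metadata with four samples and no "subject" column.
broken = io.StringIO(
    "sample\tgroup\n"
    "S1\tUntreated\n"
    "S2\tUntreated\n"
    "S3\tTreatment_1\n"
    "S4\tTreatment_1\n"
)
print(missing_metadata_columns(broken))  # ['subject']
```

To check a real file, open it and pass the handle, for example `with open(path, newline="") as fh: missing_metadata_columns(fh)`, where `path` points to your own metadata file. An empty list means all expected columns are present.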