CCBR / RENEE

A comprehensive quality-control and quantification RNA-seq pipeline
https://CCBR.github.io/RENEE/
MIT License

rNA report VOOM error with test data #80

Closed kopardev closed 10 months ago

kopardev commented 10 months ago
[Wed Jan  3 11:39:49 2024]
rule rna_report:
    input: /data/Ziegelbauer_lab/circRNADetection/tmp_kopardevn/RENEE_testing_240103/DEG_ALL/RSEM.genes.expected_counts.all_samples.reformatted.tsv, /data/Ziegelbauer_lab/circRNADetection/tmp_kopardevn/RENEE_testing_240103/DEG_ALL/combined_TIN.tsv, /data/Ziegelbauer_lab/circRNADetection/tmp_kopardevn/RENEE_testing_240103/Reports/multiqc_matrix.tsv
    output: /data/Ziegelbauer_lab/circRNADetection/tmp_kopardevn/RENEE_testing_240103/Reports/RNA_Report.html
    jobid: 0
    reason: Missing output files: /data/Ziegelbauer_lab/circRNADetection/tmp_kopardevn/RENEE_testing_240103/Reports/RNA_Report.html
    resources: mem_mb=1000, mem_mib=954, disk_mb=1000, disk_mib=954, tmpdir=/tmp

    # Avoids inheriting $R_LIBS_SITE
    # from local env variables
    R_LIBS_SITE=/usr/local/lib/R/site-library
    # Generate RNA QC Dashboard
    workflow/scripts/rNA.R         -m workflow/scripts/rNA_flowcells.Rmd         -r /data/Ziegelbauer_lab/circRNADetection/tmp_kopardevn/RENEE_testing_240103/DEG_ALL/RSEM.genes.expected_counts.all_samples.reformatted.tsv         -t /data/Ziegelbauer_lab/circRNADetection/tmp_kopardevn/RENEE_testing_240103/DEG_ALL/combined_TIN.tsv         -q /data/Ziegelbauer_lab/circRNADetection/tmp_kopardevn/RENEE_testing_240103/Reports/multiqc_matrix.tsv         -o /data/Ziegelbauer_lab/circRNADetection/tmp_kopardevn/RENEE_testing_240103/Reports         -f RNA_Report.html

Activating singularity image /data/CCBR_Pipeliner/SIFS/ccbr_rna_v0.0.1.sif

processing file: rNA_flowcells.Rmd
Quitting from lines 31-88 (rNA_flowcells.Rmd)
Error in voom(deg, normalize = "quantile", plot = TRUE, save.plot = TRUE) :
  Need at least two genes to fit a mean-variance trend
Calls: <Anonymous> ... withCallingHandlers -> withVisible -> eval -> eval -> voom
In addition: Warning message:
In filterByExpr.DGEList(deg) :
  All samples appear to belong to the same group.
Execution halted
[Wed Jan  3 11:39:57 2024]
Error in rule rna_report:

kelly-sovacool commented 10 months ago

I opened this up in an interactive RStudio session on Biowulf using the dataset in /scratch/RENEE_testing_240103. Only one gene passes edgeR's filterByExpr filter, which causes the voom error, since voom needs at least two genes to fit a mean-variance trend. However, on a different run of the test data, six genes pass the filter. So for some reason RSEM seems to be giving different expected counts on the same dataset.

works - RENEE git commit hash 6987849

This hash is from an unmerged branch that changed the MultiQC container just after v2.5.8, so RSEM should have behaved the same as it did in v2.5.8.

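library(magrittr)  # provides the %>% pipe
# Load the RSEM expected counts; the first column (gene IDs) becomes the row names
rawcounts <- read.table(file = '/data/sovacoolkl/renee_test/DEG_ALL/RSEM.genes.expected_counts.all_samples.reformatted.tsv', sep = '\t', header = TRUE, row.names = 1, quote = "")
deg <- edgeR::DGEList(counts = rawcounts)
# Logical vector: TRUE for genes with enough counts to keep
keep_genes <- edgeR::filterByExpr(deg)
# Print only the genes that pass the filter
keep_genes %>% Filter(function(x) isTRUE(x), .) %>% dput()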
c(`ENSG00000185658.13|BRWD1` = TRUE, `ENSG00000157540.21|DYRK1A` = TRUE, 
`ENSG00000160294.11|MCM3AP` = TRUE, `ENSG00000141959.17|PFKL` = TRUE, 
`ENSG00000182670.13|TTC3` = TRUE, `ENSG00000160201.11|U2AF1` = TRUE
)

fails - RENEE git commit hash 3c747aa

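library(magrittr)  # provides the %>% pipe
# Load the RSEM expected counts; the first column (gene IDs) becomes the row names
rawcounts <- read.table(file = '/data/sovacoolkl/RENEE_testing_240103/DEG_ALL/RSEM.genes.expected_counts.all_samples.reformatted.tsv', sep = '\t', header = TRUE, row.names = 1, quote = "")
deg <- edgeR::DGEList(counts = rawcounts)
# Logical vector: TRUE for genes with enough counts to keep
keep_genes <- edgeR::filterByExpr(deg)
# Print only the genes that pass the filter
keep_genes %>% Filter(function(x) isTRUE(x), .) %>% dput()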
c(`ENSMUSG00000106106.2|CT010467.1` = TRUE)
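To see how the two runs differ beyond the filter result, the gene IDs in the two counts tables can be compared directly. This is a minimal sketch, assuming both files are tab-separated with a header row and gene IDs in the first column; `diff_gene_ids` is a hypothetical helper, not part of RENEE.

```shell
# List gene IDs that appear in only one of two counts tables.
# Assumption: tab-separated files, one header row, gene IDs in column 1.
diff_gene_ids() {
  ids_a=$(mktemp)
  ids_b=$(mktemp)
  # Drop the header, keep the ID column, and sort so comm can compare them
  tail -n +2 "$1" | cut -f1 | sort -u > "$ids_a"
  tail -n +2 "$2" | cut -f1 | sort -u > "$ids_b"
  # -3 suppresses IDs common to both files
  comm -3 "$ids_a" "$ids_b"
  rm -f "$ids_a" "$ids_b"
}

# Usage: diff_gene_ids run1/counts.tsv run2/counts.tsv
```

IDs found only in the first file print flush left; IDs found only in the second file are indented by a tab.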
kelly-sovacool commented 10 months ago

I used this to launch a test run with v2.5.10:

module load ccbrpipeliner
renee run \
  --input /data/CCBR_Pipeliner/Pipelines/RENEE/develop/.tests/*.R1.fastq.gz \
  --output /data/sovacoolkl/renee_test_v2.5.10 \
  --genome hg38_30 \
  --mode slurm \
  --sif-cache /data/CCBR_Pipeliner/SIFS

kelly-sovacool commented 10 months ago

@kopardev is launching another test run with --genome hg38_30; he previously used the default (mouse) genome. This is almost certainly the cause of the issue, but let's wait to close it until Vishal's new test run succeeds.
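One quick way to confirm a genome mismatch like this is to tally the Ensembl ID prefixes in the counts table: human gene IDs start with ENSG and mouse gene IDs start with ENSMUSG. The sketch below assumes a tab-separated file with a header row and IDs such as `ENSG00000185658.13|BRWD1` in the first column; the function name `count_gene_prefixes` is made up for illustration and is not part of RENEE.

```shell
# Tally the alphabetic prefix of each gene ID in column 1 of a counts table.
# Assumption: tab-separated file, one header row, IDs like "ENSG00000185658.13|BRWD1".
count_gene_prefixes() {
  tail -n +2 "$1" \
    | cut -f1 \
    | sed -E 's/^([A-Za-z]+)[0-9].*/\1/' \
    | sort \
    | uniq -c \
    | awk '{print $2 "\t" $1}'   # print as prefix<TAB>count
}

# Usage: count_gene_prefixes DEG_ALL/RSEM.genes.expected_counts.all_samples.reformatted.tsv
```

A table made up entirely of ENSMUSG rows means the run was quantified against a mouse annotation.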

kelly-sovacool commented 10 months ago

@kopardev did your new test run succeed?

kelly-sovacool commented 10 months ago

Vishal's test run succeeded.