TobiTekath / DTUrtle

Perform differential transcript usage (DTU) analysis of bulk or single-cell RNA-seq data. See documentation at:
https://tobitekath.github.io/DTUrtle
GNU General Public License v3.0
17 stars 3 forks source link

Issues in plot_transcripts_view #2

Closed lijinw closed 2 years ago

lijinw commented 2 years ago

Hi, I have a issue in plotting transcripts view, it works fine when I plot any single gene, but I can't plot it together. I got isoform files from rsem. And I have finished the DTU analysis, and plot the barplot and heatmap, they all works perfect. I'm not sure if is a bug, or there is something wrong with my code. I would appreciate if you can give me some suggestions.

table_UT_sig <- plot_transcripts_view(dturtle = table_UT_sig, gtf = "gencode.vM25.annotation.gtf", genome = 'mm10', one_to_one = TRUE, savepath = "images_ut_transcript", add_to_table = "transcript_view")

Importing gtf file from disk.

Performing one to one mapping in gtf

Found gtf GRanges for 27 of 27 provided genes.

Fetching ideogram tracks ... Creating 27 plots: iteration: 1 Error: BiocParallel errors 1 remote errors, element index: 1 26 unevaluated and other errors first remote error: QuartzBitmap_Output - unable to open file 'images_ut_transcript/Acbd5_transcripts.png'

This the traceback traceback()

4: stop(.error_bplist(res)) 3: BiocParallel::bplapply(valid_genes, function(gene) { gene_gtf <- gtf[gtf@elementMetadata[[gtf_genes_column]] == gene, ] gene_info <- as.data.frame(gene_gtf[gene_gtf$type == "gene", ]) if (tested_transcripts_only %in% c("mixed", TRUE)) { tested_tx <- dturtle$FDR_table$txID[dturtle$FDR_table$geneID == gene & !is.na(dturtle$FDR_table$transcript)] if (length(tested_tx) == 0) { if (tested_transcripts_only == "mixed") { tested_tx <- dturtle$FDR_table$txID[dturtle$FDR_table$geneID == gene] } else { message("Skipping ", gene, " --- no transcripts to plot. You may want to adjust the tested_transcripts_only >parameter.") return() } } } else { tested_tx <- gene_gtf@elementMetadata[[gtf_tx_column]][gene_gtf$type == "transcript"] } gtf_trans <- gene_gtf[gene_gtf@elementMetadata[[gtf_tx_column]] %in% tested_tx & !gene_gtf$type %in% c("transcript", "gene")] if (length(gtf_trans) == 0) { message("Skipping ", gene, " --- no transcripts to plot. You may want to adjust the tested_transcripts_only >parameter.") return() } temp <- Gviz::seqlevels(gtf_trans) temp[!startsWith(temp, "chr")] <- paste0("chr", temp[!startsWith(temp, "chr")]) GenomeInfoDb::seqlevels(gtf_trans) <- temp track_list <- list() grtrack_list <- c() if (!is.null(genome)) { track_list <- append(track_list, ideoTracks[[gene_info$seqnames]]) } if (reduce_introns) { reduction_obj <- granges_reduce_introns(gtf_trans, reduce_introns_min_size) gtf_trans <- reduction_obj$granges } else { track_list <- append(track_list, Gviz::GenomeAxisTrack()) } tx_ranges <- as.data.frame(GenomicRanges::ranges(gtf_trans)) gtf_trans <- split(gtf_trans, gtf_trans@elementMetadata[[gtf_tx_column]]) grouped_mean_df <- get_diff(gene, dturtle) grouped_mean_df <- grouped_mean_df[tested_tx, , drop = FALSE] rownames(grouped_mean_df) <- tested_tx tested_tx <- rownames(grouped_mean_df)[order(abs(grouped_mean_df$diff), decreasing = TRUE)] for (tx_id in tested_tx) { gtf_tx <- gtf_trans[[tx_id]] if (all(c("CDS", "UTR") %in% unique(gtf_tx$type))) { gtf_tx <- gtf_tx[gtf_tx$type != "exon"] } grtrack <- Gviz::GeneRegionTrack(gtf_tx, transcript = gtf_tx$transcript_id, feature = gtf_tx$type, exon = gtf_tx$exon_id, gene = gtf_tx$gene_id, symbol = gtf_tx$transcript_name, transcriptAnnotation = "symbol", thinBoxFeature = c("UTR"), col = NULL, name = ifelse(tx_id %in% dturtle$sig_tx, " Sig.", ""), rotation.title = 0, background.title = ifelse(tx_id %in% dturtle$sig_tx, "orangered", "transparent"), cex.group = fontsize_vec[[3]], cex.title = fontsize_vec[[3]], cex.axis = fontsize_vec[[3]]) tx_fitted_mean <- grouped_mean_df[tx_id, "diff"] if (!is.na(tx_fitted_mean)) { anno_text_start <- ggplot2::unit(arrow_start, "npc") grobs <- grid::grobTree(grid::textGrob(label = ifelse(tx_fitted_mean > 0, intToUtf8(11014), intToUtf8(11015)), name = "arrow", x = anno_text_start, gp = grid::gpar(fontsize = ceiling(fontsize_vec[[1]] 1.5), col = ifelse(tx_fitted_mean > 0, arrow_colors[[1]], arrow_colors[[2]]))), grid::textGrob(label = paste0(" ", scales::percent(tx_fitted_mean, accuracy = 0.01)), name = "text", x = 2 grid::grobWidth("arrow") + anno_text_start, gp = grid::gpar(fontsize = fontsize_vec[[1]]))) track_annotation <- Gviz::CustomTrack(plottingFunction = function(GdObject, prepare, ...) { if (!prepare) grid::grid.draw(GdObject@variables$grobs) return(invisible(GdObject)) }, variables = list(grobs = grobs)) overlay <- Gviz::OverlayTrack(trackList = list(grtrack, track_annotation)) } else { overlay <- Gviz::OverlayTrack(trackList = list(grtrack)) } overlay@dp@pars <- utils::modifyList(overlay@dp@pars, overlay@trackList[[1]]@dp@pars) grtrack_list <- c(grtrack_list, overlay) } if (reduce_introns) { new_intron_starts <- GenomicRanges::start(reduction_obj$reduced_regions) - cumsum(c(0, GenomicRanges::width(reduction_obj$reduced_regions)[-length(reduction_obj$reduced_regions)])) + cumsum(c(0, reduction_obj$reduced_regions$new_width[-length(reduction_obj$reduced_regions)])) if (length(new_intron_starts) > 0) { grtrack_list <- Gviz::HighlightTrack(trackList = grtrack_list, start = new_intron_starts, width = reduction_obj$reduced_regions$new_width, chromosome = gtf_trans[[1]]@seqnames@values, fill = reduce_introns_fill, col = "white", inBackground = TRUE) } } extension_front <- (max(tx_ranges$end) - min(tx_ranges$start)) max(nchar(tested_tx)) extension_factors[[1]] if (!all(is.na(grouped_mean_df$diff))) { extension_back <- (max(tx_ranges$end) - min(tx_ranges$start)) extension_factors[[2]] } else { extension_back <- 0 } if (!is.null(savepath)) { filename <- file.path(savepath, paste0(make.names(gene), filename_ext)) if (endsWith(filename_ext, ".png")) { args <- list(width = 900, height = 700, filename = filename, res = 160) args <- utils::modifyList(args, list(...)) do.call(grDevices::png, c(args)) } else if (endsWith(filename_ext, ".pdf")) { args <- list(filename = filename, width = 9) args <- utils::modifyList(args, list(...)) if (capabilities("cairo")) { do.call(grDevices::cairo_pdf, c(args)) } else { do.call(grDevices::pdf, c(args)) } } else { args <- list(width = 900, height = 700, filename = filename, quality = 100, res = 160) args <- utils::modifyList(args, list(...)) do.call(grDevices::jpeg, c(args)) } } base_title <- paste0(gene_info$gene_name, ifelse(include_ID_in_title, paste0(" (", gene_info$gene_id, ")"), "")) p <- Gviz::plotTracks(append(track_list, grtrack_list), collapse = TRUE, from = min(tx_ranges$start), to = max(tx_ranges$end), extend.left = extension_front, extend.right = extension_back, title.width = if (any(tested_tx %in% dturtle$sig_tx)) NULL else 0, main = paste0(base_title, " --- ", levels(dturtle$group)[1], " vs. ", levels(dturtle$group)[2]), cex.main = fontsize_vec[[2]]) if (!is.null(savepath)) { grDevices::dev.off() return(args$filename) } else { return(p) } }, BPPARAM = BPPARAM) 2: BiocParallel::bplapply(valid_genes, function(gene) { gene_gtf <- gtf[gtf@elementMetadata[[gtf_genes_column]] == gene, ] gene_info <- as.data.frame(gene_gtf[gene_gtf$type == "gene", ]) if (tested_transcripts_only %in% c("mixed", TRUE)) { tested_tx <- dturtle$FDR_table$txID[dturtle$FDR_table$geneID == gene & !is.na(dturtle$FDR_table$transcript)] if (length(tested_tx) == 0) { if (tested_transcripts_only == "mixed") { tested_tx <- dturtle$FDR_table$txID[dturtle$FDR_table$geneID == gene] } else { message("Skipping ", gene, " --- no transcripts to plot. You may want to adjust the tested_transcripts_only parameter.") return() } } } else { tested_tx <- gene_gtf@elementMetadata[[gtf_tx_column]][gene_gtf$type == "transcript"] } gtf_trans <- gene_gtf[gene_gtf@elementMetadata[[gtf_tx_column]] %in% tested_tx & !gene_gtf$type %in% c("transcript", "gene")] if (length(gtf_trans) == 0) { message("Skipping ", gene, " --- no transcripts to plot. You may want to adjust the tested_transcripts_only parameter.") return() } temp <- Gviz::seqlevels(gtf_trans) temp[!startsWith(temp, "chr")] <- paste0("chr", temp[!startsWith(temp, "chr")]) GenomeInfoDb::seqlevels(gtf_trans) <- temp track_list <- list() grtrack_list <- c() if (!is.null(genome)) { track_list <- append(track_list, ideoTracks[[gene_info$seqnames]]) } if (reduce_introns) { reduction_obj <- granges_reduce_introns(gtf_trans, reduce_introns_min_size) gtf_trans <- reduction_obj$granges } else { track_list <- append(track_list, Gviz::GenomeAxisTrack()) } tx_ranges <- as.data.frame(GenomicRanges::ranges(gtf_trans)) gtf_trans <- split(gtf_trans, gtf_trans@elementMetadata[[gtf_tx_column]]) grouped_mean_df <- get_diff(gene, dturtle) grouped_mean_df <- grouped_mean_df[tested_tx, , drop = FALSE] rownames(grouped_mean_df) <- tested_tx tested_tx <- rownames(grouped_mean_df)[order(abs(grouped_mean_df$diff), decreasing = TRUE)] for (tx_id in tested_tx) { gtf_tx <- gtf_trans[[tx_id]] if (all(c("CDS", "UTR") %in% unique(gtf_tx$type))) { gtf_tx <- gtf_tx[gtf_tx$type != "exon"] } grtrack <- Gviz::GeneRegionTrack(gtf_tx, transcript = gtf_tx$transcript_id, feature = gtf_tx$type, exon = gtf_tx$exon_id, gene = gtf_tx$gene_id, symbol = gtf_tx$transcript_name, transcriptAnnotation = "symbol", thinBoxFeature = c("UTR"), col = NULL, name = ifelse(tx_id %in% dturtle$sig_tx, " Sig.", ""), rotation.title = 0, background.title = ifelse(tx_id %in% dturtle$sig_tx, "orangered", "transparent"), cex.group = fontsize_vec[[3]], cex.title = fontsize_vec[[3]], cex.axis = fontsize_vec[[3]]) tx_fitted_mean <- grouped_mean_df[tx_id, "diff"] if (!is.na(tx_fitted_mean)) { anno_text_start <- ggplot2::unit(arrow_start, "npc") grobs <- grid::grobTree(grid::textGrob(label = ifelse(tx_fitted_mean > 0, intToUtf8(11014), intToUtf8(11015)), name = "arrow", x = anno_text_start, gp = grid::gpar(fontsize = ceiling(fontsize_vec[[1]] 1.5), col = ifelse(tx_fitted_mean > 0, arrow_colors[[1]], arrow_colors[[2]]))), grid::textGrob(label = paste0(" ", scales::percent(tx_fitted_mean, accuracy = 0.01)), name = "text", x = 2 grid::grobWidth("arrow") + anno_text_start, gp = grid::gpar(fontsize = fontsize_vec[[1]]))) track_annotation <- Gviz::CustomTrack(plottingFunction = function(GdObject, prepare, ...) { if (!prepare) grid::grid.draw(GdObject@variables$grobs) return(invisible(GdObject)) }, variables = list(grobs = grobs)) overlay <- Gviz::OverlayTrack(trackList = list(grtrack, track_annotation)) } else { overlay <- Gviz::OverlayTrack(trackList = list(grtrack)) } overlay@dp@pars <- utils::modifyList(overlay@dp@pars, overlay@trackList[[1]]@dp@pars) grtrack_list <- c(grtrack_list, overlay) } if (reduce_introns) { new_intron_starts <- GenomicRanges::start(reduction_obj$reduced_regions) - cumsum(c(0, GenomicRanges::width(reduction_obj$reduced_regions)[-length(reduction_obj$reduced_regions)])) + cumsum(c(0, reduction_obj$reduced_regions$new_width[-length(reduction_obj$reduced_regions)])) if (length(new_intron_starts) > 0) { grtrack_list <- Gviz::HighlightTrack(trackList = grtrack_list, start = new_intron_starts, width = reduction_obj$reduced_regions$new_width, chromosome = gtf_trans[[1]]@seqnames@values, fill = reduce_introns_fill, col = "white", inBackground = TRUE) } } extension_front <- (max(tx_ranges$end) - min(tx_ranges$start)) max(nchar(tested_tx)) extension_factors[[1]] if (!all(is.na(grouped_mean_df$diff))) { extension_back <- (max(tx_ranges$end) - min(tx_ranges$start)) extension_factors[[2]] } else { extension_back <- 0 } if (!is.null(savepath)) { filename <- file.path(savepath, paste0(make.names(gene), filename_ext)) if (endsWith(filename_ext, ".png")) { args <- list(width = 900, height = 700, filename = filename, res = 160) args <- utils::modifyList(args, list(...)) do.call(grDevices::png, c(args)) } else if (endsWith(filename_ext, ".pdf")) { args <- list(filename = filename, width = 9) args <- utils::modifyList(args, list(...)) if (capabilities("cairo")) { do.call(grDevices::cairo_pdf, c(args)) } else { do.call(grDevices::pdf, c(args)) } } else { args <- list(width = 900, height = 700, filename = filename, quality = 100, res = 160) args <- utils::modifyList(args, list(...)) do.call(grDevices::jpeg, c(args)) } } base_title <- paste0(gene_info$gene_name, ifelse(include_ID_in_title, paste0(" (", gene_info$gene_id, ")"), "")) p <- Gviz::plotTracks(append(track_list, grtrack_list), collapse = TRUE, from = min(tx_ranges$start), to = max(tx_ranges$end), extend.left = extension_front, extend.right = extension_back, title.width = if (any(tested_tx %in% dturtle$sig_tx)) NULL else 0, main = paste0(base_title, " --- ", levels(dturtle$group)[1], " vs. ", levels(dturtle$group)[2]), cex.main = fontsize_vec[[2]]) if (!is.null(savepath)) { grDevices::dev.off() return(args$filename) } else { return(p) } }, BPPARAM = BPPARAM) 1: plot_transcripts_view(dturtle = table_UT_sig, gtf = "gencode.vM25.annotation.gtf", genome = "mm10", one_to_one = TRUE, savepath = "images_ut_transcript", add_to_table = "transcript_view")

For specific gene, it works fine. plot_transcripts_view(dturtle = table_UT_sig, genes = "Gm14461", gtf = "gencode.vM25.annotation.gtf", genome = 'mm10', one_to_one = TRUE)

Importing gtf file from disk.

Performing one to one mapping in gtf

Found gtf GRanges for 1 of 1 provided genes.

Fetching ideogram tracks ... Creating 1 plots: $Gm14461 $Gm14461$chr2 Ideogram track 'chr2' for chromosome 2 of the mm10 genome

$Gm14461$OverlayTrack OverlayTrack 'OverlayTrack' containing 2 subtracks

$Gm14461$OverlayTrack OverlayTrack 'OverlayTrack' containing 2 subtracks

$Gm14461$titles An object of class "ImageMap" Slot "coords": x1 y1 x2 y2 chr2 6 55.20000 25.55391 80.28973 OverlayTrack 6 80.28973 25.55391 289.14487 OverlayTrack 6 289.14487 25.55391 498.00000

Slot "tags": $title chr2 OverlayTrack OverlayTrack "chr2" "OverlayTrack" "OverlayTrack"

To Reproduce Steps to reproduce the behavior:

prepare gtf files

tx2gene <- import_gtf(gtf_file = "gencode.vM25.annotation.gtf") tx2gene$gene_name <- one_to_one_mapping(name = tx2gene$gene_name, id = tx2gene$gene_id)

> Changed 110 names.

tx2gene$transcript_name <- one_to_one_mapping(name = tx2gene$transcript_name, id = tx2gene$transcript_id) tx2gene <- move_columns_to_front(df = tx2gene, columns = c("transcript_name", "gene_name"))

import files

list.files("/Users/LiJin/Documents/nBox/Lijin/Manvendra/rsem/") files <- Sys.glob("/Users/LiJin/Documents/nBox/Lijin/Manvendra/rsem/*isoforms.results") files names(files) <- gsub("/Users/LiJin/Documents/nBox/Lijin/Manvendra/rsem/","",files) data <- import_counts(files, type = "rsem", tx2gene=tx2gene[,c("transcript_id", "gene_name")]) rownames(data) <- tx2gene$transcript_name[match(rownames(data), tx2gene$transcript_id)] dim(data) #142604 8 head(data) colnames(data) <- gsub(".isoforms.results","",colnames(data)) pd <- data.frame("id"=colnames(data), "group"=c(rep("Ctr_LPS",4), rep("Ctr_Ut",4),rep("Mut_LPS",4),rep("Mut_Ut",4)), stringsAsFactors = FALSE) pd

DTU analysis

dturtle_UT <- run_drimseq(counts = data, tx2gene = tx2gene, pd=pd, id_col = "id", cond_col = "group", cond_levels = c("Mut_Ut","Ctr_Ut"), filtering_strategy = "bulk")

Retain 28937 of 73504 features.

Removed 44567 features.

head(dturtle_UT$meta_table_gene) dturtle_UT$used_filtering_options

dturtle_UT_sig <- posthoc_and_stager(dturtle = dturtle_UT, ofdr = 0.05, posthoc = 0.1)

Found 27 significant genes with 22 significant transcripts (OFDR: 0.05)

table_UT_sig <- create_dtu_table(dturtle = dturtle_UT_sig, add_gene_metadata = list("chromosome"="seqnames"), add_tx_metadata = list("tx_expr_in_max" = c("exp_in", max))) dim(table_UT_sig$dtu_table)

table_UT_sig <- plot_proportion_barplot(dturtle = table_UT_sig, meta_gene_id = "gene_id.1", savepath = "images_ut_barplot", add_to_table = "barplot")

head(plot_UT_sig$dtu_table$barplot) head(list.files("./images/"))

table_UT_sig <- plot_proportion_pheatmap(dturtle = table_UT_sig, include_expression = TRUE, treeheight_col=20, savepath = "images_ut_heatmap", add_to_table = "pheatmap")

Please complete the following information: -R version 4.1.2 (2021-11-01) -Platform: x86_64-apple-darwin17.0 (64-bit) -Running under: macOS Catalina 10.15.7 -DTUrtle_1.0.1

TobiTekath commented 2 years ago

Hi @lijinw, thank you for reaching out. I am happy to hear that DTUrtle worked for you until now, and I will try my best to resolve your problem.

The initial error message QuartzBitmap_Output - unable to open file 'images_ut_transcript/Acbd5_transcripts.png' tells me, that the DTUrtle function is not able to create the requested image. This could be due to lacking writing permission or just that the requested folder does not exist yet.

Could you ensure, that the folder "images_ut_transcript" exist?

If so, please try if this works without error:

plot_transcripts_view(dturtle = table_UT_sig,
genes = "Gm14461",
gtf = "gencode.vM25.annotation.gtf",
genome = 'mm10',
one_to_one = TRUE,
savepath = "images_ut_transcript")

Please let me know if that resolves your problem. Thanks.

Best, Tobi

P.S.: Normally, DTUrtle should create a save-folder if it is missing. I just noticed all plotting functions do this, except for plot_transcripts_view(). I will push a minor update in the following hour or so.

lijinw commented 2 years ago

Hi @TobiTekath,

Thanks for checking my code. I guess the issue is that I didn't make the folder "images_ut_transcript" in advance. It works now.

Best, Lijin

TobiTekath commented 2 years ago

Hi @lijinw, that is great to hear. I also released DTUrtle 1.0.2 in the last days, which fixes, among other things, that plot_transcripts_view() did not create the missing folder by itself.

Best, Tobi

lijinw commented 2 years ago

Hi @TobiTekath,

Thanks for letting me know.

Best, Lijin