Closed Gemma-Zhang-326 closed 11 months ago
Hi @Gemma-Zhang-326, thank you for using it. Yes, TADCompare for time-course analysis needs at least 4 samples to distinguish "early" and "late" changing boundaries. If you have example code that reproduces the issue you described, I'll look into it.
For 3 samples, I recommend calling boundaries in each sample using SpectralTAD and then doing pairwise comparisons using findOverlaps, potentially setting the maxgap parameter to the resolution of your data so that boundaries one bin apart are not called differential (if boundaries are adjacent, chances are it's a technical artifact and they may be the same boundary).
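A minimal sketch of that pairwise comparison, assuming the boundary coordinates for each sample have already been extracted from the SpectralTAD output. The vectors bounds_a and bounds_b, the helper to_gr, and the chromosome name are placeholders for illustration, not part of either package:

```r
suppressPackageStartupMessages(library(GenomicRanges))

resolution <- 50000L  # bin size of the Hi-C matrices, in bp

# Hypothetical boundary coordinates for two samples (illustrative values only).
# In practice these would be taken from the SpectralTAD calls for each sample.
bounds_a <- c(17350000, 18800000, 20700000)
bounds_b <- c(17350000, 18850000, 22000000)

# Represent each boundary as a 1-bp range; "chr1" is a placeholder name
to_gr <- function(pos, chr = "chr1") {
  GRanges(seqnames = chr, ranges = IRanges(start = pos, width = 1))
}
gr_a <- to_gr(bounds_a)
gr_b <- to_gr(bounds_b)

# With maxgap = resolution, boundaries up to one bin apart count as overlapping
hits <- findOverlaps(gr_a, gr_b, maxgap = resolution)

idx_a <- sort(unique(queryHits(hits)))
idx_b <- sort(unique(subjectHits(hits)))

shared_a <- bounds_a[idx_a]                                # present in both samples
only_a   <- bounds_a[setdiff(seq_along(bounds_a), idx_a)]  # specific to sample A
only_b   <- bounds_b[setdiff(seq_along(bounds_b), idx_b)]  # specific to sample B
```

Here the boundaries at 18800000 and 18850000 are one bin apart, so they are treated as the same boundary rather than as a differential one. Repeating this for each pair of samples (1 vs 2, 2 vs 3, 1 vs 3) gives the pairwise picture for three time points.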
Sorry, I don't understand your concern regarding the increase in the false positive rate during the comparison of 3 samples. I assumed that I could still perform pairwise comparisons using the TADCompare function to identify different TAD boundaries by finding overlaps. In the output of TimeCompare, why can't we just do a simple classification based on the boundary scores? FYI, I ask in this way because of my limited coding skills.
You can do the analysis using 3 samples as follows:
library(TADCompare)
data("time_mats")
time_mats[[4]] <- NULL # Remove 4th matrix
time_var <- TimeCompare(time_mats, resolution = 50000)
time_var$TAD_Bounds
Coordinate Sample 1 Sample 2 Sample 3 Consensus_Score Category
1 17350000 3.8501876 2.47907505 3.1996876 3.1996876 Dynamic TAD
2 18800000 2.0859417 -0.03668098 7.2863302 2.0859417 Late Appearing TAD
3 18850000 0.6752193 6.73142477 -0.8310747 0.6752193 Dynamic TAD
4 20700000 1.6578767 3.27533627 3.0810224 3.0810224 Early Appearing TAD
Be careful with the "Category" column; its classifications are less accurate. I'd suggest visualizing and clustering the boundary scores, and defining boundary behavior from these clusters, like this:
library(pheatmap)
mtx_to_plot <- time_var$TAD_Bounds[, c("Sample 1", "Sample 2", "Sample 3")]
rownames(mtx_to_plot) <- time_var$TAD_Bounds$Coordinate
annotation_row <- data.frame(Category = time_var$TAD_Bounds$Category)
rownames(annotation_row) <- time_var$TAD_Bounds$Coordinate
p <- pheatmap(mtx_to_plot, cluster_cols = FALSE, scale = "row", annotation_row = annotation_row, cutree_rows = 6)
p.clust <- cbind(mtx_to_plot, cluster = cutree(p$tree_row, k = 6))
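To turn the clusters into boundary categories, it may help to look at the average score profile of each cluster. The sketch below is a base R stand-in for the clustering step, not the pheatmap call itself: it uses hclust and cutree directly with Euclidean distance and complete linkage. The toy data are the four boundaries printed above, and k = 2 is chosen only because there are four rows:

```r
# The four boundaries printed above, used here as toy data
scores <- matrix(
  c(3.8501876,  2.47907505,  3.1996876,
    2.0859417, -0.03668098,  7.2863302,
    0.6752193,  6.73142477, -0.8310747,
    1.6578767,  3.27533627,  3.0810224),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("17350000", "18800000", "18850000", "20700000"),
                  c("Sample 1", "Sample 2", "Sample 3"))
)

# Row-wise z-scores, the same kind of scaling as scale = "row" in pheatmap
scaled <- t(scale(t(scores)))

# Hierarchical clustering of boundaries (Euclidean distance, complete linkage)
tree <- hclust(dist(scaled), method = "complete")
cluster <- cutree(tree, k = 2)  # k = 2 only because the toy data has 4 rows

# Mean scaled profile per cluster: one row per cluster, one column per sample
profiles <- aggregate(as.data.frame(scaled),
                      by = list(cluster = cluster), FUN = mean)
print(profiles)
```

A cluster whose mean profile rises from Sample 1 to Sample 3 would suggest boundaries that strengthen over time, while a falling profile would suggest boundaries that weaken. With real data, the number of clusters is a judgment call best made by looking at the heatmap.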
Thank you for your prompt response and kind assistance! Your help was greatly appreciated and it made a big difference! I will continue using this tool for my studies. Thank you for your effort. Have a great day!
Hi there! I'm thrilled to use your tool for analyzing my data. I'm trying to do a time-course data analysis using the TimeCompare function. Unfortunately, I only have three time points, and your documentation says that the function requires at least four time points. As a result, the output I'm getting is unusual. I did notice that the output of time_var$TAD_Bounds seems to contain the boundary scores of the samples themselves, but I'm not entirely sure if I understood your documentation correctly. I have a few questions. Can I differentiate boundaries based on boundary scores by comparing only three samples manually? Also, could you provide some guidance on differentiation thresholds? For instance, I want to differentiate boundaries that appear or disappear specifically in the first time period, as well as boundaries that appear or disappear in the first two time periods. I would greatly appreciate your help!