PrincetonUniversity / DP_GP_cluster

BSD 3-Clause "New" or "Revised" License

apply on data other than expression #26

Closed by johaGL 3 years ago

johaGL commented 3 years ago

Hello, thank you for your great software. I would like to use it on two datasets: one has 4 time points, the other only 3. I have an unfiltered list of "highly dynamic genes" with the respective log2 fold change and p-value between each pair of consecutive time points (from DESeq2). I have 3 biological replicates for each time point. My question is:

IanMcDowell commented 3 years ago

You do not have to use (log) TPM data. In fact, log2FC is preferred. However, this tool may not be the right one for your application. Because you only have three timepoints, modeling trajectories with a Gaussian process is overkill. Hierarchical clustering should be sufficient for this task, and the resultant dendrogram and heatmap will be helpful exploratory tools.
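For readers who want to try the hierarchical clustering route, here is a minimal sketch using SciPy. The log2FC matrix, the gene names, the average-linkage/Euclidean choice, and the cut at three clusters are all made-up assumptions for illustration, not something taken from the datasets in this thread.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical data: rows are genes, columns are log2FC values
# for two contrasts (e.g. timepoint 2 vs 1 and timepoint 3 vs 1).
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
log2fc = np.array([
    [2.0, 2.1],    # up early, stays up
    [1.9, 2.0],
    [-1.5, -1.6],  # down early, stays down
    [-1.4, -1.7],
    [0.1, 2.2],    # up late
    [0.0, 2.0],
])

# Agglomerative clustering on Euclidean distances with average linkage.
Z = linkage(log2fc, method="average", metric="euclidean")

# Cut the tree into at most three flat clusters (an arbitrary choice here).
labels = fcluster(Z, t=3, criterion="maxclust")

for gene, label in zip(genes, labels):
    print(gene, label)
```

The linkage matrix `Z` can also be passed to `scipy.cluster.hierarchy.dendrogram` to draw the tree, which is the exploratory view mentioned above.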

As an aside, the convention is to compare each timepoint to the baseline timepoint rather than to its immediately preceding timepoint. This yields values that are easier to interpret and plot. (For example, plot a gene that under your scheme has log2FC 1vs2 = 2 and 2vs3 = 0, then plot the same gene under baseline contrasts, where it has 1vs2 = 2 and 1vs3 = 2.)
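To make the example concrete: log ratios are additive, so a baseline contrast equals the running sum of the consecutive contrasts (log2FC 1vs3 = log2FC 1vs2 + log2FC 2vs3). The sketch below applies that to the single gene from the example; the array layout is an assumption. For real data it is likely better to request the baseline contrasts from DESeq2 directly, since shrunken or separately estimated fold changes may not add up exactly.

```python
import numpy as np

# One row per gene; columns are consecutive contrasts: 1vs2, 2vs3.
consecutive = np.array([[2.0, 0.0]])

# The cumulative sum along the time axis gives baseline contrasts: 1vs2, 1vs3.
baseline = np.cumsum(consecutive, axis=1)

print(baseline)
```

Plotted against time, the consecutive values (2, 0) suggest the gene returns to baseline, while the baseline values (2, 2) show that it stays elevated.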

johaGL commented 3 years ago

Thank you very much, and sorry for my late reply. We equally appreciate the second piece of advice you gave us. Best regards.