Closed saisomesh2594 closed 4 years ago
While a metadata file is not strictly necessary to perform hierarchical clustering, it is required for tSNE and for clustering based on groups. All you need is one column that corresponds to the sample IDs you used when creating the distance matrix (e.g. SRR, which is the default), and then an arbitrary number of other columns containing whatever information is relevant to you. You might include information related to patient ID, as was done in the VarClust publication, for example.
The simplest metadata file is thus only two columns: one ID column corresponding to the IDs used for creating the distance matrix, and one column containing some kind of grouping information. I have now updated the documentation to better explain this.
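To make the two-column layout concrete, here is a minimal sketch in Python that writes such a file and checks that every ID from the distance matrix has a metadata row. The sample IDs, the `group` column name, the helper function names, and the tab-separated format are all assumptions made for illustration; only the `SRR` ID column name comes from the description above, so check the documentation for the delimiter the tool actually expects.

```python
import csv
import io

# Hypothetical sample IDs and group labels, for illustration only.
rows = [
    {"SRR": "SRR0000001", "group": "patient_A"},
    {"SRR": "SRR0000002", "group": "patient_A"},
    {"SRR": "SRR0000003", "group": "patient_B"},
]


def write_metadata(rows, handle, id_column="SRR"):
    """Write rows as a delimited table with the ID column first.

    The tab delimiter is an assumption, not a documented requirement.
    """
    fieldnames = [id_column] + [key for key in rows[0] if key != id_column]
    writer = csv.DictWriter(
        handle, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


def missing_ids(matrix_ids, metadata_text, id_column="SRR"):
    """Return the distance-matrix IDs that have no row in the metadata."""
    reader = csv.DictReader(io.StringIO(metadata_text), delimiter="\t")
    known = {row[id_column] for row in reader}
    return [sample_id for sample_id in matrix_ids if sample_id not in known]


# Build the metadata in memory; pass an open file handle instead to save it.
buffer = io.StringIO()
write_metadata(rows, buffer)
metadata_text = buffer.getvalue()
print(metadata_text)
```

Calling `missing_ids` with the IDs used in your distance matrix before plotting is a quick way to catch samples that would otherwise fail to merge with the metadata.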
Thanks for the clarification. Worked like a charm!
Hi again,
So, following your previous advice, I have successfully managed to generate profiles for my VCF files and the distance matrix as well.
However, coming now to plotting the heatmap and the tSNE clustering, I see that a metadata file is required, which specifies the ID column to merge the distance matrix and metadata file on, as well as the columns for coloring and shapes (in the case of tSNE plotting). I have tried to guess what the metadata file might look like by browsing through the code, but I have been unsuccessful.
Could you kindly share a snippet of what the metadata file must look like? And is there any other information I should be aware of before plotting the heatmaps and the clustering?
Thanks, Somesh