Open peterk87 opened 6 years ago
This is a good idea. By cgMLST, I assume you mean all the allele types? We have this info, so we could definitely export it to a separate worksheet/file.
I think it's something that can be worked on for the next release. Maybe along with building a dendrogram. I'd be happy to see your code for this :smile:
Here's a potentially useful notebook with some code to generate distance matrices, dendrograms, MSTs, and flat clusters from distance matrices: https://gist.github.com/peterk87/b203f62a71d7f4fb273139b219af5e81
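To make the "flat clusters from distance matrices" part concrete, here is a minimal pure-Python sketch. It is not taken from the notebook: it assumes single-linkage clustering at a fixed allele-distance threshold, and the function name `flat_clusters`, the genome names, and the distances are all made up for illustration.

```python
# Hypothetical pairwise allele distances between four genomes (invented data).
distances = {
    ("A", "B"): 1,
    ("A", "C"): 3,
    ("A", "D"): 3,
    ("B", "C"): 2,
    ("B", "D"): 3,
    ("C", "D"): 1,
}

def flat_clusters(names, distances, threshold):
    """Single-linkage flat clusters: join genomes whose distance is <= threshold."""
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for (a, b), d in distances.items():
        if d <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())

genomes = ["A", "B", "C", "D"]
clusters_at_1 = flat_clusters(genomes, distances, threshold=1)
clusters_at_2 = flat_clusters(genomes, distances, threshold=2)
```

With a threshold of 1 the toy data splits into two clusters; raising it to 2 links B and C and merges everything into one.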
Would it be possible to output the cgMLST results to a separate CSV/TSV file and/or XLSX worksheet?
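As a rough idea of what such an export might look like, here is a small sketch using only the standard library. The `profiles` structure, the `write_cgmlst_tsv` name, and the `-` placeholder for missing calls are assumptions for illustration, not how SISTR stores or writes its results.

```python
import csv
import io

# Hypothetical allele profiles: genome name -> {locus: allele type}.
# None marks a locus without an allele call; the values are invented.
profiles = {
    "genome_A": {"locus_1": 1, "locus_2": 4, "locus_3": 7},
    "genome_B": {"locus_1": 1, "locus_2": 5, "locus_3": None},
}

def write_cgmlst_tsv(profiles, handle, missing="-"):
    """Write one row per genome and one column per locus."""
    loci = sorted({locus for alleles in profiles.values() for locus in alleles})
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["genome"] + loci)
    for genome, alleles in profiles.items():
        row = [genome]
        for locus in loci:
            allele = alleles.get(locus)
            row.append(missing if allele is None else allele)
        writer.writerow(row)

# Write to an in-memory buffer here; a real export would pass an open file.
buffer = io.StringIO()
write_cgmlst_tsv(profiles, buffer)
tsv_text = buffer.getvalue()
```

Switching `delimiter` to `","` would give CSV instead; an XLSX worksheet would need a third-party library, which is left out here.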
You could get really fancy and produce a dendrogram from the cgMLST results, output it to Newick, output a distance matrix of the different profiles, etc. (I have performant code that does all this if you'd like to add these features.)
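To illustrate the idea (this is not the performant code mentioned above), here is a pure-Python sketch that builds a distance matrix and a Newick dendrogram from allele profiles. It assumes a Hamming-style distance that skips loci with a missing call in either genome, and it uses UPGMA (average linkage) as one possible clustering choice. The function names and the four toy profiles are hypothetical.

```python
from itertools import combinations

# Hypothetical cgMLST profiles: allele type per locus, None for a missing call.
profiles = {
    "A": [1, 1, 1, 1],
    "B": [1, 1, 1, 2],
    "C": [1, 2, 2, 2],
    "D": [3, 2, 2, None],
}

def hamming(p, q):
    """Count loci where both genomes have a call and the alleles differ."""
    return sum(1 for a, b in zip(p, q)
               if a is not None and b is not None and a != b)

def distance_matrix(profiles):
    """Return the genome names and a symmetric dict of pairwise distances."""
    names = list(profiles)
    dist = {(name, name): 0 for name in names}
    for a, b in combinations(names, 2):
        dist[(a, b)] = dist[(b, a)] = hamming(profiles[a], profiles[b])
    return names, dist

def upgma_newick(names, dist):
    """Cluster with UPGMA and return the tree as a Newick string."""
    # cluster id -> (Newick subtree, number of leaves, height of the node)
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    d = {}
    for i, j in combinations(range(len(names)), 2):
        d[(i, j)] = d[(j, i)] = float(dist[(names[i], names[j])])
    next_id = len(names)
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        i, j = min(combinations(sorted(clusters), 2), key=lambda pair: d[pair])
        (sub_i, n_i, h_i), (sub_j, n_j, h_j) = clusters.pop(i), clusters.pop(j)
        height = d[(i, j)] / 2
        newick = f"({sub_i}:{height - h_i:g},{sub_j}:{height - h_j:g})"
        # Size-weighted average distance from the merged cluster to the rest.
        for k in clusters:
            avg = (n_i * d[(i, k)] + n_j * d[(j, k)]) / (n_i + n_j)
            d[(next_id, k)] = d[(k, next_id)] = avg
        clusters[next_id] = (newick, n_i + n_j, height)
        next_id += 1
    (newick, _, _), = clusters.values()
    return newick + ";"

names, dist = distance_matrix(profiles)
tree = upgma_newick(names, dist)
```

This is quadratic or worse per merge, so it only suits small inputs; for real datasets something like SciPy's hierarchical clustering would be the more sensible route.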
It'd be nice for users to use the cgMLST info along with their SISTR serovar/subspecies predictions to view everything in something like Phandango or Microreact (basically tree + metadata table) or to do their own analyses with the complete suite of information available from SISTR.