Closed spficklin closed 4 years ago
That would be cool. On a related note, the CMX file is currently able to store multiple types of correlations but it isn't used at all by KINC. Is that a feature we want to keep?
KINC v1.0 lets you run multiple correlations at one time. That was useful back then because we were comparing Pearson/Spearman correlations and it was faster to do them together than to re-run the networks. With KINC v3.0 running much faster, I'm not sure we need to support that. It's quick enough to rebuild the network.
Maybe we should implement this feature as a new data type, something like cluster parameter matrix (CPM). It could be an optional output file for the similarity step in case users don't need it, and that way we wouldn't have to modify the CMX or CCM format.
Additionally, this feature could be implemented in a separate analytic from similarity. If you have the CCM file then you can compute the mean and covariance for each cluster from the sample string.
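For illustration, here is a minimal sketch of that computation in Python rather than KINC's own C++. It assumes the sample string has one character per sample and that '1' marks a sample belonging to the cluster, with any other character (for example '0' or '9') meaning the sample is excluded. The function name cluster_stats and the example values are hypothetical and are not part of KINC.

```python
def cluster_stats(x, y, sample_string, member="1"):
    """Return (mean, covariance) for the pairwise points selected by sample_string.

    x, y          -- expression values of the two genes, one entry per sample
    sample_string -- one character per sample; `member` marks cluster membership
                     (assumption: '1' means "in this cluster")
    """
    if not (len(x) == len(y) == len(sample_string)):
        raise ValueError("x, y, and sample_string must have the same length")

    # Keep only the samples flagged as belonging to this cluster.
    points = [(xi, yi) for xi, yi, flag in zip(x, y, sample_string) if flag == member]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two samples to estimate a covariance")

    mean_x = sum(px for px, _ in points) / n
    mean_y = sum(py for _, py in points) / n

    # Unbiased (n - 1) sample covariance; a maximum-likelihood estimate
    # would divide by n instead. Which one KINC should store is an open choice.
    sxx = sum((px - mean_x) ** 2 for px, _ in points) / (n - 1)
    syy = sum((py - mean_y) ** 2 for _, py in points) / (n - 1)
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in points) / (n - 1)

    return (mean_x, mean_y), ((sxx, sxy), (sxy, syy))


# Hypothetical expression values for two genes across six samples.
gene_x = [1.0, 2.0, 3.0, 4.0, 9.0, 0.0]
gene_y = [2.0, 4.0, 6.0, 8.0, 1.0, 0.0]

# The first four samples are in the cluster; the last two are excluded.
mean, cov = cluster_stats(gene_x, gene_y, "111109")
print(mean)  # two numbers: the cluster center
print(cov)   # 2x2 matrix: four numbers, three of them distinct by symmetry
```

The result is two numbers for the mean plus four for the 2x2 matrix, which matches the six values per cluster discussed in this issue.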
Yeah, I like this idea. Maintains backwards compatibility too. I don't have a preference on either approach.
I've implemented this feature in the cpm-data-type branch, but it's not working quite yet. I get this error when I try to load a CCM:
Data type given for creation of new data object is invalid.
File: ../../src/core/ace_dataobject.cpp
Function: void Ace::DataObject::makeData(const QString&, const QString&)
Line: 693
I'm at a bit of a loss as to why adding the CPM data type threw off the CCM data type. @4ctrl-alt-del can you look at my branch and see if I did anything wrong? Particularly with the data factory and analytic factory. Here's the branch:
https://github.com/SystemsGenetics/KINC/compare/cpm-data-type
I took another look at this branch and was able to fix the issues I had, and just pushed to master. KINC now has an export-cpm analytic which takes EMX/CCM input files and produces a CPM file which can be viewed with qkinc.
It would be really useful if we could save, for each cluster, the center point (mean) and the variance matrix. This would add 6 floating-point numbers per cluster to the CMX file.
Benefits