Closed zietzm closed 6 years ago
> starting with computing DWPC matrices for all the metapaths of a certain length, and doing this a certain number of times, on various permuted networks
Yes. DWPC matrices for Hetionet v1.0 and its five permutations.
> I'm not sure how we would go about comparing these matrices with one another.
Not sure yet, but we'll probably check whether the DWPC is higher than the P-DWPC (the DWPC computed on a permuted network), possibly after some transformations and statistics.
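As a rough illustration of that kind of comparison (not a settled method), one option is to take, for each source-target pair, the difference between the real DWPC and the mean P-DWPC, scaled by the standard deviation across permutations. The function name `compare_to_permuted`, the z-score statistic, and the random stand-in matrices are all assumptions made for this sketch, not part of hetmech.

```python
import numpy as np


def compare_to_permuted(dwpc, permuted_dwpcs):
    """Compare a real DWPC matrix against DWPC matrices from permuted networks.

    Hypothetical helper: returns (difference from permuted mean, z-score),
    both with the same shape as `dwpc`.
    """
    # Stack into shape (n_permutations, n_sources, n_targets)
    stacked = np.stack(permuted_dwpcs)
    perm_mean = stacked.mean(axis=0)
    # Sample standard deviation across permutations
    perm_std = stacked.std(axis=0, ddof=1)
    diff = dwpc - perm_mean
    # Pairs with zero spread across permutations get NaN instead of inf
    with np.errstate(divide="ignore", invalid="ignore"):
        z = np.where(perm_std > 0, diff / perm_std, np.nan)
    return diff, z


# Random stand-in data: one "real" matrix and five "permuted" ones
rng = np.random.default_rng(0)
dwpc = rng.random((4, 3))
permuted = [rng.random((4, 3)) for _ in range(5)]
diff, z = compare_to_permuted(dwpc, permuted)
```

With only five permutations the standard deviation estimate is noisy, so a statistic like this would be a starting point to look at rather than a significance test.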
> Where should I start? Do you think that hetmech is the best place for this work, or should we make a new repository for the implementation part of this project? Should the matrices be stored on GitHub?
Let's have you start by working on ways to save matrices to disk and quickly read them in. I will begin work on how we want to compare the DWPC to the P-DWPCs. Let's continue in this repo. At some point we may migrate parts to a new repo or into the hetio package.
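A minimal sketch of one way to save and reload matrices, assuming dense NumPy arrays stored as `.npy` files. The helper names `save_matrix` and `load_matrix`, the file naming scheme, and the use of permutation `0` for the unpermuted network are hypothetical; `CbGaD` is used only as an example metapath abbreviation. Memory-mapping on load avoids reading the whole file until the values are actually needed.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_matrix(matrix, directory, metapath, permutation):
    """Save one DWPC matrix as <metapath>_perm-<permutation>.npy (hypothetical naming)."""
    path = Path(directory) / f"{metapath}_perm-{permutation}.npy"
    np.save(path, matrix)
    return path


def load_matrix(path, memmap=True):
    """Load a matrix; with memmap=True the data stays on disk until accessed."""
    return np.load(path, mmap_mode="r" if memmap else None)


with tempfile.TemporaryDirectory() as tmp:
    # Random stand-in for a DWPC matrix
    original = np.random.default_rng(0).random((100, 50))
    path = save_matrix(original, tmp, "CbGaD", 0)
    loaded = load_matrix(path)
    roundtrip_ok = np.array_equal(original, loaded)
    del loaded  # release the memory map before the directory is removed
```

If the matrices turn out to be mostly zeros, `scipy.sparse.save_npz` and `scipy.sparse.load_npz` would be an alternative worth timing against this.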
Start after https://github.com/greenelab/hetmech/pull/75 is merged with version updates.
@dhimmel I have been thinking about what we discussed regarding the unsupervised approaches we could use in the future. If I understand correctly, we want to implement a system that determines which metapaths differ significantly from random (permuted) networks, and to do this for all metapaths (of a certain length) in the graph.
It seems like this should be a two-step process, starting with computing DWPC matrices for all the metapaths of a certain length, and doing this a certain number of times, on various permuted networks. All this information would need to be stored for the next step.
This is where I reach the edge of my current knowledge. We will have several matrices for each metapath, with one as the actual values and the others as "controls". I'm not sure how we would go about comparing these matrices with one another. We want to do this without having the actual data, as was done with Rephetio.
Where should I start? Do you think that hetmech is the best place for this work, or should we make a new repository for the implementation part of this project? Should the matrices be stored on GitHub?