luoyunan / DTINet

A Network Integration Approach for Drug-Target Interaction Prediction
GNU General Public License v3.0
172 stars 60 forks source link

I am confused about X = U * sqrt(sqrt(S)); #1

Closed CosmoChan closed 7 years ago

CosmoChan commented 7 years ago

Excuse me , I am confused about the last line in 'DCA.m'. Why should we use 'sqrt' twice?

luoyunan commented 7 years ago

Hi @CosmoChan This is a performance optimization trick. Note that we first let Q = Q * Q' and then sqrt twice, which recovers the singular values of the original Q (i.e., the one before letting Q = Q * Q'). This reduced the memory footprint from O(Kn^2) to O(n^2), where K is the number of individual networks and n is the number of nodes. This technique allows the algorithm to scale to a large number of networks (i.e., large values of K).

You may also want to refer to the following paper

Cho, H., Berger, B., & Peng, J. (2016). Compact integration of multi-network topology for functional analysis of genes. Cell systems, 3(6), 540-548.

In the "Implementation Details" of the "STAR Method" section, you will find a similar technique that allows one to keep only one individual network in memory at a time, instead of all of the K individual networks.