Closed zietzm closed 6 years ago
Another package of interest could be xarray, which provides "N-dimensional variants of the core pandas data structures." I think it could be ideal for matrix-based representations of hetnets. However, it currently can only use numpy arrays as its backend and lacks scipy.sparse support, so it's probably not appropriate at this time.
Check out 2.xarray.ipynb where we encode Hetionet v1.0 as an xarray.
The following products are related and may be good for storing dataframes and potentially matrices on disk:
For storing nodes for hetmech, I'm thinking we should use sqlite to enable fast lookup of node positions from names. However, I'm not sure whether we should do this now in https://github.com/greenelab/hetmech/pull/97 or later.
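To make the sqlite idea concrete, here is a minimal sketch of a name-to-position lookup using Python's built-in `sqlite3` module. The table layout, the column names (`metanode`, `name`, `position`), the helper functions, and the example nodes are all assumptions for illustration; none of this is taken from hetmech or PR #97.

```python
import sqlite3


def build_node_index(conn, nodes):
    """Store (metanode, name, position) rows; the schema is hypothetical."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS nodes ("
        " metanode TEXT NOT NULL,"
        " name TEXT NOT NULL,"
        " position INTEGER NOT NULL,"
        " PRIMARY KEY (metanode, name))"
    )
    conn.executemany("INSERT INTO nodes VALUES (?, ?, ?)", nodes)
    conn.commit()


def node_position(conn, metanode, name):
    """Return the matrix row/column index for a node, or None if absent."""
    row = conn.execute(
        "SELECT position FROM nodes WHERE metanode = ? AND name = ?",
        (metanode, name),
    ).fetchone()
    return None if row is None else row[0]


# Illustrative data only: positions are indices into per-metanode matrices.
conn = sqlite3.connect(":memory:")
build_node_index(
    conn,
    [("Gene", "TP53", 0), ("Gene", "BRCA1", 1), ("Disease", "asthma", 0)],
)
print(node_position(conn, "Gene", "BRCA1"))
```

The composite primary key gives sqlite an index on `(metanode, name)`, so a lookup does not need to scan the whole table or load the full node list into memory.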
Closed by #97
I recently came across multinetx (GitHub), which is a:
python package for the manipulation and visualization of multilayer networks. The core of this package is a MultilayerGraph, a class that inherits all properties from networkx.Graph().
I don't think we have any use for this package at the moment, but I wanted to note it here, so we can keep an eye on its development.
We are considering creating another base representation of hetnets. One of the main goals is faster network loading: at present, loading a graph can take over a minute and a half.
The following are under consideration:
.npz format. Moreover, unlike the other methods, we would likely not have to change more than a few functions in order to load on-disk adjacency matrices.
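As a rough illustration of what on-disk adjacency matrices in `.npz` format could look like, the sketch below round-trips a small sparse matrix with `scipy.sparse.save_npz` and `scipy.sparse.load_npz`. That the matrices would be scipy.sparse ones is an assumption here, and the file name and the toy 3x3 matrix are hypothetical.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

# Toy adjacency matrix standing in for one metaedge (illustrative values).
adjacency = sparse.csr_matrix(
    np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=np.int8)
)

# Hypothetical file name; one file per metaedge is an assumption.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "example-metaedge.sparse.npz")

# Write the matrix to disk, then read it back without rebuilding the graph.
sparse.save_npz(path, adjacency)
loaded = sparse.load_npz(path)

print(loaded.shape, loaded.nnz)
```

Loading a matrix this way skips graph construction entirely, which is the part of the current workflow that takes over a minute and a half.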