Open annaritz opened 1 year ago
The standardized node file seems like the right choice. For the algorithms that don't generate node information, we could create a util function that takes in the pathway file and automatically writes a default node output file, so that developers don't have to think about it too much. Something like `write_default_node_scores("pathway.txt", "nodes.txt")`.
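To make the suggestion concrete, here is a minimal sketch of what such a helper might look like. Everything beyond the function name and its two file arguments is an assumption for illustration: the pathway file is taken to be a tab-separated edge list with the two endpoint nodes in the first two columns, the output is a two-column `Node`/`Score` file, and `default_score` and `has_header` are hypothetical parameters, not part of any existing SPRAS API.

```python
import csv


def write_default_node_scores(pathway_file, node_file, default_score=1.0, has_header=False):
    """Write every node found in an edge-list pathway file with a default score.

    Assumed (not confirmed) formats:
      - pathway_file: tab-separated, endpoints in the first two columns
      - node_file: tab-separated with a "Node", "Score" header row
    """
    nodes = set()
    with open(pathway_file, newline="") as f:
        reader = csv.reader(f, delimiter="\t")
        if has_header:
            next(reader, None)  # skip the header row if the caller says there is one
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or malformed lines
            nodes.update(row[:2])

    ordered = sorted(nodes)  # deterministic output order
    with open(node_file, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerow(["Node", "Score"])
        for node in ordered:
            writer.writerow([node, default_score])
    return ordered
```

An algorithm wrapper that has no node-level output could then call `write_default_node_scores("pathway.txt", "nodes.txt")` at the end of its output parsing step, so every method produces a node file with the same shape.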
Before we proceed, can we outline what other methods (existing or on the SPRAS roadmap) also generate node information so we can think through how it can be used? And how do we envision using node information in downstream tasks? For instance, it would be great to load it into our Cytoscape visualizations. Would we load it as a generic "score", or do we also need to track a label saying that this score is a node visitation probability?
As @Lyce24 implements a random walk with restarts algorithm, it occurs to us that there may be algorithms that contain useful information for the nodes (in RWR's case, the node visitation probability). Right now, SPRAS standardizes a pathway as an edge list file. Some options we could discuss:
`pathway.txt` standardized output file. The information would be redundant - the node info would be written for EVERY edge that the node is incident on - but it would keep the current file format for pathways.