cytoscape / py4cytoscape

Python library for calling Cytoscape Automation via CyREST
https://Py4Cytoscape.readthedocs.io

Add semantics parameters to import_network_from_file() #30

Closed bdemchak closed 3 years ago

bdemchak commented 3 years ago

I used p4c.commands_post() to import a network file whose source, target, and interaction columns need to be identified. I could not use import_network_from_file() because it has no provision for skipping rows, specifying a header row, or identifying column meanings. As a result, I also had to specify the full path to the file by determining the sandbox folder path and appending the file name; import_network_from_file() would have hidden this. The same problem occurs whenever commands_post() must pass a file name, and the workaround is a useful enough pattern that it should appear in the documentation or a cookbook.
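The workaround described above can be sketched roughly as follows. This is a minimal illustration, not py4cytoscape's own code: `format_command`, `sandbox_file_path`, the sandbox folder path, and the exact command name `network import file` are assumptions made for the example, and the path is joined POSIX-style only to keep the output predictable.

```python
import posixpath


def format_command(command, **params):
    """Render a Cytoscape command string: <command> name="value" name="value" ..."""
    args = " ".join(f'{name}="{value}"' for name, value in params.items())
    return f"{command} {args}".strip()


def sandbox_file_path(sandbox_dir, file_name):
    """Concatenate the sandbox folder and the file name (the step the issue wants hidden)."""
    return posixpath.join(sandbox_dir, file_name)


# Hypothetical sandbox folder; in practice it has to be looked up at run time.
sandbox_dir = "/home/user/CytoscapeConfiguration/filetransfer/default_sandbox"
full_path = sandbox_file_path(sandbox_dir, "supplementary_tableS4.txt")

# Command name is an assumption; the argument names come from the discussion in this issue.
cmd = format_command(
    "network import file",
    file=full_path,
    firstRowAsColumnNames="true",
    startLoadRow=2,
    indexColumnSourceInteraction=2,
    indexColumnTargetInteraction=4,
    indexColumnTypeInteraction=5,
)
print(cmd)

# With Cytoscape running, the string would then be sent with:
#   import py4cytoscape as p4c
#   p4c.commands_post(cmd)
```

Keeping the string construction separate from the call makes it easy to inspect the command before sending it.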

Recommendation: Add parameters to import_network_from_file() to better specify file semantics.

bdemchak commented 3 years ago

Optimize functions that operate with Sandbox send_to and get_from functions #31

bdemchak commented 3 years ago

@AlexanderPico @yihangx

I'm implementing a solution for a couple of issues encountered while implementing the Gang Su workflows.

Basically, networks.import_network_from_file() and tables.load_table_data() don't have enough parameters to import the workflow's network and annotation files. To get around this, I have to use commands_post() and call the corresponding Cytoscape command directly.

For import_network_from_file(), this can be solved by adding parameters such as firstRowAsColumnNames=false, startLoadRow=1, indexColumnSourceInteraction=1, indexColumnTargetInteraction=3, indexColumnTypeInteraction=2, and delimiters=[',', '\t'] ... and maybe columnTypeList="s,i,t".

The example file for this is supplementary_tableS4.txt at https://github.com/bdemchak/cytoscape-jupyter/blob/main/gangsu/Barabasi/supplementary_tableS4.txt ... the actual parameters for this file would be: firstRowAsColumnNames=true startLoadRow=2 indexColumnSourceInteraction=2 indexColumnTargetInteraction=4 indexColumnTypeInteraction=5 columnTypeList="x,s,x,t,i"

Another example file is disease.net.txt at https://github.com/bdemchak/cytoscape-jupyter/blob/main/gangsu/Barabasi/disease.net.txt ... the actual parameters would be: firstRowAsColumnNames=true delimiters=" " startLoadRow=1 indexColumnSourceInteraction=1 indexColumnTargetInteraction=2 columnTypeList="s,t,ea"
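In the examples above, the index parameters and columnTypeList carry overlapping information. The sketch below shows how the two relate, assuming the codes mean s = source, t = target, i = interaction type, and anything else (x, ea) is not an endpoint column. The helper `indices_from_column_types` is hypothetical and not part of py4cytoscape.

```python
# Assumed mapping from columnTypeList codes to the index parameters named in this issue.
ROLE_PARAMS = {
    "s": "indexColumnSourceInteraction",
    "t": "indexColumnTargetInteraction",
    "i": "indexColumnTypeInteraction",
}


def indices_from_column_types(column_type_list):
    """Return the 1-based column index for each s/t/i code in a columnTypeList string."""
    indices = {}
    for position, code in enumerate(column_type_list.split(","), start=1):
        param = ROLE_PARAMS.get(code.strip())
        # Keep only the first column that claims a given role.
        if param is not None and param not in indices:
            indices[param] = position
    return indices


table_s4 = indices_from_column_types("x,s,x,t,i")   # supplementary_tableS4.txt
disease_net = indices_from_column_types("s,t,ea")   # disease.net.txt
print(table_s4)
print(disease_net)
```

For supplementary_tableS4.txt this yields source 2, target 4, and type 5, matching the index parameters listed for that file; for disease.net.txt it yields source 1 and target 2, with no interaction column.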

The "network load file" command can accept a number of other parameters (e.g., networkRenderList, etc). It's easy to see how some of these could be useful, and others never useful. For now, I'm proposing just adding the parameters needed to make the Gang Su workflow function easily.

By doing this and avoiding the commands_post call, we also get rid of an unnecessary concatenation of the sandbox path and file name, which import_network_from_file() does for us already.

Similarly, the workflow needs to load annotations from a table file. Right now, the only function for this is tables.load_table_data(), which accepts a dataframe, not a file. Using a dataframe requires reading the file into it and then doing some rearranging that Cytoscape's "table import file" command already does.

For this, I propose adding a tables function analogous to the networks.import_network_from_file() ... call it tables.load_table_data_from_file().

For example, for the file GDS112_full.soft at https://www.dropbox.com/s/r15azh0xb53smu1/GDS112_full.soft?dl=0, I need to specify startLoadRow="83" and keyColumnIndex="10".

Right now, I need to call the command explicitly via commands_post.
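As a rough sketch of what such a file-based table loader might look like, the function below builds the "table import file" command and resolves the sandbox path internally. Everything here is illustrative: the name `load_table_data_from_file_sketch`, its signature, and the injected `resolve_path` and `post` callables are assumptions, not the function that eventually shipped.

```python
def load_table_data_from_file_sketch(file_name, start_load_row=1, key_column_index=1,
                                     *, resolve_path, post):
    """Build and send a 'table import file' command for a file in the sandbox.

    resolve_path: callable mapping a file name to its full path (hides the sandbox folder).
    post: callable that sends the command string, e.g. p4c.commands_post.
    """
    full_path = resolve_path(file_name)
    cmd = (
        f'table import file file="{full_path}" '
        f'startLoadRow="{start_load_row}" '
        f'keyColumnIndex="{key_column_index}"'
    )
    return post(cmd)


# Stand-ins so the sketch runs without Cytoscape; the sandbox path is hypothetical.
sent = []


def fake_post(cmd):
    sent.append(cmd)
    return cmd


result = load_table_data_from_file_sketch(
    "GDS112_full.soft",
    start_load_row=83,
    key_column_index=10,
    resolve_path=lambda name: "/sandbox/" + name,
    post=fake_post,
)
print(result)
```

Passing the path resolver and the sender in as arguments keeps the sketch testable offline; a real implementation would presumably call the library's own sandbox and command functions instead.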

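So, that's two proposals: add the parameters above to networks.import_network_from_file(), and add a new tables.load_table_data_from_file() function.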

Agreed??

bdemchak commented 3 years ago

@AlexanderPico @yihangx @bdemchak

Oh ... and adding the load_table_data_from_file() function would also get rid of concatenating a sandbox path with a file name ... this can be rolled into the function, just as it is in the other file-accessing functions.

bdemchak commented 3 years ago

import_network_from_file() is complete, with a somewhat different (but appropriate) set of added parameters ... working on load_table_data_from_file() next.