cytoscape / py4cytoscape

Python library for calling Cytoscape Automation via CyREST
https://Py4Cytoscape.readthedocs.io

py4cytoscape is very slow to transfer a network between a notebook and Cytoscape. #54

Closed pkrezel closed 2 years ago

pkrezel commented 3 years ago

Hi,

I have tested py4cytoscape and compared it to py2cytoscape: it is very slow to transfer a network between a notebook and Cytoscape. Is this normal?

Sincerely.

Pascal

pkrezel commented 3 years ago

In fact, it works, but it is very slow even when transferring a small graph.

bdemchak commented 3 years ago

Hi, Pascal ... thanks for trying py4cytoscape. I don't think I'm aware of a slow transfer. Is there something you can share with me so I can assess this further? Also, when you say you're running a notebook, is Jupyter running on your PC or on a server like Google Colab or GenePattern Notebook?

Thanks!

pkrezel commented 3 years ago

Hi, Barry,

Thank you for your answer. My Jupyter notebook is running locally. In fact, I see that there are two steps in the transfer to Cytoscape. The first one, the transfer without edge attributes, is quick. But the next step, the transfer of the edge attributes, is very slow.

Pascal


bdemchak commented 3 years ago

Hi, Pascal ...

Thanks for the information. I have looked more closely at the code. I assume you're using load_table_data() to load attributes, is that right?

Assuming so, this is what happens in load_table_data() ... below.

All steps are fairly fast except for step 6. Does your data have any NaN values? How many attribute columns are you trying to load? How many values are in each column?

  1. Ask Cytoscape for a list of all keys (e.g., node names) ... I'm assuming you passed the table_key_column= parameter as 'name', or let it just default to 'name' [FAST]

  2. Verify that there is at least one key value in your dataframe that matches a Cytoscape key value [FAST]

  3. Extract just the dataframe rows corresponding to Cytoscape keys that exist [FAST]

  4. For dataframe values that have list values, convert lists to CSVs [FAST]

  5. Convert dataframe to dictionary [FAST]

  6. Convert NaN values to None [SLOW]

  7. Create Cytoscape columns for new attribute values [FAST]

  8. Send attribute values to Cytoscape in bulk [FAST]
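To make step 6 concrete, here is a minimal sketch of a NaN-to-None pass over rows that have already been converted to dictionaries (step 5). It is not py4cytoscape's actual code; the function name `nan_to_none` and the sample rows are hypothetical, and it uses only the standard library. It illustrates why this step can dominate: every cell of every row is visited once, so the cost grows with rows times columns.

```python
import math

def nan_to_none(records):
    """Return a copy of `records` with every float NaN replaced by None.

    `records` is a list of dicts, one dict per row (hypothetical shape,
    similar to what a dataframe-to-dictionary conversion could produce).
    """
    return [
        {
            key: (None if isinstance(value, float) and math.isnan(value) else value)
            for key, value in row.items()
        }
        for row in records
    ]

# Hypothetical attribute rows; the second one has a missing score.
rows = [
    {"name": "A", "score": 1.5, "label": "x"},
    {"name": "B", "score": float("nan"), "label": "y"},
]

cleaned = nan_to_none(rows)
```

After this runs, `cleaned` holds `None` wherever `rows` held NaN, and `rows` itself is unchanged. None is the value that serializes to JSON `null`, which is presumably why such a conversion is needed before the data is sent.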

bdemchak commented 3 years ago

Hi, Pascal ... any reaction to my comments/questions?

I'll leave this open for a few more days, hoping for more information.

pkrezel commented 3 years ago

Hi Barry,

Sorry for not answering your email sooner. I haven't run any new tests, and I am currently on holiday. However, I noticed that the transfer of the graph is done in two steps. The first is quick, and in the meantime you can already optimize the graph, but the transfer is not finished. The second step, which is slow, is associated with the loading of the attributes.

Sincerely,

Pascal


bdemchak commented 3 years ago

OK ... thanks, Pascal ... I'll leave this issue open until you can respond.

bdemchak commented 3 years ago

Hi, Pascal --

I can add some information ... I don't see a reason for a slowdown in load_table_data(), but I do see one in create_network_from_data_frames(). Could that be what you're actually calling??

The slowdown comes from an intentional 10-second wait after the network is created. It's there to allow Cytoscape's internal data structures to catch up before py4cytoscape asks for more changes. The delay can be adjusted via an unsupported py4cytoscape call ... let me know if this would help.

Best ...
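One way to check which call accounts for the delay is to time each step separately in the notebook. The following is only a sketch under stated assumptions: `fake_create_network` and `fake_load_attributes` are hypothetical stand-ins, not py4cytoscape functions, and the short sleep merely imitates a fixed wait. In a real notebook the bodies of the `with` blocks would be the actual calls being compared.

```python
import time
from contextlib import contextmanager

timings = {}

@contextmanager
def timed(label):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start

def fake_create_network():
    # Hypothetical stand-in: imitates a call that ends with a fixed wait.
    time.sleep(0.05)

def fake_load_attributes():
    # Hypothetical stand-in: imitates a call that only does quick work.
    sum(range(1000))

with timed("create network"):
    fake_create_network()

with timed("load attributes"):
    fake_load_attributes()

# Name of the step that took the longest.
slowest = max(timings, key=timings.get)
```

If the slow step takes roughly the same time regardless of graph size, a fixed wait is the likely cause. A duration that grows with the number of attributes points to the data conversion instead.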