cytoscape / py4cytoscape

Python library for calling Cytoscape Automation via CyREST
https://Py4Cytoscape.readthedocs.io

get_table_columns does not get all columns #98

Open kozo2 opened 1 year ago

kozo2 commented 1 year ago

I tried p4c.get_table_columns(table='node', network=1162) on the second network in this cys file: https://www.dropbox.com/s/r9s09p4p54naw7z/test.cys?dl=0 However, it did not return all of the columns; it returned only the first 10. Please let me know if you have any ideas about this. I'm using Python 3.10.8 + py4cytoscape 1.6.0 on Windows 11.

bdemchak commented 1 year ago

Hi, Kozo ... I get 28 columns when I do this using Python 3.10.4 + py4cytoscape 1.6.0 + Windows 10. Also, I'm using Pandas version 1.4.2. All is fine on my system.

Would you mind trying a few things?

1) Call p4c.get_table_column_types on this network/table ... good to see how many columns you get.

2) In Swagger (via Help | Automation | CyREST API), try /v1/networks/{networkId}/tables/defaultnode/columns ... good to see what Cytoscape reports.

Has this failure been happening long? Do you know what changed between now and the last time this worked?

Thanks!

kozo2 commented 1 year ago

Hi Barry, I'm sorry. The steps I described do return all the columns, as you say. That is because I gave you only part of what is needed to reproduce the problem. I will write up a more detailed description of how to reproduce it. Please give me some time until then.

bdemchak commented 1 year ago

Thanks, Kozo ... I'm grateful for your help and looking forward to what you find.


kozo2 commented 1 year ago

Thanks, Barry. Please ignore the cys file. Below is the answer to your question number 1, for the case where p4c.get_table_columns doesn't work.

  1. Call p4c.get_table_column_types on this network/table ... good to see how many columns you get.

It returns all 28 columns. Below is the screenshot.

image image

I then passed the keys() of the p4c.get_table_column_types result as the columns= argument of get_table_columns(). But it did not return the 28 columns.

image

kozo2 commented 1 year ago

And this is the answer for your question number 2.

In Swagger (via Help | Automation | CyREST API), try /v1/networks/{networkId}/tables/defaultnode/columns ... good to see what Cytoscape reports.

Curl

curl -X GET --header 'Accept: application/json' 'http://localhost:1234/v1/networks/1162/tables/defaultnode/columns'

Request URL

http://localhost:1234/v1/networks/1162/tables/defaultnode/columns

Response Body

[
  {
    "name": "SUID",
    "type": "Long",
    "immutable": true,
    "primaryKey": true
  },
  {
    "name": "shared name",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "name",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "selected",
    "type": "Boolean",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_X",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_Y",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_WIDTH",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_HEIGHT",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_LABEL",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_LABEL_LIST_FIRST",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_LABEL_LIST",
    "type": "List",
    "listType": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_ID",
    "type": "List",
    "listType": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_LABEL_COLOR",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_FILL_COLOR",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_REACTIONID",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_TYPE",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_NODE_SHAPE",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "KEGG_LINK",
    "type": "String",
    "immutable": true,
    "primaryKey": false
  },
  {
    "name": "row.names",
    "type": "Integer",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "x_location",
    "type": "String",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "y_location",
    "type": "String",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "source",
    "type": "Integer",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "target",
    "type": "Integer",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "orig_edge_SUID",
    "type": "String",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "shared interaction",
    "type": "String",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "interaction",
    "type": "String",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "KEGG_REACTION_TYPE",
    "type": "String",
    "immutable": false,
    "primaryKey": false
  },
  {
    "name": "KEGG_REACTION_GENE",
    "type": "String",
    "immutable": false,
    "primaryKey": false
  }
]
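
A response like the one above can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the response body has been saved as text; `summarize_columns` is a hypothetical helper name, and the sample holds only 4 of the 28 descriptors shown above. It counts the descriptors and picks out the List-typed columns, which may be worth a look because the first List-typed column in the full response (KEGG_NODE_LABEL_LIST) is the 11th entry.

```python
import json

# Abbreviated copy of the response body above (4 of the 28 descriptors).
sample_response = """
[
  {"name": "SUID", "type": "Long", "immutable": true, "primaryKey": true},
  {"name": "KEGG_NODE_LABEL_LIST_FIRST", "type": "String", "immutable": true, "primaryKey": false},
  {"name": "KEGG_NODE_LABEL_LIST", "type": "List", "listType": "String", "immutable": true, "primaryKey": false},
  {"name": "KEGG_ID", "type": "List", "listType": "String", "immutable": true, "primaryKey": false}
]
"""


def summarize_columns(response_text):
    """Return (number of columns, names of List-typed columns)."""
    descriptors = json.loads(response_text)
    list_columns = [d["name"] for d in descriptors if d["type"] == "List"]
    return len(descriptors), list_columns


count, list_columns = summarize_columns(sample_response)
print(count, list_columns)
```

Run against the full response body, the first value should be 28 if Cytoscape is reporting every column.
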
kozo2 commented 1 year ago

I think I found the key to the problem. (See the attached image.) It seems that if a column contains no elements, the function stops fetching the columns that follow it.

image

kozo2 commented 1 year ago

When I skipped the 10th column, I got a strange result: I could not get a data frame.

image

bdemchak commented 1 year ago

Thanks, Kozo –

I’m travelling today, so can’t do much to duplicate this … I’ll look into this as soon as I can.


bdemchak commented 1 year ago

Hi, Kozo ...

Finally, I'm able to look at this closely. Sorry for the delay.

The problem I see is with the error message "Column "KEGG_NODE_LABEL_LIST" has only 3321 elements, but should have 4637"

For background, get_table_columns() fetches a list of SUIDs for the table, and it uses the length of the list to determine how many data elements each column should have.

Then, for each column, all column values are fetched. When CyREST returns the values, it returns them as a simple list, and doesn't tag the values with their SUIDs. So, if there are fewer values than SUIDs, there's no way to know which values go with which SUIDs. That's why get_table_columns() gives an error when it finds a column that doesn't have the same number of values as SUIDs.
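
The length check described above can be sketched roughly as follows. This is an illustration only, not py4cytoscape's actual code: `build_table` and its arguments are hypothetical names, and the error text simply mirrors the message quoted earlier. It shows why values that arrive as a bare list can only be matched to SUIDs by position, and why a short list has to be rejected.

```python
def build_table(suids, columns):
    """Pair each column's values with SUIDs by position.

    `columns` maps a column name to a plain list of values with no SUID tags,
    so a list shorter than `suids` cannot be aligned and is rejected.
    """
    table = {suid: {} for suid in suids}
    for name, values in columns.items():
        if len(values) != len(suids):
            raise ValueError(
                f'Column "{name}" has only {len(values)} elements, '
                f"but should have {len(suids)}"
            )
        # Position i in the value list is assumed to belong to suids[i].
        for suid, value in zip(suids, values):
            table[suid][name] = value
    return table


# Matching lengths: every SUID gets a value.
print(build_table([101, 102], {"name": ["a", "b"]}))
```

If even one value is dropped somewhere upstream, every later value would shift to the wrong SUID, which is why raising an error is safer than guessing.
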

From the screen shot, it looks like you're querying the "Metabolic pathways [rno01100]" table, which has 4637 nodes. It also looks like you're asking for the KEGG_NODE_LABEL_LIST column, whose datatype is List of Strings. Looking at the table with Cytoscape, I see that some values are empty ([]) and some are not (e.g., [C20408]).

Am I seeing this right?

When I try get_table_columns() with this data, I see what I expect ... that KEGG_NODE_LABEL_LIST column contains 4637 values, where each value is a Python list, and each list has a single element (e.g., [''] or ['C20408']). All good.

The error you're reporting indicates that only 3321 values were found. In my debugging, I see all 4637 values. So, I don't get the same result as you do.

I do notice that if I remove all of the empty values (i.e., ['']), there are 3321 values remaining, which matches what the error is reporting.

So, somehow, the empty values are being eliminated either in get_table_columns() or before.

The quickest test would be to use Swagger to fetch the values and see what CyREST is actually returning. Use the GET /v1/networks/{networkId}/tables/defaultnode/columns/KEGG_NODE_LABEL_LIST function for this.

The CURL for this would be:

curl -X GET --header 'Accept: application/json' 'http://localhost:1234/v1/networks/124/tables/defaultnode/columns/KEGG_NODE_LABEL_LIST'

... where {networkId} or "124" is the SUID for the "Metabolic pathways [rno01100]" network. (You can use Cytoscape to find the network SUID by selecting the Network Table, then adding the "SUID" column to the visible column list, and then reading the SUID value in the table.)

I would want to know all of the values being returned, and am especially interested in knowing whether there are 3321 values or 4637 values. And I'm also very interested to know whether the empty values are actually [''] or something else.
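
For anyone who prefers a script to Swagger, the same check could be sketched in Python as below. The helper names are hypothetical, and it is an assumption here that the endpoint returns a JSON object whose `values` key holds the list of column values; adjust if your CyREST version reports something different. Only the standard library is used.

```python
import json
from collections import Counter
from urllib.parse import quote
from urllib.request import urlopen


def fetch_column_values(network_suid, column, base_url="http://localhost:1234/v1"):
    """GET one node-table column from CyREST and return its list of values."""
    url = f"{base_url}/networks/{network_suid}/tables/defaultnode/columns/{quote(column)}"
    with urlopen(url) as response:
        payload = json.load(response)
    return payload["values"]  # assumption: values live under the "values" key


def summarize_values(values):
    """Return (total count, tally of each empty-looking value by its repr)."""
    empties = Counter(repr(v) for v in values if v in (None, "", [], [""]))
    return len(values), dict(empties)


# Example use (needs a running Cytoscape; replace 124 with your network SUID):
# values = fetch_column_values(124, "KEGG_NODE_LABEL_LIST")
# print(summarize_values(values))
```

The first number answers the 3321 versus 4637 question, and the tally shows whether the empty entries come back as [''], [], or something else.
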

Thanks!