reactome / analysis-service

A RESTful API to access pathway analysis tools
https://reactome.org/AnalysisService/

ratio != num_found/num_total #1

Closed by dvklopfenstein 5 years ago

dvklopfenstein commented 5 years ago

Thank you very much for the excellent Reactome tools. I am preparing to use them on my data.

While preparing, I noticed that the following is frequently true, not only in the results from the analysis service but also in the Pathway Analysis GUI:

Entities_ratio is often not equal to num_Entities_found/num_Entities_total
Reactions_ratio is often not equal to num_Reactions_found/num_Reactions_total

Is the ratio supposed to be calculated from num_found and num_total, or is it computed some other way? If so, what is the formula?

Thank you again for the Reactome database and the tools to access the information within.

fabregat commented 5 years ago

Dear Klopfenstein,

The ratio is not calculated as num_found / num_total. It indicates how big the pathway is compared with the total number of entities in the species.

Note: The same holds for the reactions ratio. It tells you how big the pathway is, in number of reactions, compared with the total number of reactions for the species.
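
As a rough illustration of the difference between the two quantities, here is a minimal Python sketch. The function name `size_ratio` and all the numbers are hypothetical and chosen only for the example; they are not taken from the analysis service, and the exact counts the service uses as the species total are an assumption here.

```python
def size_ratio(pathway_total: int, species_total: int) -> float:
    """Size of the pathway relative to the whole species (hypothetical helper)."""
    if species_total <= 0:
        raise ValueError("species_total must be positive")
    return pathway_total / species_total


# Hypothetical numbers, for illustration only.
entities_found = 5              # submitted identifiers that hit the pathway
entities_total = 50             # entities in the pathway
species_entities_total = 10000  # assumed total number of entities in the species

# What the question expected the ratio to be:
found_fraction = entities_found / entities_total

# What the ratio is described as above: pathway size relative to the species.
entities_ratio = size_ratio(entities_total, species_entities_total)

print(f"found/total   = {found_fraction}")
print(f"pathway ratio = {entities_ratio}")
```

The two values differ because the first depends on the submitted sample, while the second depends only on the pathway and the species.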

Thank you for your interest in using Reactome data and software.

dvklopfenstein commented 5 years ago

Thank you for responding so fast.

The ratio carries extremely useful information that I was not aware of. Using the content service, how can I download the list of entities and reactions found in a particular species?

Very nice tools. Thank you for creating them.

fabregat commented 5 years ago

To get the participants for a given species with the content service, you have to query the identifiers for each pathway with the appropriate method under https://reactome.org/ContentService/#/participants

To speed the exercise up, what about downloading one of the identifiers mapping files at https://reactome.org/download-data ?

Just choose the file that best fits your purpose, then get the identifiers for a specific species by filtering the lines on the pathway identifier prefix (for example, use "R-HSA-" for human), and finally collect the unique entity identifiers.
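
The filtering step could look roughly like the Python sketch below. It assumes the mapping file is tab-separated, with the entity identifier in the first column and the pathway stable identifier in the second; please check the layout of the file you download. The function name `unique_entities` and the sample rows are made up for the example.

```python
from typing import Iterable, Set


def unique_entities(lines: Iterable[str], prefix: str = "R-HSA-") -> Set[str]:
    """Collect the unique entity identifiers mapped to pathways with the given prefix.

    Assumes tab-separated lines: column 0 is the entity identifier and
    column 1 is the pathway stable identifier.
    """
    entities: Set[str] = set()
    for line in lines:
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 2:
            continue  # skip blank or malformed lines
        identifier, pathway_id = columns[0], columns[1]
        if pathway_id.startswith(prefix):
            entities.add(identifier)
    return entities


# Made-up rows standing in for a downloaded mapping file.
sample_lines = [
    "ID_A\tR-HSA-0000001\tExample pathway one\n",
    "ID_A\tR-HSA-0000002\tExample pathway two\n",
    "ID_B\tR-HSA-0000001\tExample pathway one\n",
    "ID_C\tR-MMU-0000003\tExample mouse pathway\n",
    "\n",
]

human_entities = unique_entities(sample_lines)
print(sorted(human_entities))
```

For a real file, the same function can be given an open file handle, for example `with open(path, encoding="utf-8") as handle: unique_entities(handle)`, where `path` points to the mapping file you downloaded.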

Another option is to use the graph database. Would that be an option for you? If so, have a look at https://reactome.org/dev/graph-database/extract-participating-molecules