KarrLab / datanator_rest_api

An OAS3-compliant REST API for the Datanator integrated database
MIT License

Add endpoints that return stats about the new data types #98

Closed jonrkarr closed 4 years ago

jonrkarr commented 4 years ago

We need to update/add some endpoints to provide the remaining content for the stats page. See also KarrLab/datanator_frontend#231.

lzy7071 commented 4 years ago
jonrkarr commented 4 years ago

If it's simpler, the stats could all be returned from one endpoint.

jonrkarr commented 4 years ago

proteins/summary/num_obs_modifications isn't working (503 error).

lzy7071 commented 4 years ago

Number of observations from each data source (similar to metabolites/summary/ecmdb_doc_count/, reactions/summary/num_entries/, and metabolites/summary/ymdb_doc_count/)

lzy7071 commented 4 years ago

proteins/summary/num_obs_modifications isn't working (503 error).

It was a Heroku timeout issue. I have since fixed it by giving the MongoDB driver an index hint.

EDIT: Never mind, the index still wasn't used.
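For readers unfamiliar with index hints: in PyMongo, `count_documents` accepts a `hint` argument naming the index to use. The following is only a minimal sketch of that mechanism, not the code used for this endpoint; the function name, query, and index name are hypothetical and do not come from the Datanator schema.

```python
def count_with_hint(collection, query, index_name):
    """Count documents matching `query`, asking MongoDB to use `index_name`.

    `collection` is expected to behave like a pymongo Collection.
    The query fields and index name passed in are up to the caller;
    none of them are taken from the actual Datanator database.
    """
    return collection.count_documents(query, hint=index_name)


# Hypothetical usage (requires pymongo and a running MongoDB instance):
# from pymongo import MongoClient
# coll = MongoClient()["some_db"]["some_collection"]
# count_with_hint(coll, {"modifications": {"$exists": True}}, "modifications_1")
```

Whether a hinted index is actually chosen can be checked by calling `explain()` on an equivalent `find(...).hint(...)` cursor.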

jonrkarr commented 4 years ago

proteins/summary/num_obs_modifications

This still doesn't seem to be working.

Protein Ontology Do we have this information?

This is the source of the protein modifications.

https://testapi.datanator.info/reactions/summary/get_brenda_obs/?parameter={}.

The parameter should have a third allowed value, k_is (type/sbo_type 261).

jonrkarr commented 4 years ago

We also need to retrieve the number of articles with metabolite concentrations. Do you mean the total number of articles, including primary sources from ECMDB and YMDB, or just the self-parsed sources?

I'm trying to create three graphs:

For metabolite concentrations, we have an endpoint for the first graph (Number of observations of each data type): metabolites/summary/concentration_count/

For the second graph (Number of observations that come from each source), we have metabolites/summary/ecmdb_doc_count/ and metabolites/summary/ymdb_doc_count/. The grouping of observations into entries in databases is somewhat arbitrary. Rather than counting the number of database entries, it would be good to report the number of concentration measurements that come from ECMDB and YMDB.
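To illustrate the difference between counting database entries and counting concentration measurements, here is a small sketch. It assumes, purely for illustration, that each entry is a dict with a `concentrations` list holding one item per measurement; the real ECMDB and YMDB document layout may differ.

```python
def count_entries_and_measurements(entries):
    """Return (number of entries, total number of concentration measurements).

    Each entry is assumed to carry a hypothetical "concentrations" list
    with one item per measurement; entries without it count as zero.
    """
    num_entries = len(entries)
    num_measurements = sum(len(e.get("concentrations", [])) for e in entries)
    return num_entries, num_measurements


# Made-up example data: two database entries holding three measurements.
example = [
    {"name": "metabolite A", "concentrations": [{"value": 1.2}, {"value": 0.8}]},
    {"name": "metabolite B", "concentrations": [{"value": 5.0}]},
]
print(count_entries_and_measurements(example))  # (2, 3)
```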

For the third graph, it would be good to count the number of articles that provide metabolite concentration data: {Number of unique DOIs/PubMed Ids in ECMDB} + {Number of unique DOIs/PubMed Ids in YMDB} + {Number of articles that you curated}
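The sum for the third graph could be computed along these lines. This is a sketch under assumptions: the record fields `doi` and `pubmed_id` are hypothetical names, and the curated-article count is passed in as a plain integer.

```python
def count_unique_references(records):
    """Count distinct article identifiers (DOI, else PubMed ID) in one source."""
    ids = set()
    for rec in records:
        # Prefer the DOI when present; fall back to the PubMed ID.
        ref = rec.get("doi") or rec.get("pubmed_id")
        if ref:
            # Normalize so that case or whitespace differences do not split one article.
            ids.add(str(ref).strip().lower())
    return len(ids)


def total_articles(ecmdb_records, ymdb_records, num_curated_articles):
    """Sum the three terms: unique ECMDB refs + unique YMDB refs + curated count."""
    return (
        count_unique_references(ecmdb_records)
        + count_unique_references(ymdb_records)
        + num_curated_articles
    )
```

Note that a plain sum counts an article twice if it is cited in both ECMDB and YMDB; taking the union of the identifier sets instead would avoid that.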

lzy7071 commented 4 years ago

Number of primary sources of each data type (similar to proteins/summary/num_publications/, rna/summary/get_distinct/?_input=halflives.reference.doi)

lzy7071 commented 4 years ago

https://testapi.datanator.info/reactions/summary/get_brenda_obs/?parameter={}.

The parameter should have a third allowed value, k_is (type/sbo_type 261).

We actually never parsed k_is from BRENDA, if I am not mistaken: https://github.com/KarrLab/datanator/blob/ce0c54367f68c0cd7f2a9571857c93f75e859976/datanator/data_source/brenda/core.py#L169.

lzy7071 commented 4 years ago

Protein Ontology Do we have this information?

This is the source of the protein modifications.

In that case: https://testapi.datanator.info/proteins/summary/num_obs_modifications/ But I'm still trying to get it to not time out.

jonrkarr commented 4 years ago

We actually never parsed k_is from BRENDA, if I am not mistaken

There are Kis parsed from SABIO-RK. Here's an example. I added this to the reaction kinetics data table last week after I noticed that the REST API was returning Kis.

lzy7071 commented 4 years ago

... it would be good to report the number of concentration measurements that come from ECMDB and YMDB.

Total concentration measurements: https://testapi.datanator.info/metabolites/summary/concentration_count/
YMDB: http://testapi.datanator.info/metabolites/summary/ymdb_conc_count/
ECMDB: http://testapi.datanator.info/metabolites/summary/ecmdb_conc_count/

lzy7071 commented 4 years ago

Protein Ontology Do we have this information?

This is the source of the protein modifications.

In that case: https://testapi.datanator.info/proteins/summary/num_obs_modifications/ But I'm still trying to get it to not time out.

https://testapi.datanator.info/proteins/summary/num_obs_modifications/ now statically returns an integer.

lzy7071 commented 4 years ago

{Number of unique DOIs/PubMed Ids in ECMDB} + {Number of unique DOIs/PubMed Ids in YMDB} + {Number of articles that you curated}

lzy7071 commented 4 years ago

We actually never parsed k_is from BRENDA, if I am not mistaken

There are Kis parsed from SABIO-RK. Here's an example.

My bad, I thought you only wanted parameters from BRENDA. http://testapi.datanator.info/reactions/summary/get_brenda_obs/?parameter=k_is is now up.

jonrkarr commented 4 years ago

My bad, I thought you only wanted parameters from BRENDA. http://testapi.datanator.info/reactions/summary/get_brenda_obs/?parameter=k_is is now up.

Sorry for the confusion. You're correct that we only have Kis from SABIO-RK and have ignored Kis from BRENDA. Since we're now displaying Kis, we could include Kis from BRENDA as well.

For SABIO-RK, can you return the number of kinetic measurements (kcat, KM, and Ki)? This can be returned as a sum or the individual counts.

lzy7071 commented 4 years ago

For SABIO-RK, can you return the number of kinetic measurements (kcat, KM, and Ki)? This can be returned as a sum or the individual counts.

https://testapi.datanator.info/reactions/summary/get_sabio_obs/?parameter={}
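The `{}` placeholder in the URL above stands for the parameter value. A minimal sketch of filling it in follows; the helper name `summary_url` is hypothetical. The value `k_is` is confirmed in this thread only for `get_brenda_obs`, so using it with `get_sabio_obs` is an assumption, and the values for kcat and KM are not stated here.

```python
from urllib.parse import urlencode

BASE_URL = "https://testapi.datanator.info"


def summary_url(path, **params):
    """Build a summary-endpoint URL with optional query parameters."""
    url = f"{BASE_URL}/{path.strip('/')}/"
    if params:
        url += "?" + urlencode(params)
    return url


# Assumes get_sabio_obs accepts the same `k_is` value as get_brenda_obs.
print(summary_url("reactions/summary/get_sabio_obs", parameter="k_is"))
```

The resulting URL can then be fetched with any HTTP client.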

jonrkarr commented 4 years ago

The numbers for reactions/summary/get_sabio_obs/ seem too low. Can you double check this? The numbers should be on the order of 20,000-40,000.

lzy7071 commented 4 years ago

The numbers for reactions/summary/get_sabio_obs/ seem too low. Can you double check this? The numbers should be on the order of 20,000-40,000.

It should be good now. Sorry I am still experimenting with aggregate in MongoDB.
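For context, counting individual observations with MongoDB's `aggregate` is commonly done with `$unwind`, `$match`, and `$count` stages. The sketch below only builds such a pipeline; it is not the fix that was applied here, and the field names `parameter` and `parameter.sbo_type` are assumptions loosely based on the "type/sbo_type 261" mentioned earlier in this thread.

```python
def build_observation_count_pipeline(sbo_type):
    """Build a pipeline that counts array elements with the given SBO type.

    Field names are hypothetical. `$unwind` emits one document per array
    element, so the final count is per observation rather than per entry.
    """
    return [
        {"$unwind": "$parameter"},
        {"$match": {"parameter.sbo_type": sbo_type}},
        {"$count": "num_obs"},
    ]


# Hypothetical usage with pymongo (requires a running MongoDB instance):
# result = list(collection.aggregate(build_observation_count_pipeline(261)))
# num_obs = result[0]["num_obs"] if result else 0
```

Without the `$unwind` stage, a count reflects matching documents only, which gives a smaller number whenever one document holds several observations.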

jonrkarr commented 4 years ago

No worries. It looks right now.

jonrkarr commented 4 years ago

I think we have all of the needed endpoints now.