NaegleLab / CoDIAC

Improve other species fetch of InterPro records #17

Closed knaegle closed 1 year ago

knaegle commented 1 year ago

Is your feature request related to a problem? Please describe. Currently, the default InterPro fetch of UniProt records containing an InterPro domain is rigid and returns only reviewed records, so only very well-studied organisms return records (e.g., for SH2, 470 proteins out of a possible 136K). We would like to give users the ability to accept unreviewed records and to query specific taxonomies.

Describe the solution you'd like
Add the capability to query taxonomy (i.e., get information about domains in different taxa).
Add the capability to turn the Reviewed flag on or off.
Add the capability to return records based on taxonomy ID or to get all records at once.
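
As a rough sketch of what the requested options could look like, the snippet below builds an InterPro query URL from a Reviewed flag and an optional taxonomy ID. The function name `build_interpro_url`, its parameters, and the URL layout (including the `taxonomy/uniprot/<id>` path segment and the `page_size` parameter) are assumptions based on the public InterPro REST API, not CoDIAC's actual code.

```python
from typing import Optional

# Assumed root of the public InterPro REST API.
BASE_URL = "https://www.ebi.ac.uk/interpro/api"


def build_interpro_url(interpro_id: str,
                       reviewed: bool = True,
                       taxonomy_id: Optional[int] = None,
                       page_size: int = 200) -> str:
    """Hypothetical helper: build a protein query URL for one InterPro entry."""
    # Assumption: "reviewed" restricts to reviewed records, while "uniprot"
    # returns reviewed and unreviewed records together.
    source = "reviewed" if reviewed else "uniprot"
    url = f"{BASE_URL}/protein/{source}/entry/interpro/{interpro_id}/"
    if taxonomy_id is not None:
        # Assumption: a taxonomy filter is appended as a path segment.
        url += f"taxonomy/uniprot/{taxonomy_id}/"
    return f"{url}?page_size={page_size}"


if __name__ == "__main__":
    # All records for the SH2 domain entry, reviewed or not.
    print(build_interpro_url("IPR000980", reviewed=False))
    # Reviewed records only, restricted to taxonomy ID 9606 (human).
    print(build_interpro_url("IPR000980", taxonomy_id=9606))
```

Leaving `taxonomy_id` unset covers the "get all records at once" case, so one function could serve both modes.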

Tasks

Include specific tasks in the order they need to be done in. Include links to specific lines of code where the task should happen at.

knaegle commented 1 year ago

The fastest way to handle species turned out to be searching the API with a search filter on the species' scientific name. If the fetch fails to return results, an error is now reported that points to the scientific name as the likely issue. This change also expands the ability to dissect other details about the UniProt IDs.
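
The empty-result check described above might look roughly like the following. This is only an illustration: `extract_accessions` is a hypothetical name, and the payload shape (a `results` list whose items carry `metadata.accession`) is an assumption about the parsed JSON response, not something confirmed in this thread.

```python
from typing import List, Optional


def extract_accessions(payload: Optional[dict], species: str) -> List[str]:
    """Hypothetical helper: pull UniProt accessions out of a parsed response.

    Raises ValueError when nothing came back, since a misspelled or
    unrecognized scientific name is the most likely cause.
    """
    # Treat a missing payload (e.g. an empty response body) like an empty list.
    results = (payload or {}).get("results", [])
    if not results:
        raise ValueError(
            f"No records returned for species {species!r}; "
            "check that the scientific name is spelled correctly."
        )
    # Assumed item layout: {"metadata": {"accession": "..."}}
    return [item["metadata"]["accession"] for item in results]


if __name__ == "__main__":
    sample = {"results": [{"metadata": {"accession": "P12931"}}]}
    print(extract_accessions(sample, "Homo sapiens"))
    try:
        extract_accessions(None, "Homo sapien")
    except ValueError as err:
        print(err)
```

Raising early with the species name in the message makes a typo easier to spot than a silent empty list would.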

See the example below of searching a different species for records; all of the returned records are unreviewed.

[Screenshot 2023-08-05 at 10:01:25 AM]