wikipathways / pathway-figure-ocr

Extracting gene sets from published pathway figures
Apache License 2.0

Support expansion and filtering operations in BTE through the PFOCR API with expanded concept extraction #15

Closed AlexanderPico closed 3 years ago

AlexanderPico commented 4 years ago

The PFOCR data can be used to either expand or filter query paths. Small gene sets along query paths can be expanded to include all co-members in published pathway figures, and large gene sets can be filtered down into more biologically focused sets.
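To make the two operations concrete, here is a minimal sketch in Python. It is an illustration only, not the PFOCR API or the BTE implementation: the `FIGURES` mapping, the function names `expand_gene_set` and `filter_gene_set`, and the `min_overlap` thresholds are all hypothetical. It assumes the PFOCR data can be reduced to a mapping from figure ID to the set of gene symbols found in that figure.

```python
from typing import Dict, Set

# Hypothetical toy data: figure ID -> gene symbols extracted from that figure.
# Real PFOCR records carry more fields; this structure is an assumption.
FIGURES: Dict[str, Set[str]] = {
    "fig_a": {"TP53", "MDM2", "CDKN1A"},
    "fig_b": {"EGFR", "KRAS", "BRAF", "MAP2K1"},
    "fig_c": {"TP53", "BAX", "BCL2"},
}


def expand_gene_set(
    genes: Set[str], figures: Dict[str, Set[str]], min_overlap: int = 1
) -> Set[str]:
    """Add every co-member from figures sharing >= min_overlap genes with the query."""
    expanded = set(genes)
    for members in figures.values():
        if len(members & genes) >= min_overlap:
            expanded |= members
    return expanded


def filter_gene_set(
    genes: Set[str], figures: Dict[str, Set[str]], min_overlap: int = 2
) -> Set[str]:
    """Keep only query genes that appear together (>= min_overlap) in some figure."""
    kept: Set[str] = set()
    for members in figures.values():
        shared = members & genes
        if len(shared) >= min_overlap:
            kept |= shared
    return kept


small_query = {"MDM2"}
large_query = {"KRAS", "BRAF", "TP53", "GAPDH"}

print(sorted(expand_gene_set(small_query, FIGURES)))
print(sorted(filter_gene_set(large_query, FIGURES)))
```

In this toy data, expanding `{"MDM2"}` pulls in the other genes from the one figure that contains it, while filtering the larger query keeps only the genes that co-occur in a single figure. The overlap thresholds are placeholders; a real service would likely need a more careful scoring scheme.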

AlexanderPico commented 4 years ago

@ariutta Please provide an updated JSON with PMIDs to improve mapping to other BTE entities.

AlexanderPico commented 4 years ago

@andrewsu Do you need our input on the API design for this item? In addition to providing PMID links, do you need any other information in the JSON?

andrewsu commented 4 years ago

I think this one is mostly going to be work on the BTE end. The data you've provided in the JSON (after adding PMIDs) will, I think, be sufficient for a first pass at this. We'll need to prioritize this on our end...

I think if you all could focus your efforts on #16 and #17, that would be great. And as a stretch goal, I'd be really excited about adding NER for other object types (compounds, diseases, etc.), which we originally planned for Year 2...

AlexanderPico commented 4 years ago

Per suggestion, let's shift the focus of this item to adding NER for other object types (compounds, diseases, etc.). @ariutta Let's explore piping our current OCR results to PubTator's web API for compounds and concepts, for example.

AlexanderPico commented 3 years ago

Progress on this task has conceptually merged with #16.