We will process an updated collection of pathway figures published since the last run. We estimate that about 12,000 pathway figures have been published per year in recent years. Processing them would increase the total pathway count by roughly 20% and bring the latest research findings into BioThings Explorer via annotated pathway content. Depending on performance and demand, we may run the pipeline quarterly during subsequent segments.
28,700 figures classified as pathways (AutoML score >= 0.5) and OCR'd
15,914 new pathways after deduplication against the 20200224 batch (24% increase in about 14.7 months, based on the dates the queries were performed: 2020-02-24 vs. 2021-05-15)
total non-redundant pathways: 79,949 (73,876 with at least one recognized gene)
total gene mentions: 1,326,409 (213,858 from pfocr20210515)
total unique genes: 14,251 NCBI Gene IDs and 13,465 HGNC symbols
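
The selection steps summarized above (keep figures with a classifier score >= 0.5, then drop anything already present in the previous batch) can be sketched as follows. This is only an illustrative sketch, not the pipeline's actual code: the `Figure` class, its `figure_id` and `score` fields, and the `select_new_pathways` function are hypothetical names, and it assumes deduplication is done on a simple figure identifier, which the source does not state.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

# Cutoff taken from the "score >= 0.5" criterion quoted in the list above.
SCORE_THRESHOLD = 0.5


@dataclass(frozen=True)
class Figure:
    figure_id: str  # hypothetical identifier used for deduplication
    score: float    # hypothetical classifier score in [0, 1]


def select_new_pathways(
    candidates: Iterable[Figure],
    previous_ids: Set[str],
    threshold: float = SCORE_THRESHOLD,
) -> List[Figure]:
    """Keep figures at or above the threshold that are not already known."""
    seen = set(previous_ids)  # copy so the caller's set is not modified
    selected = []
    for fig in candidates:
        if fig.score < threshold:
            continue  # not classified as a pathway
        if fig.figure_id in seen:
            continue  # already in the previous batch or seen in this one
        seen.add(fig.figure_id)
        selected.append(fig)
    return selected


if __name__ == "__main__":
    batch = [Figure("fig1", 0.9), Figure("fig2", 0.3), Figure("fig1", 0.8)]
    print([f.figure_id for f in select_new_pathways(batch, {"fig0"})])
```

Deduplicating within the new batch as well as against the previous one is a design choice made here for the sketch; the real pipeline may define redundancy differently.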