Coleridge-Initiative / RCGraph

Rich Context knowledge graph management
https://rc.coleridgeinitiative.org/?radius=3&entity=NOAA
Creative Commons Zero v1.0 Universal

analysis: publisher classifier #64

Open ceteri opened 4 years ago

ceteri commented 4 years ago

We need a means to analyze the "quality" of the more popular journal article publishers. In other words, we need a classifier, keyed on the publisher (ScienceDirect, PubMed, OUP, etc.), that estimates how likely the entries in publication partitions are to "survive" all the way through our workflow to successful PDF parsing.
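As a rough starting point, the "survival" likelihood could be estimated as a smoothed per-publisher success rate. The sketch below is only an illustration under stated assumptions: the record layout (`publisher`, `parsed`), the sample data, and the `survival_rates` helper are all hypothetical and are not taken from the RCGraph code or its partition format. The add-one style smoothing (`alpha`, `beta`) is one possible choice to keep publishers with very few entries from scoring exactly 0 or 1.

```python
from collections import defaultdict

# Hypothetical records: one per publication entry, giving the publisher and
# whether the entry made it all the way to a successfully parsed PDF.
# This layout is an assumption, not the actual partition schema.
entries = [
    {"publisher": "ScienceDirect", "parsed": True},
    {"publisher": "ScienceDirect", "parsed": True},
    {"publisher": "ScienceDirect", "parsed": False},
    {"publisher": "PubMed", "parsed": True},
    {"publisher": "PubMed", "parsed": True},
    {"publisher": "OUP", "parsed": False},
]


def survival_rates(entries, alpha=1.0, beta=1.0):
    """Return {publisher: (n_entries, n_parsed, smoothed_rate)}.

    The smoothed rate is (n_parsed + alpha) / (n_entries + alpha + beta),
    which pulls small samples toward alpha / (alpha + beta).
    """
    counts = defaultdict(lambda: [0, 0])  # publisher -> [n_entries, n_parsed]
    for entry in entries:
        tally = counts[entry["publisher"]]
        tally[0] += 1
        tally[1] += int(entry["parsed"])
    return {
        pub: (n, k, (k + alpha) / (n + alpha + beta))
        for pub, (n, k) in counts.items()
    }


rates = survival_rates(entries)

# List publishers from most to least likely to survive the workflow.
for pub, (n, k, rate) in sorted(rates.items(), key=lambda kv: kv[1][2], reverse=True):
    print(f"{pub:15s} entries={n} parsed={k} rate={rate:.2f}")
```

A table like this would fit naturally in a notebook, and the same counts could later feed a proper classifier if additional features beyond the publisher turn out to matter.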

Methodology:

Delivery:

  1. Results are best visualized and packaged as a Jupyter notebook in this repo: https://github.com/Coleridge-Initiative/RCGraph
    1. Later we'll move the analysis into an additional workflow step.