nasa-petal / PeTaL-labeller

The PeTaL labeler labels journal articles with biomimicry functions.
https://petal-labeller.readthedocs.io/en/latest/
The Unlicense

CORE dataset vs. Semantic Scholar #17

Open bruffridge opened 3 years ago

bruffridge commented 3 years ago

Which dataset should we work with first? Things to consider:

  1. (Most important) Which dataset contains more biomimicry articles? (Alex and Colleen)
  2. Which dataset is easier to work with? (download, extract, clean, dedup, etc.) (dinopanda)
  3. Which dataset makes it easier to pull down new articles that we haven't labelled yet? (dinopanda)
  4. Which has better data quality? (Shruti)
  5. Which contains the most data elements important to us? (full-text URL, title, abstract, full text, others?) (Shruti)
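
For criterion 5, one possible way to compare the two datasets is to measure how often each data element is actually populated in a sample of records. This is only a rough sketch: the field names (`title`, `abstract`, `full_text`, `full_text_url`) and the `field_coverage` helper are hypothetical, and the real keys in CORE and Semantic Scholar records would need to be mapped onto them first.

```python
from collections import Counter

# Hypothetical field names; the real CORE / Semantic Scholar keys may differ.
FIELDS = ("title", "abstract", "full_text", "full_text_url")


def field_coverage(records, fields=FIELDS):
    """Return, for each field, the fraction of records with a non-empty value."""
    records = list(records)
    if not records:
        return {field: 0.0 for field in fields}

    counts = Counter()
    for record in records:
        for field in fields:
            value = record.get(field)
            # Treat whitespace-only strings as missing.
            if isinstance(value, str):
                value = value.strip()
            if value:
                counts[field] += 1

    return {field: counts[field] / len(records) for field in fields}


# Made-up sample records, only to show the call shape.
sample = [
    {"title": "A", "abstract": "x", "full_text": "", "full_text_url": "https://example.org/a"},
    {"title": "B", "abstract": None, "full_text": "body", "full_text_url": None},
]
coverage = field_coverage(sample)
```

Running the same function over a comparable sample from each dataset would give per-field numbers that can be put side by side.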

Semantic Scholar vs. CORE

bruffridge commented 3 years ago
  1. Semantic Scholar
  2. Wash
    • dinopanda