kermitt2 / kish

Keeping It Simple is Hard

Options for task creation (methods question) #6

Open jameshowison opened 1 year ago

jameshowison commented 1 year ago

Just brainstorming some options for task creation. There is certainly relevant literature on this that should be explored (e.g., in reinforcement learning).

I think there are two conditions to vary, each with multiple possible settings.

  1. System extractions:
    • None
    • Full (model tuned for F-score)
    • Model tuned for recall (expect many false positives)
    • Include random additional annotations (seeded false positives: do annotators catch them?)
  2. Negative example coding
    • None (annotators only review the system extractions; those rejected as false positives become the negative examples)
    • Windowed
    • Nearby sentences
    • Pages
    • Page window based on empirical occurrence in the gold-standard training set
    • Section (as provided by GROBID)
      • Could also sub-sample from the section
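
To make the "nearby sentences" option concrete, here is a rough sketch of how negative-example candidates could be picked from a window around the system extractions. Everything in it is an assumption for illustration: the function name `windowed_negative_candidates`, the `window` parameter, and the idea of addressing sentences by index are hypothetical and not part of kish or GROBID.

```python
def windowed_negative_candidates(num_sentences, positive_idx, window=2):
    """Return indices of sentences that lie within `window` sentences of a
    system extraction but do not themselves contain an extraction.

    Hypothetical helper: sentences are identified by 0-based index only.
    """
    positives = set(positive_idx)
    candidates = set()
    for p in positives:
        lo = max(0, p - window)                    # clamp at document start
        hi = min(num_sentences - 1, p + window)    # clamp at document end
        candidates.update(range(lo, hi + 1))
    # Sentences that already carry an extraction are reviewed as positives,
    # so they are excluded from the negative candidates.
    return sorted(candidates - positives)


# A 10-sentence document with extractions in sentences 3 and 4
print(windowed_negative_candidates(10, [3, 4], window=2))  # [1, 2, 5, 6]
```

The same shape would probably work for the page and section variants by swapping the unit being indexed; the window size could then come from the empirical distribution in the gold standard rather than a fixed constant.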