wkiri / MTE

Mars Target Encyclopedia
Apache License 2.0

MTE processing of documents that mention multiple missions #22

Closed wkiri closed 2 years ago

wkiri commented 2 years ago

The MTE consists of a separate database per mission. However, some documents mention targets from multiple missions. How should we accommodate these documents?

  1. Independently apply each mission's NER model plus the shared jSRE model. However, the review process will be tedious because each document has to be reviewed once per mission.
  2. Train a merged Target NER model across all missions. How would we then separate the detected targets for each mission's database?
  3. Train a merged model to distinguish between targets per mission (different entity types for Target-MPF, Target-PHX, etc.). I suspect this will not have high accuracy and would require a lot of manual editing to reassign targets to their correct mission. However, it is worth evaluating.
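
To make option 3 concrete, here is a minimal sketch of how the labels could be handled. It assumes annotations are `(start, end, label)` tuples and predictions are `(text, label)` pairs; the function names, the `Element` label, and the example target names are illustrative only and are not taken from the MTE code.

```python
from collections import defaultdict

# Missions with labeled sets mentioned in this issue.
MISSIONS = ("MPF", "PHX", "MSL")


def tag_with_mission(annotations, mission):
    """Rewrite generic Target labels as mission-specific ones (e.g. Target-MPF).

    `annotations` is assumed to be a list of (start, end, label) tuples;
    labels other than "Target" are passed through unchanged.
    """
    return [
        (start, end, f"Target-{mission}" if label == "Target" else label)
        for start, end, label in annotations
    ]


def split_by_mission(predictions):
    """Group predicted (text, label) pairs by the mission suffix of the label."""
    by_mission = defaultdict(list)
    for text, label in predictions:
        prefix, _, mission = label.partition("-")
        if prefix == "Target" and mission in MISSIONS:
            by_mission[mission].append(text)
    return dict(by_mission)
```

With this scheme, routing predictions into each mission's database is just a lookup on the label suffix. The manual effort would go into correcting targets that the model assigns to the wrong mission.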

To help decide the best course of action, we would like to evaluate each option on the same test set. This test set should be composed of some documents from each of the labeled sets (MSL, MPF, PHX).
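
One possible way to assemble that shared test set is sketched below. It assumes each labeled set is available as a list of document identifiers and that we hold out the same number of documents per mission; `build_mixed_test_set`, `per_mission`, and the fixed seed are hypothetical choices, not something already in the repository.

```python
import random


def build_mixed_test_set(labeled_sets, per_mission, seed=0):
    """Sample `per_mission` documents from each mission's labeled set.

    `labeled_sets` maps a mission name (e.g. "MSL") to a list of document IDs.
    Returns a list of (mission, doc_id) pairs. A fixed seed keeps the split
    reproducible so that every option is scored on the same documents.
    """
    rng = random.Random(seed)
    test_set = []
    for mission in sorted(labeled_sets):
        docs = sorted(labeled_sets[mission])
        if len(docs) < per_mission:
            raise ValueError(f"{mission} has only {len(docs)} labeled documents")
        for doc_id in rng.sample(docs, per_mission):
            test_set.append((mission, doc_id))
    return test_set
```

Documents selected this way would need to be excluded from training for all three options, otherwise the comparison is not fair.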

Let's use this issue to continue planning how to proceed.

wkiri commented 2 years ago

For now, for simplicity, we will process documents independently for each mission (option 1). In most cases, only a single mission's targets are mentioned.
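
As a rough sketch of what option 1 looks like in code: each mission's NER model is applied to the document on its own, and the shared relation extractor runs on whatever targets that model finds. The models here are plain callables standing in for the real NER and jSRE components, and `process_per_mission` is a hypothetical name.

```python
def process_per_mission(doc_text, ner_models, extract_relations):
    """Run every mission's NER model independently on one document.

    `ner_models` maps mission name -> callable(text) returning a list of targets.
    `extract_relations` is the shared callable(text, targets) returning relations.
    Both are stand-ins for the real components. Returns one result per mission,
    so each can be reviewed and loaded into that mission's database separately.
    """
    results = {}
    for mission, ner in ner_models.items():
        targets = ner(doc_text)
        # Skip relation extraction when this mission has no targets in the document.
        relations = extract_relations(doc_text, targets) if targets else []
        results[mission] = {"targets": targets, "relations": relations}
    return results
```

Missions whose result has no targets could be skipped during review, which keeps the extra review cost low in the common single-mission case.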