Summary
Hack-y code to fix the (incremental) training set. The sentences were accidentally segmented in step 2, when we annotated for the extra 8 categories: GPE, ORG PN, PERSON PN, POSTCODE, EMAIL, PHONE N, DATE, MONEY £.
Because of that segmentation, a hack is needed to merge these annotations with the original set annotated for FORM.
This code performs that merge and also de-duplicates the set.
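For reviewers, the general idea can be sketched roughly as follows. This is only an illustration, not the code in this request: it assumes each record is a dict with a `text` string and a list of `spans` carrying `start`, `end` and `label` character offsets, and the helper names `merge_segmented` and `deduplicate` are hypothetical.

```python
from typing import Dict, List


def merge_segmented(original: Dict, segments: List[Dict]) -> Dict:
    """Re-attach sentence-level spans to the original, unsegmented text.

    Assumed record shape (hypothetical): {"text": str, "spans": [{"start", "end", "label"}]}.
    """
    merged_spans = [dict(span) for span in original.get("spans", [])]
    cursor = 0
    for seg in segments:
        # Find the sentence in the original text, searching from the cursor
        # so that repeated sentences are matched in order.
        offset = original["text"].find(seg["text"], cursor)
        if offset == -1:
            raise ValueError(f"Segment not found in original text: {seg['text']!r}")
        for span in seg.get("spans", []):
            # Shift sentence-relative offsets to document-relative offsets.
            merged_spans.append(
                {
                    "start": span["start"] + offset,
                    "end": span["end"] + offset,
                    "label": span["label"],
                }
            )
        cursor = offset + len(seg["text"])
    merged_spans.sort(key=lambda s: (s["start"], s["end"]))
    return {"text": original["text"], "spans": merged_spans}


def deduplicate(records: List[Dict]) -> List[Dict]:
    """Keep the first record seen for each distinct text."""
    seen = set()
    unique = []
    for rec in records:
        if rec["text"] not in seen:
            seen.add(rec["text"])
            unique.append(rec)
    return unique
```

Matching segments by searching from a moving cursor is a simplification: it relies on the segments being supplied in document order and on the segmented text being an exact substring of the original.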
Checklists
This pull/merge request meets the following requirements:
- docs/aqa/aqa_plan.md
- docs/aqa/data_log.md, if necessary
- docs/aqa/assumptions_caveats.md, if necessary
- docs folder

Comments have been added below around the incomplete checks.