jarumihooi closed 9 months ago
Impact: The name of this project is tied to ongoing efforts to better describe its goal. While the goal remains unclear, it is also unclear how to make annotation-guideline decisions about differentiating cases. This Slack thread details some difficulties encountered during guideline creation. The current guidelines.ppt is here, and the current readme.md is here.
One of the main questions is which cases are positive (we want to annotate them) versus negative (we don't care about them). To answer this, we need to know what we are trying to extract, even if the scope starts out limited in this batch and expands in later batches.
For instance, is the goal to extract all information that could serve as useful metadata, even when it doesn't occur as a role-filler key-value pair (e.g. copyright information, "Produced in collaboration with Winnifred Touhey Studio and Studio Ghibli", and "Special Thanks To: John Singleton and Studio Ghibli")? Or is the goal to annotate only role-filler pairs, possibly reshaping information into that form when needed?
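To make the two options concrete, here is a minimal sketch of how a role-filler pair and a role-less piece of metadata might be represented side by side. The class name `Binding`, its fields, and the "Director: Jane Doe" credit are hypothetical and only for illustration; this is not the project's actual annotation format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Binding:
    """Hypothetical annotation unit: an optional role bound to one or more fillers."""
    fillers: List[str]
    role: Optional[str] = None  # None means the text names no explicit role

    def is_role_filler_pair(self) -> bool:
        # A unit counts as a pair only if it has both a role and at least one filler.
        return self.role is not None and len(self.fillers) > 0


# Hypothetical credit that fits the key-value scheme directly.
director = Binding(role="Director", fillers=["Jane Doe"])

# One possible reading of a line from this issue: useful metadata kept as raw
# text, with no role assigned because it is unclear what the "key" would be.
collab = Binding(
    fillers=["Produced in collaboration with Winnifred Touhey Studio and Studio Ghibli"]
)

for unit in (director, collab):
    print(unit.role, unit.fillers, unit.is_role_filler_pair())
```

Under the first option above, both units would be annotated; under the second, only units that are, or can be reshaped into, role-filler pairs would be kept.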
@owencking We may bring this up in a meeting on or before Wed 10/11 if we can find enough cases where neither the Annotation team nor the Dev team has an answer.
During today's verbal meeting, the name "Role Filler Binding" was discussed and chosen.
GBH's needs for metadata extraction were clarified.
Difficulties with the rfb pair data structure were discussed, but it was found sufficient for the task at hand for now, noting that there are still some instances where it may be difficult to match textual phenomena to this pairing scheme.
New Feature Summary
(For the record, the name currently in use is the "role-filler-binding/rfb" annotation project. It has also been called "OCR", "creditparsing", "kvparsing", or something related to the word "structure".)
Related
No response
Alternatives
No response
Additional context
No response