clamsproject / aapb-annotations

Repository to store manual annotation dataset developed for CLAMS-AAPB collaboration

Official name and project dir for "rfb" project #57

Closed jarumihooi closed 9 months ago

jarumihooi commented 9 months ago

New Feature Summary

(For the record, the currently used name is the "role-filler-binding/rfb" annotation project. It has also been called "OCR", "creditparsing", or "kvparsing", or something related to the word "structure".)

jarumihooi commented 9 months ago

Additional Context

Impact: Naming this project goes hand in hand with efforts to better describe what its goal is. While the goal remains unclear, it is also unclear how to make annotation-guideline decisions about which cases to distinguish. This Slack thread details some difficulties encountered during guideline creation. The current guidelines.ppt is here, and the current readme.md is here.

One of the main questions is which cases are positive (we want to annotate them) and which are negative (we don't care about them). To answer this, we need to know what we are trying to extract, even if the scope starts out limited in this batch and expands in later batches.

For instance, is the idea to extract all information that could serve as useful metadata, even when it doesn't occur as a role-filler key-value pair (e.g. copyright information, "Produced in collaboration with Winnifred Touhey Studio and Studio Ghibli", and "Special Thanks To: John Singleton and Studio Ghibli")? Or is the goal to annotate only role-filler pairs, possibly restructuring information into such pairs when needed?
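To make the distinction concrete, here is a rough sketch of how the two example lines above might or might not fit a role-filler pair. This is only an illustration, not the project's actual annotation format: the `RoleFiller` class and the `split_credit` helper are hypothetical names, and splitting on ":" and " and " is an assumed, deliberately naive rule.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RoleFiller:
    """Hypothetical container for one role-filler binding."""
    role: Optional[str]  # None when the text gives no explicit role (key)
    filler: str


def split_credit(line: str) -> List[RoleFiller]:
    """Naively split 'Role: A and B' into one pair per filler (assumed rule)."""
    role, sep, rest = line.partition(":")
    if not sep:
        # No explicit key, e.g. a copyright notice or a free-form sentence.
        return [RoleFiller(role=None, filler=line.strip())]
    fillers = [f.strip() for f in rest.split(" and ") if f.strip()]
    return [RoleFiller(role=role.strip(), filler=f) for f in fillers]


keyed = split_credit("Special Thanks To: John Singleton and Studio Ghibli")
unkeyed = split_credit(
    "Produced in collaboration with Winnifred Touhey Studio and Studio Ghibli"
)
```

In this sketch the "Special Thanks To" line yields two pairs sharing one role, while the "Produced in collaboration with" line has no explicit key and ends up as a single filler with `role=None`. Whether that second kind of case counts as positive is exactly the open question.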

@owencking We may bring this up in a meeting on or before Wed 10/11 if we find enough cases where neither the Annotation team nor the Dev team has answers.

jarumihooi commented 9 months ago

During today's meeting, the "Role Filler Binding" name was discussed and chosen.

GBH's needs for metadata extraction were clarified.

Difficulties with the rfb pair data structure were discussed; however, the structure was found sufficient for the task at hand for now, noting that there are still some instances where it may be difficult to match textual phenomena to this pairing scheme.