nlpsandbox / nlpsandbox-schemas

OpenAPI specifications of the NLP Sandbox services
https://nlpsandbox.io
Apache License 2.0
Generalize AddressType to SubType #125

Closed: boyleconnor closed this issue 3 years ago

boyleconnor commented 3 years ago

What if we got rid of the addressType property of textPhysicalAddressAnnotation and added a subType property to the textAnnotation schema instead?

Essentially all of our annotation types have intuitive "sub-types". Dates have YEAR, MONTH, DAY; person names have GIVEN NAME, FAMILY NAME, MIDDLE NAME (or maybe sometimes DOCTOR and PATIENT?).

These could be useful for de-identification:

"Mary Williamson went to Seattle on the 12th" -> "[GIVEN_NAME] [FAMILY_NAME] went to [CITY] on the [DATE]"

would be more readable than "[NAME] [NAME] went to [ADDRESS] on the [DATE]".
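To make the proposal concrete, here is a minimal Python sketch of masking with and without sub-types. The `TextAnnotation` class, its `sub_type` field, and the `mask` helper are hypothetical stand-ins for illustration, not the actual schema or any NLP Sandbox API. The label FAMILY_NAME follows the sub-type list above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TextAnnotation:
    """Hypothetical stand-in for the textAnnotation schema."""
    start: int                      # offset of the first annotated character
    length: int                     # number of characters covered
    text: str                       # the annotated string
    type: str                       # coarse label, e.g. "NAME"
    sub_type: Optional[str] = None  # proposed subType, e.g. "GIVEN_NAME"


def mask(note: str, annotations: List[TextAnnotation],
         use_sub_type: bool = True) -> str:
    """Replace each annotated span with a bracketed label."""
    out = note
    # Work right to left so earlier offsets stay valid after each replacement.
    for ann in sorted(annotations, key=lambda a: a.start, reverse=True):
        label = ann.sub_type if (use_sub_type and ann.sub_type) else ann.type
        out = out[:ann.start] + f"[{label}]" + out[ann.start + ann.length:]
    return out


note = "Mary Williamson went to Seattle on the 12th"
annotations = [
    TextAnnotation(0, 4, "Mary", "NAME", "GIVEN_NAME"),
    TextAnnotation(5, 10, "Williamson", "NAME", "FAMILY_NAME"),
    TextAnnotation(24, 7, "Seattle", "ADDRESS", "CITY"),
    TextAnnotation(39, 4, "12th", "DATE"),  # no sub-type: falls back to type
]

print(mask(note, annotations))
# [GIVEN_NAME] [FAMILY_NAME] went to [CITY] on the [DATE]
print(mask(note, annotations, use_sub_type=False))
# [NAME] [NAME] went to [ADDRESS] on the [DATE]
```

Because `sub_type` is optional in this sketch, annotators that do not emit one would still produce valid annotations, and consumers can fall back to the coarse type.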

tschaffter commented 3 years ago

> Dates have YEAR, MONTH, DAY

Dates do not have sub-types. Example: 12/31/2020

> person names have GIVEN NAME, FAMILY NAME, MIDDLE NAME (or maybe sometimes DOCTOR and PATIENT?)

Having a subType for a patient would be confusing. The term "sub-type" is not very descriptive.

While de-identification is the primary application that we have in mind for the annotators, we should try to keep them as agnostic as possible to any specific application.
