Closed AmyOlex closed 6 years ago
Egoitz said the span should include the ordinal characters, so "7th" is the correct raw token, but the value must be "7". Looking back, the only gold standard file with this issue is ID011_clinic_031. I'm emailing him to let him know.
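To make the span-versus-value distinction concrete, here is a minimal Python sketch. It is not Chrono's actual code: the function name `find_day_of_month`, the regular expression, and the tuple layout are all hypothetical, chosen only to illustrate keeping the ordinal suffix in the span while stripping it from the value.

```python
import re

# Hypothetical pattern (not Chrono's): a 1-2 digit day with an optional ordinal suffix.
_DAY_RE = re.compile(r"\b(\d{1,2})(st|nd|rd|th)?\b", re.IGNORECASE)

def find_day_of_month(text):
    """Return (start, end, raw_token, value) for each day-like token.

    The span (start, end) and raw_token keep the ordinal suffix;
    the value is the bare number as a string.
    """
    results = []
    for m in _DAY_RE.finditer(text):
        value = m.group(1)
        if 1 <= int(value) <= 31:  # ignore numbers that cannot be a day of month
            results.append((m.start(), m.end(), m.group(0), value))
    return results
```

Under these assumptions, the phrase "seen on October 7th." would yield a raw token of "7th" with the value "7", and "until March 8 or 9th" would yield "8" and "9th" with the values "8" and "9".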
The gold standard files are inconsistent about including the ordinal characters in the span for DayOfMonth entities. For example, in file ID011_clinic_031, for the phrase "seen on October 7th." the gold standard only identifies the "7", whereas Chrono returns "7th". However, in other files the gold standard returns the full "7th"-style token. For example, in file ID051_clinic_148, for the phrase "until March 8 or 9th" the gold standard returns "9th" and not "9". I will be emailing Egoitz to figure out which is correct. When I changed the code to drop the ordinal characters and ran it on the testing files, we ended up doing worse because we were no longer returning the full ordinal value as the day.