microsoft / Recognizers-Text

Microsoft.Recognizers.Text provides recognition and resolution of numbers, units, date/time, etc. in multiple languages (ZH, EN, FR, ES, PT, DE, IT, TR, HI, NL. Partial support for JA, KO, AR, SV). Packages available at: https://www.nuget.org/profiles/Recognizers.Text, https://www.npmjs.com/~recognizers.text
MIT License

[ES DateTimeV2] Incorrect extraction of entities in Spanish input, over aggressive match boundaries #2116

Closed. HanLiMS closed this issue 4 years ago.

HanLiMS commented 4 years ago

Issue: The Spanish model returns an incorrect DateTime entity.

Sample input: "El gobernador de California, Gavin Newsom, describió el martes el plan de reapertura por etapas para su estado, con negocios minoristas y escuelas a \"semanas de distancia\" en base a una aparente estabilización tanto en el número de casos confirmados como en muertes por coronavirus."

Response: Detected "martes el plan de reapertura por etapas para su estado, con negocios minoristas y escuelas a \"semanas de distancia\" en base a una " as "DateTime". This span should not be extracted as a DateTime entity.

tellarin commented 4 years ago

There are multiple issues here, but the root cause seems to be linked to an incorrect extraction of "una" as a time.
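
To illustrate the kind of ambiguity involved, here is a minimal, self-contained sketch. It is not the library's actual pattern or code: the regexes and the helper `find_times` are hypothetical and exist only for this example. In Spanish, "una" is usually the indefinite article, and it denotes an hour mainly in phrases such as "a la una" (at one o'clock). A pattern that accepts a bare "una" as a time will therefore match inside ordinary prose, which could then let a longer date/time span stretch from "martes" up to that false match.

```python
import re

# Hypothetical naive pattern: any standalone "una" is treated as the hour "one".
NAIVE_TIME = re.compile(r"\buna\b", re.IGNORECASE)

# Hypothetical stricter pattern: "una" counts as a time only when preceded by "la",
# as in "a la una" or "es la una".
CONTEXTUAL_TIME = re.compile(r"\bla\s+una\b", re.IGNORECASE)


def find_times(pattern, text):
    """Return every substring of `text` matched by `pattern`."""
    return [m.group(0) for m in pattern.finditer(text)]


# Fragment taken from the sample input in this issue ("una" is an article here).
article_text = "en base a una aparente estabilización"
# Illustrative sentence where "una" really is a clock time.
clock_text = "La reunión es a la una"

naive_hits = find_times(NAIVE_TIME, article_text)
contextual_hits_article = find_times(CONTEXTUAL_TIME, article_text)
contextual_hits_clock = find_times(CONTEXTUAL_TIME, clock_text)

print(naive_hits)
print(contextual_hits_article)
print(contextual_hits_clock)
```

The naive pattern reports a match in the article fragment, while the contextual one matches only the clock sentence. An actual fix in the recognizer may well take a different approach; this only shows why requiring surrounding context can remove the false positive.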