AmyOlex / Chrono

Parsing time normalizations from text.
GNU General Public License v3.0

formatted date methods need to utilize .group(0) when returning parsed text. #33

Closed AmyOlex closed 6 years ago

AmyOlex commented 6 years ago

The hasYear() method brought this to my attention. Some of the methods that parse formatted dates return the original token text instead of the matched text. The original text includes any trailing punctuation, which breaks the integer conversion later on. These methods need to return the match's .group(0) so that the punctuation is not included.
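To illustrate the problem described above, here is a minimal sketch using Python's `re` module. The pattern `YEAR_PATTERN` and the helper `extract_year` are hypothetical names made up for this example; they are not Chrono's actual code, which may match years differently.

```python
import re

# Hypothetical pattern for illustration: any run of exactly four digits.
YEAR_PATTERN = re.compile(r"\d{4}")


def extract_year(token):
    """Return the four-digit year found in token as an int, or None."""
    match = YEAR_PATTERN.search(token)
    if match is None:
        return None
    # group(0) is only the substring the pattern matched, so punctuation
    # attached to the token (for example a trailing period) is left out.
    return int(match.group(0))
```

Calling `int("1998.")` on the raw token raises a `ValueError`, while `extract_year("1998.")` converts cleanly because only the matched digits reach `int()`.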

maffeyl commented 6 years ago

Reworked HasYear, which also fixed some other issues with it. I haven't merged the changes yet, but the results improved. Still working on getting rid of the double year entities.

I got it to work by adding the following lines at line 641:

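# Requires "import string" at the top of the module.
# Translation table that deletes every ASCII punctuation character.
table = str.maketrans(dict.fromkeys(string.punctuation))

# Strip punctuation from the text before converting it to an integer year.
chrono_year_entity = chrono.ChronoYearEntity(entityID=str(chrono_id) + "entity", start_span=abs_StartSpan, end_span=abs_EndSpan, value=int(text.translate(table)))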
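For anyone reading along, the translation-table approach above can be tried in isolation. This is a standalone sketch: `PUNCT_TABLE` and `strip_punctuation` are hypothetical names for the example and do not appear in Chrono.

```python
import string

# dict.fromkeys maps each punctuation character to None, and
# str.maketrans turns that mapping into a table that deletes them.
PUNCT_TABLE = str.maketrans(dict.fromkeys(string.punctuation))


def strip_punctuation(text):
    """Return text with every ASCII punctuation character removed."""
    return text.translate(PUNCT_TABLE)
```

Note that this removes punctuation anywhere in the string, not only at the ends: `strip_punctuation("12/25/2008")` gives `"12252008"`. It therefore seems safe only when the text is already a lone year token such as `"2008,"`.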

maffeyl commented 6 years ago

Merged the changes. I will create a new issue for the duplicate entities.