HeidelTime / heideltime

A multilingual, cross-domain temporal tagger developed at the Database Systems Research Group at Heidelberg University.
GNU General Public License v3.0

Documentation for normalization function #67

Open parisni opened 7 years ago

parisni commented 7 years ago

Hi,

I haven't been able to find where normMonth, centuryGroup, and so on are defined.

Can someone please point me to the documentation, or at least to the Java code behind them?

Thanks

JannikStroetgen commented 7 years ago

I do not fully understand what you mean and what you do not understand, but did you have a look here:

https://github.com/HeidelTime/heideltime/wiki/Developing-Resources

If that does not help, please provide some more details about the issue(s) you are facing.

Cheers, Jannik

parisni commented 7 years ago

Hey

For example, line 29 of https://github.com/HeidelTime/heideltime/blob/master/resources/english/rules/resources_rules_daterules.txt has a centurygroup() function inside the normalisation part. Where are all of these functions listed, along with their behavior? And where are they defined in the HeidelTime Java code?

Thanks !

kno10 commented 7 years ago

There is no "centurygroup" function.

https://github.com/HeidelTime/heideltime/blob/master/src/de/unihd/dbs/uima/annotator/heideltime/HeidelTime.java#L2322

will match "group(6)" in "centurygroup(6)", keep the "century" part unchanged, and replace only the "group(6)" part with the content of regexp capture group 6, which here is %reYear2Digit. A later stage will then add century information to dates such as "UNDEF-century42": https://github.com/HeidelTime/heideltime/blob/master/src/de/unihd/dbs/uima/annotator/heideltime/HeidelTime.java#L825 All of this happens through string and regexp processing.
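
To make the substitution step concrete, here is a minimal sketch of the idea. It is not the HeidelTime source: the class name `GroupSubstitutionSketch`, the method `applyGroups`, and the apostrophe-year rule are hypothetical and only illustrate how a literal "group(N)" inside a normalization string can be replaced by a regexp capture group while the surrounding text (here "UNDEF-century") stays untouched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only; names and the example rule are assumptions, not HeidelTime code.
class GroupSubstitutionSketch {
    // Matches the literal text "group(N)" anywhere in a normalization template.
    private static final Pattern GROUP_REF = Pattern.compile("group\\((\\d+)\\)");

    // Replaces each "group(N)" in the template with capture group N of the rule match.
    static String applyGroups(String template, Matcher ruleMatch) {
        Matcher m = GROUP_REF.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int n = Integer.parseInt(m.group(1));
            String value = ruleMatch.group(n);
            // quoteReplacement keeps '$' and '\' in the captured text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value == null ? "" : value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical rule: a two-digit year after an apostrophe, e.g. "'42".
        Matcher rule = Pattern.compile("'(\\d\\d)").matcher("back in '42");
        if (rule.find()) {
            // "century" is plain text in the template; only "group(1)" is replaced.
            System.out.println(applyGroups("UNDEF-centurygroup(1)", rule)); // prints UNDEF-century42
        }
    }
}
```

The result is still an underspecified value; as described above, resolving the "UNDEF-century" prefix to an actual century is left to a later processing stage.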

parisni commented 6 years ago

@kno10 thanks a lot for that clear explanation.