HeidelTime / heideltime

A multilingual, cross-domain temporal tagger developed at the Database Systems Research Group at Heidelberg University.
GNU General Public License v3.0

German compounds consisting of weekday + time of day not extracted #25

Closed jzell closed 9 years ago

jzell commented 9 years ago
Hi,

While running HeidelTime on news texts, I encountered a type of temporal expression that
is currently not recognized: under the reformed German spelling rules, combinations
of a weekday (e.g., 'Montag') and a time of day (e.g., 'abend') are written as one word.
This holds for both nouns and adverbs, for instance: http://www.duden.de/rechtschreibung/Montagmorgen

HeidelTime (tested version: 1.8) currently doesn't extract these temporal expressions.
In the following sentence, only 'Mittwoch' is extracted and correctly normalized; all
other temporal expressions are missed:

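"Am Montagabend hat Peter telefoniert. Am Dienstagabend auch. Am Mittwoch auch. Montagmorgens
wird er ebenfalls telefonieren."
(English: "On Monday evening Peter made a phone call. On Tuesday evening too. On Wednesday too.
On Monday mornings he will make phone calls as well.")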
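
To make the gap concrete, here is a minimal Python sketch of the kind of pattern that would cover these compounds. It is not HeidelTime's rule syntax or its German resources; the word lists and the names WEEKDAYS, PARTS_OF_DAY and find_expressions are assumptions made up for this illustration.

```python
import re

# Hypothetical word lists for illustration only; these are not
# HeidelTime's actual German resource files.
WEEKDAYS = ["Montag", "Dienstag", "Mittwoch", "Donnerstag",
            "Freitag", "Samstag", "Sonntag"]
PARTS_OF_DAY = ["morgen", "vormittag", "mittag", "nachmittag", "abend", "nacht"]

# A weekday, optionally fused with a part of day (noun compound such as
# "Montagabend"), optionally followed by the adverbial "-s" ("Montagmorgens").
COMPOUND = re.compile(
    r"\b(?P<weekday>" + "|".join(WEEKDAYS) + r")"
    r"(?:(?P<part>" + "|".join(PARTS_OF_DAY) + r")(?P<adverb>s?))?\b"
)

def find_expressions(text):
    """Return all plain weekdays and weekday + part-of-day compounds in text."""
    return [m.group(0) for m in COMPOUND.finditer(text)]

sample = ("Am Montagabend hat Peter telefoniert. Am Dienstagabend auch. "
          "Am Mittwoch auch. Montagmorgens wird er ebenfalls telefonieren.")
print(find_expressions(sample))
```

On the sample sentence this finds all four expressions instead of only 'Mittwoch'. Bare adverbs such as 'montags' are deliberately out of scope here.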

Attached you can find the entered command and HeidelTime's output.

Maybe you can find time to add this feature at some point :)

Original issue reported on code.google.com by boegel.thomas on 2015-01-13 09:17:33


jzell commented 9 years ago
We opened a new branch for improving the German resources and added a rule for such compounds
in this branch. Before the next release, we will merge this branch into the default
branch.
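
Besides matching the compound, such a rule also has to normalize it. The following Python sketch shows one way that could look; it is not the rule added in the branch. The TIMEX3-style value format (e.g. XXXX-WXX-1TEV for an unspecified Monday evening), the lookup tables and the name normalize are assumptions for illustration, and the sketch ignores that adverbial forms like 'Montagmorgens' denote recurring times.

```python
import re

# Hypothetical lookup tables for illustration only.
WEEKDAY_NUM = {"Montag": 1, "Dienstag": 2, "Mittwoch": 3, "Donnerstag": 4,
               "Freitag": 5, "Samstag": 6, "Sonntag": 7}
# Part-of-day codes in the style of TIMEX3 values (assumed mapping).
PART_CODE = {"morgen": "MO", "mittag": "MI", "nachmittag": "AF",
             "abend": "EV", "nacht": "NI"}

# Whole-token pattern: weekday, optional part of day, optional adverbial "-s".
PATTERN = re.compile(
    r"^(?P<weekday>" + "|".join(WEEKDAY_NUM) + r")"
    r"(?:(?P<part>" + "|".join(PART_CODE) + r")s?)?$"
)

def normalize(token):
    """Map a weekday or weekday compound to an underspecified value, or None."""
    m = PATTERN.match(token)
    if m is None:
        return None
    value = "XXXX-WXX-%d" % WEEKDAY_NUM[m.group("weekday")]
    if m.group("part"):
        value += "T" + PART_CODE[m.group("part")]
    return value

for token in ["Montagabend", "Dienstagabend", "Mittwoch", "Montagmorgens"]:
    print(token, normalize(token))
```

Resolving the underspecified year and week against the document creation time would be a separate step and is not shown.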

Original issue reported on code.google.com by jannik.stroetgen on 2015-01-13 14:10:09

jzell commented 9 years ago
Revisions rcec1d9a91325, r3f6d2d64ca93, and rf6a43f938455 address this sample; the expressions
are now extracted correctly.

Original issue reported on code.google.com by zell@informatik.uni-heidelberg.de on 2015-01-13 15:57:23

jzell commented 9 years ago

Imported issue, closing.