Closed marcverhagen closed 7 years ago
Probably the best way to deal with this is to update the chunking given the results of GUTime. For example, the phrase "for the fourth quarter ended Aug. 26." (wsj_0263) is chunked as follows:
[for the fourth quarter ended Aug.]NG 26.
So when GUTime finds "Aug. 26", it does enter it into the TagRepository, but it is not added to the TarsqiTree when later components apply, and it therefore cannot be linked. If we could adjust the chunking, we would end up with one of the following TarsqiTree fragments:
[for the [fourth quarter]timex3 ended [Aug. 26]timex3]NG.
for the [[fourth quarter]timex3]NG ended [[Aug. 26]timex3]NG.
The second one is probably the more useful one.
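A rough sketch of how the second option could be computed, assuming chunks and timexes are available as (start, end) token-index spans with an exclusive end. The function names and the span representation are made up for illustration and are not the toolkit's actual data structures. Any noun chunk that overlaps a timex is dropped and each timex gets its own NG chunk:

```python
def rechunk_around_timexes(chunks, timexes):
    """Return chunk spans adjusted so that every timex sits in its own chunk.

    Hypothetical helper: spans are (start, end) token indices, end exclusive.
    Chunks that overlap any timex are discarded, then one chunk per timex
    is added.
    """
    def overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]

    kept = [c for c in chunks if not any(overlaps(c, t) for t in timexes)]
    return sorted(kept + list(timexes))


def render(tokens, chunks, timexes):
    """Print-friendly bracket notation, same style as the fragments above."""
    pieces = []
    for i, token in enumerate(tokens):
        opening = "[" * sum(1 for start, _ in chunks if start == i)
        opening += "[" * sum(1 for start, _ in timexes if start == i)
        closing = "]timex3" * sum(1 for _, end in timexes if end == i + 1)
        closing += "]NG" * sum(1 for _, end in chunks if end == i + 1)
        pieces.append(opening + token + closing)
    return " ".join(pieces)


# The wsj_0263 phrase, tokenized by hand for this example.
tokens = ["for", "the", "fourth", "quarter", "ended", "Aug.", "26", "."]
chunks = [(0, 6)]            # [for the fourth quarter ended Aug.]NG
timexes = [(2, 4), (5, 7)]   # "fourth quarter" and "Aug. 26"

new_chunks = rechunk_around_timexes(chunks, timexes)
print(render(tokens, new_chunks, timexes))
```

This throws away the rest of the original chunk ("for the ... ended"), which matches the second fragment but may be too aggressive for chunks where the non-timex part is still a useful noun group.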
Closed this, but added a new issue specific to the chunking in #76.
The process for inserting tags from GUTime is not very smart. It uses the tree.Node.insert() method, which is not really intended for this and cannot deal with new TIMEX3 tags that do not neatly agree with the chunks from the preprocessor. First make sure that insert does something more helpful than just printing a generic warning when it fails to insert a tag, then figure out a way to deal with more cases.
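As a sketch of the first step, here is what a more informative failure could look like. This is not the existing tree.Node.insert() code: the Node class, its begin/end character offsets, and CrossingTagError are all assumptions made up for this example. The idea is that insert reports which existing node the new tag crosses instead of emitting a generic warning:

```python
class CrossingTagError(ValueError):
    """Raised when a new tag partially overlaps an existing node."""


class Node:
    # Hypothetical stand-in for a tree node, not the toolkit's class.
    def __init__(self, name, begin, end):
        self.name = name
        self.begin = begin   # character offset, inclusive
        self.end = end       # character offset, exclusive
        self.children = []

    def __repr__(self):
        return f"<{self.name} {self.begin}-{self.end}>"

    def insert(self, tag):
        """Insert tag at the deepest node that fully contains it."""
        if not (self.begin <= tag.begin and tag.end <= self.end):
            raise CrossingTagError(f"cannot insert {tag!r}: not inside {self!r}")
        # Descend if one child fully contains the new tag.
        for child in self.children:
            if child.begin <= tag.begin and tag.end <= child.end:
                return child.insert(tag)
        # Children fully covered by the tag will move below it.
        inside = [c for c in self.children
                  if tag.begin <= c.begin and c.end <= tag.end]
        # Anything else that overlaps is a crossing bracket: report it.
        crossing = [c for c in self.children
                    if c not in inside and c.begin < tag.end and tag.begin < c.end]
        if crossing:
            names = ", ".join(repr(c) for c in crossing)
            raise CrossingTagError(f"cannot insert {tag!r}: it crosses {names}")
        self.children = [c for c in self.children if c not in inside]
        tag.children.extend(inside)
        self.children.append(tag)
        self.children.sort(key=lambda c: c.begin)


# "for the fourth quarter ended Aug. 26." with the chunk ending after "Aug."
sentence = Node("s", 0, 37)
chunk = Node("NG", 0, 33)
sentence.insert(chunk)
sentence.insert(Node("TIMEX3", 8, 22))        # "fourth quarter" fits in the chunk
try:
    sentence.insert(Node("TIMEX3", 29, 36))   # "Aug. 26" crosses the chunk
except CrossingTagError as error:
    message = str(error)
    print(message)
```

With an error like this the caller knows exactly which chunk is in the way, which is the information needed for the second step of deciding whether to split, extend, or drop that chunk.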