cnorthwood / ternip

Temporal Expression Recognition and Normalisation in Python

DCT detection from filename #7

Open leondz opened 13 years ago

leondz commented 13 years ago

It's possible to extract DCT (at day granularity) from filenames - is this attempted?

From TimeBank:

VOA19980331.1700.1533.tml WARNING: Could not determine document creation time, use -c to override

cnorthwood commented 13 years ago

No, it's not attempted. Could be useful though.

leondz commented 13 years ago

I've written a small module for managing DCT detection. It works flawlessly on the ~240 docs in TimeBank + ATC, as well as on the 1.8 million in the TAC KBP source collection (save one file which really has no explicit DCT information, and very little inferable either). I'll work on integrating this and adding a fallback "guess DCT from filename / doc content" option, used when no value for -c is specified and no other information is available.
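
For illustration only, here is a rough sketch of what the filename part of such a fallback might look like. It is not the module mentioned above and not TERNIP code: the names `guess_dct_from_filename` and `resolve_dct` are hypothetical, and it assumes the date appears in the filename as an eight-digit YYYYMMDD run, as in the TimeBank example from this thread.

```python
import re
from datetime import date
from typing import Optional

# Assumption: the date is an eight-digit run (YYYYMMDD) that is not part of a
# longer digit sequence, as in "VOA19980331.1700.1533.tml".
_DATE_IN_NAME = re.compile(r"(?<!\d)(\d{4})(\d{2})(\d{2})(?!\d)")


def guess_dct_from_filename(filename: str) -> Optional[date]:
    """Return the first valid calendar date found in the filename, or None."""
    for match in _DATE_IN_NAME.finditer(filename):
        year, month, day = (int(group) for group in match.groups())
        try:
            return date(year, month, day)
        except ValueError:
            # Eight digits that are not a real date (e.g. month 13); keep looking.
            continue
    return None


def resolve_dct(explicit: Optional[date], filename: str) -> Optional[date]:
    """Prefer an explicitly supplied DCT; otherwise fall back to the filename guess."""
    if explicit is not None:
        return explicit
    return guess_dct_from_filename(filename)


if __name__ == "__main__":
    print(guess_dct_from_filename("VOA19980331.1700.1533.tml"))
```

This only gives day granularity, and a filename with no eight-digit date yields `None`, so guessing from document content would still be needed as a further fallback.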