The Evita component that imports existing code has a problem with phrases like "hypodense lesion within the caudate lobe": it imports that event twice. The reason is that the event is spread out over two chunks:
[hypodense lesion]ng within [the caudate lobe]ng
Evita considers both 'lesion' and 'lobe' as events, and in both cases it finds the imported event and installs it on the chunk. One fix is to limit event import to those cases where the head of the imported event falls within the chunk Evita is looking at. This will, however, miss cases where the head of the imported event does not appear in any chunk.
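A minimal sketch of the head-within-chunk check, not the actual Evita code. The `Span` class, the `head_in_chunk` helper and the character offsets are hypothetical, and it assumes that 'lesion' is the head of the imported event, which is not stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical character span; `end` is exclusive."""
    begin: int
    end: int


def head_in_chunk(head: Span, chunk: Span) -> bool:
    """Return True if the head span lies entirely within the chunk span."""
    return chunk.begin <= head.begin and head.end <= chunk.end


# Offsets into "hypodense lesion within the caudate lobe"
chunks = [Span(0, 16), Span(24, 40)]  # [hypodense lesion]ng, [the caudate lobe]ng
head = Span(10, 16)                   # 'lesion', assumed head of the imported event

# Only chunks that contain the head are allowed to host the imported event.
hosts = [chunk for chunk in chunks if head_in_chunk(head, chunk)]
```

With this check only the first chunk hosts the event. If the head fell outside every chunk, `hosts` would be empty, which is the missed case described above.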
Another option is to do a post-screening of all events added and remove the duplicates.
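The post-screening could look roughly like the sketch below. It assumes that events are available as dictionaries with `begin` and `end` offsets and that two events with the same offsets are duplicates. Both the representation and the `remove_duplicate_events` name are made up for illustration and are not taken from the Evita code.

```python
def remove_duplicate_events(events):
    """Keep the first event for each (begin, end) pair and drop later repeats."""
    seen = set()
    unique = []
    for event in events:
        key = (event["begin"], event["end"])  # hypothetical identity of an event
        if key not in seen:
            seen.add(key)
            unique.append(event)
    return unique


# The same imported event installed on two different chunks.
added = [
    {"begin": 0, "end": 40, "chunk": "hypodense lesion"},
    {"begin": 0, "end": 40, "chunk": "the caudate lobe"},
]
screened = remove_duplicate_events(added)
```

This keeps whichever copy was installed first, so the surviving event may not be the one on the chunk that contains the head.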
Incidentally, this issue also causes problems for the alignment code in testing/evaulate.py, resulting in the following alignment:
hypodense lesion within the caudate lobe - hypodense lesion within the caudate lobe
None - hypodense lesion within the caudate lobe
and then counting the second alignment as a false positive.
This is mostly solved after some additions to the chunker, but some of the above suggestions and other changes may be worthwhile:
[ ] do a post screening of all events added and remove duplicates
[ ] add imported events even if you cannot find a place for them in the chunks, but make sure that they are ignored later, when events are imported into the TarsqiTree.