gwf-uwaterloo / waterwheel

WATERloo Water and Hydrologic Entity Extractor and Linker
Apache License 2.0

Handle conjunctions #14

Open lintool opened 4 years ago

lintool commented 4 years ago

We need to be able to handle conjunctions like "the Mississippi and Missouri Rivers".

lintool commented 4 years ago

And lakes also, e.g.,

When a second fault line, the Saint Lawrence rift, formed approximately 570 million years ago,[15] the basis for Lakes Ontario and Erie were created, along with what would become the Saint Lawrence River

From https://en.wikipedia.org/wiki/Great_Lakes

Govind9 commented 4 years ago

Even with the current set of rules, both rivers in the phrase "The Mississippi and Missouri Rivers" are identified, although as separate entities. So conjunctions like these won't go undetected.

Is this acceptable, or do we want the whole conjunction to be recognized as one entity?
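
To make the two options concrete, here is a minimal sketch (not the project's actual rules) that reports the whole conjunction as one span and also expands it into the individual entities. Everything in it is an assumption for illustration: the `Conjunction` record, the `find_conjunctions` helper, the tiny `TYPES` table, and the capitalized-word regex used to guess names. It only covers two-name conjunctions joined by "and".

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical plural -> singular type words; a real system would use a fuller list.
TYPES = {"Rivers": "River", "Lakes": "Lake"}

# Rough guess at a proper name: one or more capitalized words.
NAME = r"[A-Z][\w'-]*(?: [A-Z][\w'-]*)*"

# "Mississippi and Missouri Rivers": names first, plural type word last.
# The lookahead keeps a sentence-initial "The" out of the first name.
TRAILING = re.compile(
    rf"\b(?!The\b)(?P<first>{NAME}) and (?P<second>{NAME}) (?P<type>Rivers|Lakes)\b"
)
# "Lakes Ontario and Erie": plural type word first, names after.
LEADING = re.compile(
    rf"\b(?P<type>Rivers|Lakes) (?P<first>{NAME}) and (?P<second>{NAME})"
)


@dataclass
class Conjunction:
    start: int           # character offset where the whole conjunction starts
    end: int             # character offset just past the conjunction
    text: str            # the conjunction as it appears in the sentence
    entities: List[str]  # the individual entities it expands to


def find_conjunctions(sentence: str) -> List[Conjunction]:
    """Return each conjunction span together with its expanded entities."""
    results = []
    for pattern, type_first in ((TRAILING, False), (LEADING, True)):
        for m in pattern.finditer(sentence):
            singular = TYPES[m.group("type")]
            names = (m.group("first"), m.group("second"))
            if type_first:
                entities = [f"{singular} {name}" for name in names]
            else:
                entities = [f"{name} {singular}" for name in names]
            results.append(Conjunction(m.start(), m.end(), m.group(0), entities))
    return sorted(results, key=lambda c: c.start)


for found in find_conjunctions("The Mississippi and Missouri Rivers"):
    print(found.text, "->", found.entities)
# Mississippi and Missouri Rivers -> ['Mississippi River', 'Missouri River']
```

Keeping both the span and the expanded names means the caller can link each river or lake separately while still knowing they came from a single mention.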

lintool commented 4 years ago

Hi @Govind9, I'd like to play with the system myself once you have it properly refactored. I also want to consider these cases in the context of unit tests.