CogComp / cogcomp-nlp

CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.
http://nlp.cogcomp.org/
Other
470 stars 142 forks source link

Issue 665 #687

Open ChaseDuncan opened 6 years ago

ChaseDuncan commented 6 years ago

Closes #665

mssammon commented 6 years ago

@mayhewsw , there were ways to add constituents that sidestepped the overlap check. This issue is intended to fix the problem.

mayhewsw commented 6 years ago

Ah, I see it now. Yes, looks good.

cogcomp-dev commented 6 years ago

@ChaseDuncan one test is failing in CI (teamcity build): please check into it and comment here if it is not related to the changes you made.

danyaljj commented 6 years ago

TeamCity is failing: http://morgoth.cs.illinois.edu:5800/viewLog.html?buildId=762&buildTypeId=CogcompNlp_Build&tab=buildLog&filter=err

ChaseDuncan commented 6 years ago

It looks like there are more places where this was being done unintentionally. Will look into it...

mssammon commented 6 years ago

@ChaseDuncan looks like CI build is failing -- please take a look...

ChaseDuncan commented 6 years ago

@mssammon I think the problem is a conflict in the design of mention detection and the some of the other CCG software like BasicAnnotatorService.

For example:

By default MD appears to create overlapping spans, e.g. [NAM-ORG the CDC ] [NOM-PER the CDC 's Dr. ] [NAM-PER the CDC 's Dr. Michael Jhung ] .

For some (probably good) reason I don't understand, there's some View copying happening in BasicAnnotatorService. By default it creates Views which do not allow overlapping spans. So when it tries to copy the constituents from the MENTION View into a new View instance, it dies. Previously this was done using addConstituent(Constituent) so the allowOverlappingSpans check was circumvented.

It's not immediately clear to me how best to address this since the problem involves a bunch of complex system for which I have basically no knowledge. But, if no one else knows what to do, I can look into the code and see what I can figure out.

mssammon commented 6 years ago

I think it is reasonable to change any code that violates the assumption by changing it to explicitly allow overlapping constituents -- since that is going on anyway. For posterity, you could create a list of any such places you change, and an issue that notifies everyone these were changed -- that would ensure (in theory) that someone checks each one. If you want to make this your issue for the next k weeks, please go ahead and then ask anyone who seems likely to have a clue for each given component. @Slash0BZ and @danyaljj are likely candidates for who to ask w.r.t. the Mention view.

Slash0BZ commented 6 years ago

Ah I didn't know that SpanLabelView by design discourages overlapped mentions. In MD, complete mention spans can overlap with each other, but mention heads cannot. I think we either need a special case for SpanLabelView that allows overlapped mentions, or I can try to only add mention heads to the view, and use separate function to generate the complete mention using information stored in the head.

mssammon commented 6 years ago

@Slash0BZ I think the mention view should allow overlap, and that you should provide a utility method in the relevant code that verifies no constraints are violated.