CogComp / cogcomp-nlpy

CogComp's light-weight Python NLP annotators
http://nlp.cogcomp.org/

add functionality to get overlapping constituents #53

Closed danyaljj closed 7 years ago

danyaljj commented 7 years ago

Suppose you choose a constituent from the dependency view. How can you get its lemma? We need to implement functionality similar to what we have in our Java library: https://github.com/CogComp/cogcomp-nlp/blob/35dea894ea9b02d1158dafec53c941c4e40b7547/core-utilities/src/main/java/edu/illinois/cs/cogcomp/core/datastructures/textannotation/View.java#L400

GHLgh commented 7 years ago

Users can already do something similar to the functionality you posted:

# Suppose the user has already obtained the following:
#   dep_cons    a constituent from the dependency view
#   lemma_view  the lemma view

# Collect the lemma constituents that overlap dep_cons by token position.
# Two spans overlap when each one starts no later than the other ends;
# this single test covers all four cases (overlap at either edge,
# lemma inside dep_cons, dep_cons inside lemma).
lemma_overlapping_span = []
for lemma in lemma_view:
    if lemma['start'] <= dep_cons['end'] and lemma['end'] >= dep_cons['start']:
        lemma_overlapping_span.append(lemma)

print(lemma_overlapping_span)

danyaljj commented 7 years ago

Yeah, we should probably generalize it and make it a function on the view.

GHLgh commented 7 years ago

Okay, following the same function declaration as the one you mentioned? (e.g., some_view.get_overlapping_constituents(start_token_index, end_token_index))

danyaljj commented 7 years ago

Yup
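
For reference, the helper agreed on above could be sketched roughly as follows. This is only an illustration, not the code that was merged: the `View` class and its `cons_list` attribute are hypothetical stand-ins for the library's view object, and constituents are assumed to be dicts with `start` and `end` token indices, as in the snippet earlier in the thread. The comparison keeps that snippet's convention, where spans that merely touch count as overlapping; if `end` is exclusive, strict inequalities would be needed instead.

```python
class View:
    """Hypothetical stand-in for a view: a plain list of constituent dicts."""

    def __init__(self, constituents):
        # Assumption: each constituent is a dict with 'start' and 'end' keys.
        self.cons_list = list(constituents)

    def get_overlapping_constituents(self, start_token_index, end_token_index):
        """Return the constituents whose token span overlaps the given span.

        Uses the same closed-interval test as the snippet in the thread:
        touching boundaries are treated as an overlap.
        """
        return [
            c for c in self.cons_list
            if c['start'] <= end_token_index and c['end'] >= start_token_index
        ]


# Usage with made-up data (labels and spans are illustrative only).
lemma_view = View([
    {'start': 0, 'end': 1, 'label': 'the'},
    {'start': 2, 'end': 3, 'label': 'dog'},
    {'start': 5, 'end': 6, 'label': 'bark'},
])
dep_cons = {'start': 2, 'end': 3}

overlapping = lemma_view.get_overlapping_constituents(dep_cons['start'], dep_cons['end'])
print([c['label'] for c in overlapping])  # ['dog']
```

Taking token indices rather than a constituent keeps the signature in line with the one proposed in the thread, so a constituent from any view can be passed through its `start` and `end` values.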