davidberenstein1957 / crosslingual-coreference

A multi-lingual approach to AllenNLP CoReference Resolution along with a wrapper for spaCy.
MIT License

How do we get character ranges of the clusters #17

Closed sudarshansivakumar closed 1 year ago

sudarshansivakumar commented 1 year ago

Right now, when we call predictor.predict() we get the clusters as a list of lists, and the cluster heads along with their token indices. Is it possible to :

davidberenstein1957 commented 1 year ago

That would be possible. @Masboes, this would be a good first issue to pick up.

shmouelsamares commented 1 year ago

In spaCy, you could use the token.idx property to get a token's first character index, then token.idx + len(token) to get the index just past its last character (an exclusive end offset, so text[start:end] yields the token). Is that useful?
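A minimal sketch of that idea, in case it helps. It assumes the clusters are lists of inclusive `[start_token, end_token]` pairs (AllenNLP-style output; the exact shape returned by `predictor.predict()` is not confirmed in this thread), and the helper name `cluster_char_spans` is hypothetical, not part of the library. It only relies on each token exposing `.idx` and `len()`, which spaCy tokens do.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]


def cluster_char_spans(doc: Sequence, clusters: List[List[Span]]) -> List[List[Span]]:
    """Convert token-index mention spans to character-offset spans.

    Assumption: each mention is an inclusive (start_token, end_token) pair.
    `doc` can be a spaCy Doc or any sequence whose items have `.idx`
    (offset of the first character) and support `len()`.
    """
    char_clusters = []
    for cluster in clusters:
        char_spans = []
        for start_tok, end_tok in cluster:
            first, last = doc[start_tok], doc[end_tok]
            start_char = first.idx            # first character of the mention
            end_char = last.idx + len(last)   # exclusive end of the mention
            char_spans.append((start_char, end_char))
        char_clusters.append(char_spans)
    return char_clusters
```

With a spaCy `doc`, each returned pair `(start, end)` can be used to slice the original string as `doc.text[start:end]`, or passed to `doc.char_span(start, end)` to get a `Span` back.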

davidberenstein1957 commented 1 year ago

@sudarshansivakumar @shmouelsamares you can see a working example here.