kentonl / e2e-coref

End-to-end Neural Coreference Resolution
Apache License 2.0

What does the cluster in the json file mean? #59

Open MENGHAH opened 5 years ago

MENGHAH commented 5 years ago

I generated the jsonlines files with setup_training.sh, but I don't understand what the clusters in the JSON files mean. Can you explain it to me?

rainyrainyguo commented 4 years ago

I have the same question

oapandit commented 4 years ago

Hello,

I had the same question, and it took me some time to figure it out.

These are the coreference clusters from the original gold files. The "clusters" key in each jsonlines entry holds a list of the clusters in the original file, and each cluster is a list of mentions. Each mention is represented by its start and end token indices in the original file.

For example, in test.jsonlines, the first entry is for the file "bc/cctv/00/cctv_0005_0". Its clusters are [[[57, 59], [25, 27], [42, 44]], [[19, 23], [16, 16]], [[83, 83], [82, 82]]].

Here, the mentions [19, 23] and [16, 16] are in the same cluster. The other two clusters are [57, 59], [25, 27], [42, 44] and [83, 83], [82, 82]. The mention

'the', 'Chinese', 'securities', 'regulatory', 'department'

is represented by its start and end indices, [19, 23]. Both indices are inclusive, so this span covers five tokens. The same applies to the other mentions.
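To make the index convention concrete, here is a minimal sketch that turns each [start, end] span back into text. It assumes each jsonlines entry also has a "sentences" field holding a list of token lists, and that mention indices count tokens across the whole document with an inclusive end. The record below is a made-up toy document, not data from test.jsonlines, and `resolve_clusters` is a hypothetical helper name.

```python
import json


def resolve_clusters(doc):
    """Return the clusters of `doc` with each [start, end] span replaced by its text."""
    # Flatten the per-sentence token lists into one document-level list,
    # because mention indices are assumed to run across sentence boundaries.
    tokens = [tok for sent in doc["sentences"] for tok in sent]
    return [
        # The end index is inclusive, hence end + 1 in the slice.
        [" ".join(tokens[start:end + 1]) for start, end in cluster]
        for cluster in doc["clusters"]
    ]


# Toy record for illustration only (assumed layout: one JSON object per line).
line = json.dumps({
    "doc_key": "toy/doc_0",
    "sentences": [
        ["The", "regulator", "said", "it", "would", "act", "."],
        ["It", "did", "."],
    ],
    "clusters": [[[0, 1], [3, 3], [7, 7]]],
})

doc = json.loads(line)
for cluster in resolve_clusters(doc):
    print(cluster)
```

To run it on a real file, you would read the file line by line and call `json.loads` on each line, since every line is assumed to be a separate document.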

I hope this helps other people as well.

Thanks, Onkar

liyaoshigehaoren commented 4 years ago

@MENGHAH Could you please send me these files: train.jsonlines, test.jsonlines, and dev.jsonlines?