izaskr / chart_descriptions

[Inter-annotator alignment] Labeling pronouns and other anaphoric expressions #45

Open izaskr opened 4 years ago

izaskr commented 4 years ago

Bars are often referred to with pronouns, but labeling pronouns in the same way as the full names of bars would require more manual effort. We hope the language model will learn to generate anaphoric expressions on its own. For example:

"Next is Spain <x_axis_label_highest_value>. It has a gender pay gap of 15 %."

Here, "It" is left unlabeled.

izaskr commented 4 years ago

Update: we decided to be more precise and run an off-the-shelf coreference resolution tool. This can help us keep track of entities and their respective labels.

Two tools (both pre-trained neural systems) have been experimented with briefly:

Both are relatively easy to use, but based on a handful of examples from our dataset, they differ in performance.

Example sentence:

"This chart shows that the majority of people in Zarqa prefer reading a book in an evening. Although this would suggest they may prefer solitude"

Output from tools:

Another example:

In Europe spending was £270 million, in Asia it was £180 million and Africa was the highest with £290 million.

Conclusion: based on this, I suggest using SpanBERT to find coreferences and labeling the cases where the head reference has a label.
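To make the proposed labeling step concrete, here is a minimal sketch of how a label could be copied from a head reference to the other mentions in its coreference cluster. It does not call any coreference tool. It assumes the tool's output has already been converted into clusters of inclusive token spans, and that "head reference" means the first mention in a cluster that already carries a label. The function name `propagate_labels` and both data structures are hypothetical and only for illustration.

```python
from typing import Dict, List, Tuple

# Assumption: a mention is an inclusive (start, end) pair of token indices.
Span = Tuple[int, int]


def propagate_labels(
    clusters: List[List[Span]], labels: Dict[Span, str]
) -> Dict[Span, str]:
    """Copy the label of a cluster's first labeled mention to its unlabeled mentions."""
    result = dict(labels)
    for cluster in clusters:
        # Assumption: the first labeled mention acts as the head reference.
        head_label = next((labels[m] for m in cluster if m in labels), None)
        if head_label is None:
            continue  # no labeled mention, so the cluster stays unlabeled
        for mention in cluster:
            # Existing manual labels are never overwritten.
            result.setdefault(mention, head_label)
    return result


# Example from this thread: "Spain" is labeled by hand, "It" is not.
tokens = "Next is Spain . It has a gender pay gap of 15 % .".split()
clusters = [[(2, 2), (4, 4)]]  # hypothetical cluster linking "Spain" and "It"
labels = {(2, 2): "x_axis_label_highest_value"}

propagated = propagate_labels(clusters, labels)
for (start, end), label in sorted(propagated.items()):
    print(" ".join(tokens[start:end + 1]), "->", label)
```

Clusters without any labeled mention are skipped, which matches the suggestion to label only the cases where the head reference has a label.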

izaskr commented 4 years ago

Examples to discuss:

izaskr commented 4 years ago

The entire dataset (all summaries) has now been processed for coreferences. Some questions to discuss:

Some more examples talk about: