fhamborg / Giveme5W1H

Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?
Apache License 2.0
505 stars 87 forks

"newsCluster" item in json files in Giveme5W1H/examples/datasets/40er/data #46

Closed: googlx closed this issue 4 years ago

googlx commented 4 years ago

Thanks for your great work!

I have recently been working on a project to cluster news by event. The item named "newsCluster" in your dataset looks well suited to this purpose.

Is it extracted automatically? If so, would you mind sharing more details about how it is generated?

"newsCluster": {
    "CategoryId": 1,
    "Category": "world",
    "TopicId": 2,
    "Topic": "legancy",
    "EventId": 49,
    "Event": "las_vegas_shooting",
    "Url": "http://usa.chinadaily.com.cn/world/2017-10/03/content_32788252.htm"
}

fhamborg commented 4 years ago

Thank you! Regarding your question: I'm sorry, but unfortunately I can't tell you much. I do recall that we added this data for a news clustering project, but that was long ago.
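
The thread does not say how the labels were produced, but the existing "newsCluster" entries can still serve as ready-made event labels. Below is a minimal sketch of grouping the dataset files by event. It rests on assumptions: each .json file holds one article as a JSON object, and "newsCluster" is a top-level key shaped like the snippet quoted in the question. The function name group_by_event is hypothetical and not part of Giveme5W1H.

```python
import json
from collections import defaultdict
from pathlib import Path


def group_by_event(data_dir):
    """Group article files by the event label stored in "newsCluster".

    Assumption: every *.json file contains a single JSON object with a
    top-level "newsCluster" item holding "EventId" and "Event".
    """
    clusters = defaultdict(list)
    for path in sorted(Path(data_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            article = json.load(fh)
        # Skip files that are not objects or carry no cluster labels.
        if not isinstance(article, dict):
            continue
        cluster = article.get("newsCluster")
        if not isinstance(cluster, dict):
            continue
        key = (cluster.get("EventId"), cluster.get("Event"))
        clusters[key].append(path.name)
    return dict(clusters)
```

Calling group_by_event("Giveme5W1H/examples/datasets/40er/data") from a checkout of the repository would return a dictionary mapping (EventId, Event) pairs to lists of file names, provided the files match the assumed layout.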