JuliaGast / TGB2

Temporal Graph Benchmark project repo

Node and Edge features #5

Closed JuliaGast closed 2 months ago

JuliaGast commented 6 months ago
Columns, with one example row as values:

Event ID: 20180401-8302-65b2c3a686c6_SUPPORT
Event Date: 2018-04-01
Event Type: SUPPORT
Event Mode: NA
Intensity: 5
Quad Code: VERBAL COOPERATION
Contexts: rights_freedoms | territory
Actor Name: Ram Madhav
Actor Country: India
Actor COW: 750
Primary Actor Sector: GOV
Actor Sectors: GOV | MIL
Actor Title: secretary of defense
Actor Name Raw: Ram Madhav
Wikipedia Actor ID: Ram Madhav
Recipient Name: None; Dalai Lama
Recipient Country: None; None
Recipient COW: None; None
Primary Recipient Sector: None; REL
Recipient Sectors: None; REL
Recipient Title: None; buddhist missionary
Recipient Name Raw: followers; Dalai Lama
Wikipedia Recipient ID: None; Dalai Lama
Placename: Republic of India
City: None
District: None
Province: None
Country: IND
Latitude: 22
Longitude: 79
GeoNames ID: 1269750
Raw Placename: India; Tibet
Feature Type: PCLI
Source: Hindustan Times
Publication Date: 2018-04-01
Story People: Ram Madhav | Jawaharlal Nehru | Narendra Modi | Madhav | the Dalai Lama | the Dalai Lama 's | His Holiness
Story Organizations: BJP | the Central Tibetan Administration | Bharatiya Janata Party (
Story Locations: Republic of India | Tibet Autonomous Region
Language: English
Version: NGEC_coder-Vers001-b1-Run-001
JuliaGast commented 5 months ago

So far, each quadruple is [Actor Name, Event Type, Recipient Name, Event Date]: Actor Name is the head, Event Type the relation type, Recipient Name the tail, and Event Date the timestamp.
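To make the mapping concrete, here is a minimal sketch of the extraction, not the actual TGB2 preprocessing. It assumes the raw file is tab-separated with the column names listed above; the `to_quadruples` helper, the inline sample rows (including the `CONSULT` event type), and the rule of skipping rows with a missing actor or recipient are all assumptions made for illustration. Multi-valued entries such as `None; Dalai Lama` are not handled here.

```python
import csv
import io

# Made-up, tab-separated stand-in for the event file (only the columns used here).
SAMPLE = (
    "Event Date\tEvent Type\tActor Name\tRecipient Name\n"
    "2018-04-01\tSUPPORT\tRam Madhav\tDalai Lama\n"
    "2018-04-02\tCONSULT\tRam Madhav\tNone\n"
)


def to_quadruples(handle):
    """Yield (head, relation, tail, time) tuples from a tab-separated event file."""
    for row in csv.DictReader(handle, delimiter="\t"):
        head, tail = row["Actor Name"], row["Recipient Name"]
        # Assumption: rows without a named actor or recipient are skipped.
        if head in ("", "None") or tail in ("", "None"):
            continue
        yield (head, row["Event Type"], tail, row["Event Date"])


quadruples = list(to_quadruples(io.StringIO(SAMPLE)))
print(quadruples)
```

The second sample row is dropped because its recipient is `None`; whether such rows should be dropped or kept with a placeholder node is an open choice.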

With this structure, I think the only entries we can use as node features are the ones describing the actors/recipients, i.e. Primary Actor Sector, Primary Recipient Sector, and Recipient Country (maybe as Latitude and Longitude).

Other entries that I thought would be interesting are: Intensity; Sector

But these differ per quadruple, i.e. the same (actor, event type, recipient) can have a different intensity on different days, right? So I think it would be overly complicated to add them.
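One way to check this on the data would be something like the sketch below. The rows are made up, and `varying_triples` is a hypothetical helper; it just groups intensities by (actor, event type, recipient) and reports the triples that have more than one distinct value.

```python
from collections import defaultdict

# Made-up rows: (actor, event type, recipient, date, intensity).
events = [
    ("A", "SUPPORT", "B", "2018-04-01", 5.0),
    ("A", "SUPPORT", "B", "2018-04-03", 3.0),
    ("A", "CONSULT", "C", "2018-04-01", 2.0),
]


def varying_triples(rows):
    """Return the triples whose intensity is not constant over time."""
    seen = defaultdict(set)
    for actor, event_type, recipient, _date, intensity in rows:
        seen[(actor, event_type, recipient)].add(intensity)
    # Keep only triples with more than one distinct intensity value.
    return {triple: sorted(vals) for triple, vals in seen.items() if len(vals) > 1}


varying = varying_triples(events)
print(varying)
```

If this comes back non-empty on the real file, intensity is an edge-level (per-event) attribute rather than something that can be attached to a node or a static relation.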

@shenyangHuang what do you think?

shenyangHuang commented 5 months ago

I am not sure the intensity makes sense. One tricky thing with features is that some nodes might not have any; this still needs to be verified. If so, those nodes can get a zero vector to denote the missing features.
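A rough sketch of the zero-vector idea, assuming the node feature is a one-hot encoding of the primary sector. The `build_node_features` helper and the node-to-sector mapping are hypothetical, and plain Python lists stand in for whatever array type the loader ends up using.

```python
def build_node_features(nodes, sector_of):
    """One-hot encode each node's sector; nodes without a sector get all zeros."""
    sectors = sorted(set(sector_of.values()))
    index = {sector: i for i, sector in enumerate(sectors)}
    features = {}
    for node in nodes:
        vec = [0.0] * len(sectors)
        sector = sector_of.get(node)
        if sector is not None:
            vec[index[sector]] = 1.0
        # A node with no known sector keeps the all-zero vector.
        features[node] = vec
    return sectors, features


# Hypothetical input: "followers" has no sector entry.
nodes = ["Ram Madhav", "Dalai Lama", "followers"]
sector_of = {"Ram Madhav": "GOV", "Dalai Lama": "REL"}
sectors, features = build_node_features(nodes, sector_of)
print(sectors, features)
```

With a one-hot encoding the all-zero vector cannot collide with any real sector, so a model can still tell "missing" apart from every known value.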

JuliaGast commented 5 months ago

Todo @JuliaGast

shenyangHuang commented 2 months ago

old issue, resolved