chainsawriot opened this issue 11 months ago (status: Open)
Use igraph to calculate the syntactic distance. (UPDATE: the result below should be incorrect, e.g. `distances(graph, mode = "all")[, "ROOT"]`.)
require(textplot)
#> Loading required package: textplot
require(igraph)
#> Loading required package: igraph
#>
#> Attaching package: 'igraph'
#> The following objects are masked from 'package:stats':
#>
#> decompose, spectrum
#> The following object is masked from 'package:base':
#>
#> union
##m_eng_ewt <- udpipe_download_model(language = "english-ewt", "~/dev/misc")
## Change this
m_eng_ewt_path <- "~/dev/misc/english-ewt-ud-2.5-191206.udpipe"
m_eng_ewt_loaded <- udpipe::udpipe_load_model(file = m_eng_ewt_path)
sentence <- udpipe::udpipe_annotate(m_eng_ewt_loaded, x = "Turkish President Tayyip Erdogan, in his strongest comments yet on the Gaza conflict, said on Wednesday the Palestinian militant group Hamas was not a terrorist organisation but a liberation group fighting to protect Palestinian lands.") |> as.data.frame()
textplot::textplot_dependencyparser(sentence)
#> Loading required namespace: ggraph
sentence[,c("token_id", "head_token_id", "token", "dep_rel")]
#>    token_id head_token_id        token   dep_rel
#> 1         1             2      Turkish      amod
#> 2         2            16    President     nsubj
#> 3         3             2       Tayyip      flat
#> 4         4             2      Erdogan      flat
#> 5         5             2            ,     punct
#> 6         6             9           in      case
#> 7         7             9          his nmod:poss
#> 8         8             9    strongest      amod
#> 9         9             2     comments      nmod
#> 10       10             9          yet    advmod
#> 11       11            14           on      case
#> 12       12            14          the       det
#> 13       13            14         Gaza  compound
#> 14       14             9     conflict      nmod
#> 15       15            16            ,     punct
#> 16       16             0         said      root
#> 17       17            18           on      case
#> 18       18            16    Wednesday       obl
#> 19       19            22          the       det
#> 20       20            22  Palestinian      amod
#> 21       21            22     militant      amod
#> 22       22            28        group     nsubj
#> 23       23            22        Hamas     appos
#> 24       24            28          was       cop
#> 25       25            28          not    advmod
#> 26       26            28            a       det
#> 27       27            28    terrorist  compound
#> 28       28            18 organisation      flat
#> 29       29            32          but        cc
#> 30       30            32            a       det
#> 31       31            32   liberation  compound
#> 32       32            18        group      conj
#> 33       33            32     fighting       acl
#> 34       34            35           to      mark
#> 35       35            33      protect     xcomp
#> 36       36            37  Palestinian      amod
#> 37       37            35        lands       obj
#> 38       38            16            .     punct
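## Caution: graph_from_data_frame() orders vertices by first appearance in the edge list,
## not by token_id, so assigning names by position here mislabels the vertices.
graph <- graph_from_data_frame(sentence[,c("head_token_id", "token_id")])
V(graph)$name <- c("ROOT", sentence$token)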
distances(graph, mode = "all")[, "terrorist"]
#>         ROOT      Turkish    President       Tayyip      Erdogan            ,
#>            5            4            6            7            5            3
#>           in          his    strongest     comments          yet           on
#>            1            2            4            6            5            7
#>          the         Gaza     conflict            ,         said           on
#>            6            6            6            6            7            7
#>    Wednesday          the  Palestinian     militant        group        Hamas
#>            7            7            8            8            8            5
#>          was          not            a    terrorist organisation          but
#>            4            2            2            0            2            3
#>            a   liberation        group     fighting           to      protect
#>            3            3            3            5            5            5
#>  Palestinian        lands            .
#>            7            8            5
Created on 2023-11-22 with reprex v2.0.2
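A possible reason the distances above look off (ROOT at distance 5, "in" at distance 1 from "terrorist"): without a `vertices` argument, `graph_from_data_frame()` orders vertices by first appearance in the edge list, so `c("ROOT", sentence$token)` lands on the wrong vertices. Below is a minimal sketch of one way around this, not a tested fix for the reprex. It uses a hand-written toy parse (the `parse` data frame and its values are hypothetical, only the column names follow udpipe) so that it runs without a model file. Vertex names stay as `token_id`, since tokens such as "the" or "," repeat within a sentence, and the token is attached as a vertex attribute.

```r
library(igraph)

# Toy parse in udpipe's column layout; values are hand-written for illustration.
parse <- data.frame(
  token_id      = c("1", "2", "3", "4"),
  head_token_id = c("2", "3", "0", "3"),
  token         = c("the", "cat", "sat", "down"),
  stringsAsFactors = FALSE
)
edges <- parse[, c("head_token_id", "token_id")]

# Without `vertices`, the vertex order follows first appearance in the edge list.
naive <- graph_from_data_frame(edges)
V(naive)$name
#> [1] "2" "3" "0" "1" "4"

# Supplying `vertices` fixes the order and ties each token to its own id.
vertices <- data.frame(
  name  = c("0", parse$token_id),    # "0" is the artificial root
  token = c("ROOT", parse$token),
  stringsAsFactors = FALSE
)
graph <- graph_from_data_frame(edges, directed = TRUE, vertices = vertices)

# Undirected path length from every vertex to the root, labelled by token.
depth <- distances(graph, mode = "all")[, "0"]
names(depth) <- V(graph)$token
depth
#> ROOT  the  cat  sat down 
#>    0    3    2    1    2
```

For the sentence in the reprex, the same pattern would use `sentence` in place of `parse` and pick the column by `token_id` (for example `"27"` for "terrorist") rather than by token text.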
Using the term from https://arxiv.org/pdf/1909.10171.pdf, maybe this should be called "dependency proximity".