leifeld / dna

Discourse Network Analyzer (DNA)

Question about 'xergm' package in rDNA (R 3.6.3) #197

phbre closed this issue 4 years ago

phbre commented 4 years ago

Hi everyone,

I hope you are doing well.

After many hours of research and attempts to fix it, I couldn't find a satisfactory solution to my problem with the xergm package in rDNA (R version 3.6.3).

To visualize the Boolean agreement/disagreement variable, I tried to install the 'xergm' package for R 3.6.3 but could not, so I was unable to analyze the results for this variable.

All I get in R is this warning: Warning in install.packages : package ‘xergm’ is not available (for R version 3.6.3)

Thank you for your help.

Best regards, Philipp Breer

leifeld commented 4 years ago

Where did you read or hear that xergm is required to visualise discourse networks? The package does not offer any visualisation functions.

phbre commented 4 years ago

I could not visualize the agreement/disagreement variable in rDNA, and the only package that failed to load was xergm, so I jumped to the conclusion that it must have something to do with that package.

leifeld commented 4 years ago

The DESCRIPTION file (https://github.com/leifeld/dna/blob/master/rDNA/DESCRIPTION) does not include xergm as a dependency. Did you read somewhere that the package should be installed for that purpose?

phbre commented 4 years ago

Not for the purpose of visualisation, but as a recommended installation in your publication: Discourse Network Analyzer Manual, with Johannes Gruber and Felix Rolf Bossner (last updated September 11, 2019). The purpose stated there is, as you write, "analyzing and clustering" (p. 80). The link to visualization was, I now believe, just my own wrong conclusion.

leifeld commented 4 years ago

Okay, let's forget about the xergm part and focus on the problem of how to visualise agreement and disagreement in two-mode networks with different colours. I remember discussing this with @JBGruber last year or so. He wrote the network visualisation function. There was no easy way to use different colours for the different kinds of ties (like green, red, and blue for positive, negative, and ambivalent, for example). I am not sure if @JBGruber has found a solution for this, so I'll defer to him for now.

phbre commented 4 years ago

Alright, thank you for your help!

JBGruber commented 4 years ago

Unfortunately, I haven’t managed to implement this yet, but colouring the edges is absolutely possible using the infrastructure in igraph and ggraph, which power dna_plotNetwork under the hood. So what I can do now is show how you can write the code yourself, and hopefully some day it will work automatically in rDNA.

I use the sample data as in the manual to make a two-mode network:

library(rDNA)
library(tidygraph)
library(ggraph)
library(dplyr)

dna_init()
conn <- dna_sample() %>% 
  dna_connection()

# first construct the network as usual and retrieve attribute for variable 1 and 2
nw <- dna_network(conn,
                  networkType = "twomode",
                  variable1 = "organization",
                  variable2 = "concept",
                  excludeValues = list("concept" =
                                         c("There should be legislation to regulate emissions.")))

# also get attributes for both variables (optional)
att1 <- dna_getAttributes(conn, variable = "organization") %>% 
  rename(var_type = type)
att2 <- dna_getAttributes(conn, variable = "concept") %>% 
  rename(var_type = type)

To get this data to work with the excellent ggraph, we need to convert it to an igraph object. I personally found working with igraph to be quite unintuitive, so I will use tidygraph instead, which implements a few “tidy” functions (especially from dplyr) for igraph objects.

graph <- dna_toIgraph(nw) %>%                                 # convert to igraph
  as_tbl_graph() %>%                                          # convert to tidygraph
  activate(nodes) %>%                                                               
  left_join(rbind(att1, att2), by = c("name" = "value")) %>%  # add attribute data to nodes
  activate(edges) %>%
  mutate(weight = case_when(                                  # recode edge weights
    weight == 1 ~ "positive",
    weight == 2 ~ "negative",
    weight == 3 ~ "mixed"
  ))

In the first two lines, the network is converted to an igraph object and then to a tidygraph object (a tbl_graph), which is essentially the same but with a nicer printing method (similar to the difference between tibble and data.frame objects).

tidygraph presents every graph as two tables: one that contains information about the nodes and one that contains information about the edges. Using activate, you tell tidygraph which table to work on. I add the two attribute objects created earlier to the nodes table and then activate the edges table to recode the weights into more descriptive labels.
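The two-table idea can be tried without a DNA database. The following is only a minimal sketch on a hypothetical toy graph: the names node_df, edge_df and toy, and the data in them, are made up for illustration and are not part of rDNA.

```r
library(tidygraph)
library(dplyr)

# Hypothetical toy data: two organizations and two concepts
node_df <- tibble(
  name = c("Org A", "Org B", "Concept X", "Concept Y"),
  type = c(FALSE, FALSE, TRUE, TRUE)     # FALSE = organization, TRUE = concept
)
edge_df <- tibble(
  from   = c(1L, 1L, 2L),                # row numbers in node_df
  to     = c(3L, 4L, 3L),
  weight = c(1, 2, 3)                    # numeric codes, recoded below
)

toy <- tbl_graph(nodes = node_df, edges = edge_df, directed = FALSE) %>%
  activate(edges) %>%                    # switch to the edges table
  mutate(weight = case_when(             # dplyr verbs now act on edges only
    weight == 1 ~ "positive",
    weight == 2 ~ "negative",
    weight == 3 ~ "mixed"
  ))

# Pull each table out as an ordinary tibble
toy_nodes <- toy %>% activate(nodes) %>% as_tibble()
toy_edges <- toy %>% activate(edges) %>% as_tibble()
```

Whichever table is active is the one that mutate, left_join and as_tibble operate on; the other table is left untouched.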

You can inspect the two tables inside the graph using activate and then as_tibble:

graph %>% 
  activate(nodes) %>% 
  as_tibble()
## # A tibble: 12 x 8
##    type  name                          id color  var_type  alias notes frequency
##    <lgl> <chr>                      <int> <chr>  <chr>     <chr> <chr>     <int>
##  1 FALSE Alliance to Save Energy       16 #00CC… "NGO"     ""    ""            4
##  2 FALSE Energy and Environmental …     7 #FF99… "Busines… ""    ""            4
##  3 FALSE Environmental Protection …    14 #0000… "Governm… ""    ""           11
##  4 FALSE National Petrochemical & …    25 #FF99… "Busines… ""    ""            4
##  5 FALSE Senate                        11 #0000… "Governm… ""    ""            5
##  6 FALSE Sierra Club                   19 #00CC… "NGO"     ""    ""            6
##  7 FALSE U.S. Public Interest Rese…    22 #00CC… "NGO"     ""    ""            6
##  8 TRUE  CO2 legislation will not …     8 #0000… ""        ""    ""           10
##  9 TRUE  Cap and trade is the solu…    23 #0000… ""        ""    ""            1
## 10 TRUE  Climate change is caused …    17 #0000… ""        ""    ""            3
## 11 TRUE  Climate change is real an…    20 #0000… ""        ""    ""            3
## 12 TRUE  Emissions legislation sho…     9 #0000… ""        ""    ""            7
graph %>% 
  activate(edges) %>% 
  as_tibble()
## # A tibble: 17 x 3
##     from    to weight  
##    <int> <int> <chr>   
##  1     1    10 positive
##  2     1    12 negative
##  3     2     8 negative
##  4     2    12 negative
##  5     3     8 positive
##  6     4     8 negative
##  7     5     8 negative
##  8     5    12 positive
##  9     6     8 negative
## 10     6    10 positive
## 11     6    11 positive
## 12     6    12 positive
## 13     7     8 positive
## 14     7     9 positive
## 15     7    10 positive
## 16     7    11 negative
## 17     7    12 positive

Now the graph contains all information we need to recreate the bipartite plot from the manual but with coloured edges:

ggraph(graph, layout = "bipartite") +                            # create plot
  geom_node_point(aes(colour = color), size = 4) +               # add points for nodes
  geom_edge_link(aes(colour = weight), width = 1) +              # connect nodes where edges exist
  scale_colour_identity() +                                      # use color attribute to color nodes
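  scale_edge_colour_manual(values = c(positive = "green",        # name the values so each label gets
                                      negative = "red",          # its intended colour regardless of
                                      mixed = "blue")) +         # which labels occur in the data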
  geom_node_text(aes(label = name), vjust = -3, size = 2) +      # add labels
  scale_y_continuous(expand = c(0.1, 0.1)) +                     # expand scale a little to prevent cut-off labels
  scale_x_continuous(expand = c(0.1, 0.1)) +
  theme_graph() +                                                # set a blank theme
  coord_flip() +
  theme(legend.position = "bottom")                              # move legend

As you can see, the commands basically look like ggplot2 functions except that they contain the words “node” and “edge” to tell ggraph if you want to work on nodes or edges. Hopefully, it makes sense at this point that aesthetics (the arguments inside aes()) for nodes use the data in the nodes table and the aesthetics for edges use data from the edges table of the graph.
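To make the node/edge split concrete, here is a small sketch on a hypothetical toy graph (node_df, edge_df, toy and edge_colours are illustrative names and data, not part of rDNA). It assumes a named colour vector is wanted so that each label keeps its colour even when a label such as "mixed" does not occur in the data.

```r
library(tidygraph)
library(ggraph)

# Hypothetical toy graph: two organizations, two concepts, labelled ties
node_df <- data.frame(
  name  = c("Org A", "Org B", "Concept X", "Concept Y"),
  type  = c(FALSE, FALSE, TRUE, TRUE),   # logical type is used by the bipartite layout
  color = c("#00CC00", "#FF9900", "#0000CC", "#0000CC"),
  stringsAsFactors = FALSE
)
edge_df <- data.frame(
  from   = c(1L, 1L, 2L),
  to     = c(3L, 4L, 3L),
  weight = c("positive", "negative", "positive"),
  stringsAsFactors = FALSE
)
toy <- tbl_graph(nodes = node_df, edges = edge_df, directed = FALSE)

# Named values tie each label to one colour
edge_colours <- c(positive = "green", negative = "red", mixed = "blue")

p <- ggraph(toy, layout = "bipartite") +
  geom_edge_link(aes(colour = weight), edge_width = 1) +  # weight comes from the edges table
  geom_node_point(aes(colour = color), size = 4) +        # color comes from the nodes table
  scale_colour_identity() +                               # use node colours as given
  scale_edge_colour_manual(values = edge_colours) +
  theme_graph(base_family = "sans")                       # avoid depending on a specific font
```

Printing p draws the plot. With an unnamed vector, the colours would instead be assigned in the sorted order of the labels present in the data, so the same label could end up with a different colour from one network to the next.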

leifeld commented 4 years ago

Thanks, @JBGruber. That was quite the reply!

(We should make this work automatically at some point. If only we weren't so busy.)

Any other questions, @phbre?

phbre commented 4 years ago

Thanks a lot, @JBGruber, for the detailed reply. This helps a lot.

So, no more questions left.

Thanks again to you both.