taowenmicro / ggClusterNet

Microbial ecological network visualization clustering

Weird variable names from edgeBuild() #18

Open Gian77 opened 8 months ago

Hello,

I am trying to generate networks with your ggClusterNet R package, and I cannot work out why some of my OTU names appear as column headers in the resulting edge object. Please see below.

> rare_ggclust_net2 <- ggClusterNet::model_igraph(cor = cor_rare,
+                                                 method = "cluster_fast_greedy", 
+                                                 seed = 12, 
+                                                 Top_M = 20)

> str(rare_ggclust_net2)
List of 3
 $ :'data.frame':   300 obs. of  3 variables:
  ..$ X1      : num [1:300] 1.59 4.89 -6.97 -5.49 9.54 ...
  ..$ X2      : num [1:300] 1.4 5.79 8.32 7.07 1.36 ...
  ..$ elements: chr [1:300] "OTU_1" "OTU_2" "OTU_3" "OTU_1835" ...
 $ :'data.frame':   300 obs. of  6 variables:
  ..$ orig_model: num [1:300] 7 8 9 10 11 3 1 12 13 14 ...
  ..$ model     : chr [1:300] "model_7" "model_8" "model_9" "model_10" ...
  ..$ color     : chr [1:300] "#FDB567" "#FDD07D" "#FEE695" "#FEF6B1" ...
  ..$ OTU       : chr [1:300] "OTU_1" "OTU_2" "OTU_3" "OTU_1835" ...
  ..$ X1        : num [1:300] 1.59 4.89 -6.97 -5.49 9.54 ...
  ..$ X2        : num [1:300] 1.4 5.79 8.32 7.07 1.36 ...
 $ :Class 'igraph'  hidden list of 10
  ..$ : num 300
  ..$ : logi FALSE
  ..$ : num [1:7973] 0 1 2 3 4 5 17 6 20 30 ...
  ..$ : num [1:7973] 0 1 2 3 4 5 5 6 6 6 ...
  ..$ : NULL
  ..$ : NULL
  ..$ : NULL
  ..$ : NULL
  ..$ :List of 4
  .. ..$ : num [1:3] 1 0 1
  .. ..$ : Named list()
  .. ..$ :List of 5
  .. .. ..$ name       : chr [1:300] "OTU_1" "OTU_2" "OTU_3" "OTU_1835" ...
  .. .. ..$ modularity : num [1:300] 7 8 9 10 11 3 1 12 13 14 ...
  .. .. ..$ label      : logi [1:300] NA NA NA NA NA NA ...
  .. .. ..$ color      : chr [1:300] "#FDB567" "#FDD07D" "#FEE695" "#FEF6B1" ...
  .. .. ..$ frame.color: chr [1:300] "#FDB567" "#FDD07D" "#FEE695" "#FEF6B1" ...
  .. ..$ :List of 3
  .. .. ..$ weight     : num [1:7973] 1 1 1 1 1 ...
  .. .. ..$ correlation: num [1:7973] 1 1 1 1 1 ...
  .. .. ..$ color      : chr [1:7973] "#FDB567" "#FDD07D" "#FEE695" "#FEF6B1" ...
  ..$ :<environment: 0x559d09955508> 

Then I extract the node table

> node_rare <- 
+   rare_ggclust_net2[[1]]

> head(node_rare)
                X1       X2 elements
OTU_1     1.585925 1.398903    OTU_1
OTU_2     4.890295 5.793571    OTU_2
OTU_3    -6.967662 8.316252    OTU_3
OTU_1835 -5.494072 7.068775 OTU_1835
OTU_5     9.542700 1.359537    OTU_5
OTU_4    -8.606428 6.845587    OTU_4

I generate the taxonomy

> taxonomy_rare <- 
+   physeq_rare_net %>%
+   vegan_tax() %>%
+   as.data.frame()

> head(taxonomy_rare)
           Kingdom       Phylum           Class        Order            Family            Genus                  Species
OTU_1    Eukaryota Streptophyta   Magnoliopsida       Poales           Poaceae          Setaria          Setaria viridis
OTU_2    Eukaryota Streptophyta   Magnoliopsida         <NA>              <NA>             <NA>                     <NA>
OTU_3    Eukaryota Streptophyta   Magnoliopsida Malpighiales     Euphorbiaceae             <NA>                     <NA>
OTU_1835 Eukaryota Streptophyta   Magnoliopsida       Poales           Poaceae             <NA>                     <NA>
OTU_5    Eukaryota Streptophyta   Magnoliopsida    Asterales        Asteraceae             <NA>                     <NA>
OTU_4    Eukaryota   Ascomycota Dothideomycetes Pleosporales Phaeosphaeriaceae Parastagonospora Parastagonospora nodorum

I add taxonomy and mean abundance to the nodes

> nodes_net = nodeadd(plotcord = node_rare,
+                     otu_table = otutable_rare,
+                     tax_table = taxonomy_rare)

> head(nodes_net)
               X1         X2 elements   Kingdom        Phylum             Class          Order         Family     Genus          Species       mean
OTU_1    1.585925  1.3989030    OTU_1 Eukaryota  Streptophyta     Magnoliopsida         Poales        Poaceae   Setaria  Setaria viridis 8574.82245
OTU_10  -6.163169 -9.7805123   OTU_10 Eukaryota  Streptophyta     Magnoliopsida         Poales        Poaceae Digitaria Digitaria exilis  244.55310
OTU_100  8.443559 -3.5408620  OTU_100 Eukaryota Basidiomycota   Pucciniomycetes    Pucciniales   Pucciniaceae      <NA>             <NA>   12.24806
OTU_101 11.854691 -2.6231469  OTU_101 Eukaryota Basidiomycota    Agaricomycetes     Agaricales           <NA>      <NA>             <NA>   10.77710
OTU_102  9.114853 -0.1619759  OTU_102 Eukaryota    Arthropoda         Arachnida Sarcoptiformes           <NA>      <NA>             <NA>   18.45071
OTU_103  7.771214 -6.5407734  OTU_103 Eukaryota Basidiomycota Ustilaginomycetes  Ustilaginales Ustilaginaceae      <NA>             <NA>   18.22406

Finally, I generate the edge list, which strangely has OTU names as column headers. Why are OTU_1 and OTU_2 treated as variable names rather than as values?

> edges_net <- 
+   ggClusterNet::edgeBuild(cor = cor_rare,
+                           node = node_rare) %>% 
+   dplyr::select(-cor) %>% # not sure why cor did not work
+   mutate(direction = if_else(weight > 0, "positive", "negative"))

> head(edges_net)
         X2         Y2   OTU_2    OTU_1    weight        X1         Y1 direction
1 -8.606428   6.845587   OTU_4  OTU_124 0.8019252 -9.302795   5.740050  positive
2 -5.988042 -11.696418 OTU_839 OTU_3669 0.8489268 -5.500506 -10.835631  positive
3 -5.988042 -11.696418 OTU_839 OTU_2406 0.8079077 -5.611176 -10.502029  positive
4 -5.988042 -11.696418 OTU_839 OTU_1930 0.8513364 -5.644883 -10.621819  positive
5 -5.988042 -11.696418 OTU_839 OTU_1192 0.8057194 -5.117171  -9.954975  positive
6 -5.988042 -11.696418 OTU_839 OTU_3167 0.8315849 -5.087309 -10.538837  positive
>
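
A minimal sketch of a possible workaround, assuming that OTU_1 and OTU_2 simply hold the two endpoints of each edge. Row 1 above pairs OTU_4 with the same X2/Y2 coordinates that OTU_4 has in the node table, which is consistent with that reading. The toy values, the helper `rename_endpoints()`, and the column names `from` and `to` are hypothetical and not part of ggClusterNet; only base R is used:

```r
# Toy edge table mimicking the column layout printed above.
# All values are made up for illustration; this is not package output.
edges_toy <- data.frame(
  X2 = c(-8.6, -5.9),
  Y2 = c(6.8, -11.7),
  OTU_2 = c("OTU_4", "OTU_839"),
  OTU_1 = c("OTU_124", "OTU_3669"),
  weight = c(0.80, -0.85),
  X1 = c(-9.3, -5.5),
  Y1 = c(5.7, -10.8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rename the endpoint columns to "from" and "to".
# Assumes from_col and to_col each match exactly one column.
rename_endpoints <- function(edges, from_col = "OTU_1", to_col = "OTU_2") {
  stopifnot(all(c(from_col, to_col) %in% names(edges)))
  names(edges)[names(edges) == from_col] <- "from"
  names(edges)[names(edges) == to_col] <- "to"
  edges
}

edges_renamed <- rename_endpoints(edges_toy)

# Label each edge by the sign of its weight (base R equivalent of if_else)
edges_renamed$direction <- ifelse(edges_renamed$weight > 0,
                                  "positive", "negative")

print(names(edges_renamed))
```

The same helper could be applied to the real `edges_net` if the assumption about the two columns holds.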

Additionally, when I use any of the polygon layout methods, I get odd output in which the column cor appears twice: once with values that make no sense (e.g. -), and once in the position after Order.x, where it holds what look like family names (e.g. Poaceae). Please see below.

> rare_ggclust_net2 <- ggClusterNet::PolygonClusterG(cor = cor_rare,
+                                                    nodeGroup = net_rare)
> # Generate nodes
> node_rare <- 
+   rare_ggclust_net2[[1]]
> # Generate taxonomy
> taxonomy_rare <- 
+   physeq_rare_net %>%
+   vegan_tax() %>%
+   as.data.frame()
> # Add taxonomy, mean abundance to nodes
> nodes_net = nodeadd(plotcord = node_rare,
+                     otu_table = otutable_rare,
+                     tax_table = taxonomy_rare)
> edges_net <- 
+   ggClusterNet::edgeBuild(cor = cor_rare,
+                           node = nodes_net)
> head(edges_net)
         X2       Y2  OTU_2 Kingdom.x     Phylum.x       Class.x Order.x     cor   Genus.x        Species.x   mean.x    OTU_1     weight         X1        Y1 Kingdom.y          Phylum.y
1 0.6237351 17.93444 OTU_10 Eukaryota Streptophyta Magnoliopsida  Poales Poaceae Digitaria Digitaria exilis 244.5531 OTU_3669 -0.8406234   1.220210 17.740636  Bacteria    Pseudomonadota
2 0.6237351 17.93444 OTU_10 Eukaryota Streptophyta Magnoliopsida  Poales Poaceae Digitaria Digitaria exilis 244.5531 OTU_2406 -0.8567855   1.763356 17.427051  Bacteria    Pseudomonadota
3 0.6237351 17.93444 OTU_10 Eukaryota Streptophyta Magnoliopsida  Poales Poaceae Digitaria Digitaria exilis 244.5531 OTU_1930 -0.8368905  16.033626 -2.194569  Bacteria    Pseudomonadota
4 0.6237351 17.93444 OTU_10 Eukaryota Streptophyta Magnoliopsida  Poales Poaceae Digitaria Digitaria exilis 244.5531 OTU_2272 -0.8183972  11.042347 14.145455  Bacteria    Pseudomonadota
5 0.6237351 17.93444 OTU_10 Eukaryota Streptophyta Magnoliopsida  Poales Poaceae Digitaria Digitaria exilis 244.5531 OTU_1192 -0.8025917  16.492326  6.651734 Eukaryota              <NA>
6 0.6237351 17.93444 OTU_10 Eukaryota Streptophyta Magnoliopsida  Poales Poaceae Digitaria Digitaria exilis 244.5531 OTU_2356 -0.8153681 -12.026043 -2.659660  Bacteria Verrucomicrobiota
              Class.y          Order.y          Family.y        Genus.y            Species.y    mean.y cor
1 Deltaproteobacteria     Myxococcales      Kofleriaceae     Haliangium uncultured bacterium 133.78123   -
2 Alphaproteobacteria Hyphomicrobiales              <NA>           <NA>                 <NA>  98.46497   -
3 Alphaproteobacteria Sphingomonadales Sphingomonadaceae   Sphingomonas uncultured bacterium  91.78690   -
4 Alphaproteobacteria Hyphomicrobiales  Nitrobacteraceae Bradyrhizobium                 <NA>  67.04703   -
5                <NA>             <NA>              <NA>           <NA>                 <NA>  66.33284   -
6      Spartobacteria             <NA>              <NA>           <NA>                 <NA>  62.33723   -
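
A minimal sketch for spotting and disambiguating repeated column names like the two `cor` columns above. The toy data frame is made up to mimic the printed layout and is not package output. This only makes the duplication visible; it does not explain where it comes from:

```r
# Toy table with a repeated column name, mimicking the output above.
# check.names = FALSE keeps the duplicate name instead of auto-fixing it.
edges_dup <- data.frame(
  Order.x = c("Poales", "Poales"),
  cor = c("Poaceae", "Poaceae"),
  Genus.x = c("Digitaria", "Digitaria"),
  weight = c(-0.84, -0.86),
  cor = c("-", "-"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Names that occur more than once
dup_names <- unique(names(edges_dup)[duplicated(names(edges_dup))])
print(dup_names)

# Make every name unique so each column can be selected unambiguously;
# make.unique() leaves the first occurrence as-is and suffixes later ones
names(edges_dup) <- make.unique(names(edges_dup))
print(names(edges_dup))
```

After this step the second `cor` column becomes `cor.1`, so either copy can be dropped by name.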

Thanks for your help,

Gian