Closed mathidachuk closed 4 years ago
Thanks I’ll start working on this. I have some ideas for the attributes and stuff. They’ll be part of hp_datatree()
Okay, here is a working example of using hp_datatree() on a YAML file:
library(hierplane) # devtools::install_github("r4fun/hierplane", "datatree-compatible")
library(data.tree)
library(yaml)
"
name: r4fun
tyler:
  name: Tyler
  job: Data Scientist
  species: Human
  toulouse:
    name: Toulouse
    job: Systems Engineer
    species: Cat
    jojo:
      name: Jojo
      job: Python Programmer
      species: Dog
  ollie:
    name: Ollie
    job: Database Administrator
    species: Dog
lucas:
  name: Lucas
  job: R Programmer
  species: Rabbit
" -> yaml
yaml %>%
  yaml.load() %>%
  as.Node() %>%
  hp_datatree(
    title = "r4fun github group",
    link = "species",
    attributes = "job"
  ) %>%
  hierplane(
    theme = "light",
    width = "auto",
    height = "auto"
  )
Note, this requires the latest version as of defd52a. There was a bug where the root data.frame that's appended to everything was appending an integer. This happened because I didn't set stringsAsFactors = FALSE when creating the data.frame, so the character values became factors. The result was a link that didn't match or make sense. This also explains why it worked on my personal computer running R 4.0.2 but not on my work computer running R 3.6.2: R 4.0.0 changed the default to stringsAsFactors = FALSE.
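For anyone hitting the same thing, here is a minimal sketch of the difference in base R. The column name `link` and the value "ROOT" are made up for illustration and are not taken from the hierplane source:

```r
# Before R 4.0.0, data.frame() defaulted to stringsAsFactors = TRUE.
# Setting it explicitly reproduces the old behaviour on any R version.
old_style <- data.frame(link = "ROOT", stringsAsFactors = TRUE)
new_style <- data.frame(link = "ROOT", stringsAsFactors = FALSE)

class(old_style$link)       # a factor, stored internally as integer codes
as.integer(old_style$link)  # the underlying integer code, not the label
class(new_style$link)       # a plain character vector
```

Any code that later coerces or combines the factor column can end up with the integer code instead of the label, which is consistent with the symptom described above.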
Here is another example using the traditional data.tree method for creating trees programmatically, as described here. These examples also try to highlight some of the design decisions:
library(hierplane)
library(data.tree)
# acme
acme <- Node$new("Acme Inc.")
accounting <- acme$AddChild("Accounting")
software <- accounting$AddChild("New Software")
standards <- accounting$AddChild("New Accounting Standards")
research <- acme$AddChild("Research")
newProductLine <- research$AddChild("New Product Line")
newLabs <- research$AddChild("New Labs")
it <- acme$AddChild("IT")
outsource <- it$AddChild("Outsource")
agile <- it$AddChild("Go agile")
goToR <- it$AddChild("Switch to R")
acme$Accounting$`New Software`$cost <- 1000000
acme$Accounting$`New Accounting Standards`$cost <- 500000
acme$Research$`New Product Line`$cost <- 2000000
acme$Research$`New Labs`$cost <- 750000
acme$IT$Outsource$cost <- 400000
acme$IT$`Go agile`$cost <- 250000
acme$IT$`Switch to R`$cost <- 50000
acme$Accounting$`New Software`$p <- 0.5
acme$Accounting$`New Accounting Standards`$p <- 0.75
acme$Research$`New Product Line`$p <- 0.25
acme$Research$`New Labs`$p <- 0.9
acme$IT$Outsource$p <- 0.2
acme$IT$`Go agile`$p <- 0.05
acme$IT$`Switch to R`$p <- 1
acme$IT$Outsource$AddChild("India")
acme$IT$Outsource$AddChild("Poland")
acme$Set(type = c('company', 'department', 'project', 'project', 'department', 'project', 'project', 'department', 'program', 'project', 'project', 'project', 'project'))
# a simple hierplane of the acme tree
# * hp_datatree has explicit params for the settings because it's more strict
# * e.g. we take a hard stance that the parent/child cols should always be from/to
# * In other words, I haven't found a use case for actually modifying any of the
#   other columns (except maybe node_type?)
acme %>%
  hp_datatree(
    title = "Acme Inc.",
    link = "type",
    attributes = c("cost", "p")
  ) %>%
  hierplane(
    theme = "light",
    width = "auto",
    height = "auto"
  )
# if we assign the link to the parent_id, we warn the user and set it
# back to the child_id
acme %>%
  hp_datatree(
    title = "Acme Inc.",
    link = "from",
    attributes = c("cost", "p")
  ) %>%
  hierplane(
    theme = "light",
    width = "auto",
    height = "auto"
  )
# if the link is missing values (excluding the root row), we warn the user
# and set the link back to the child_id
acme %>%
  hp_datatree(
    title = "Acme Inc.",
    link = "cost",
    attributes = c("cost", "p")
  ) %>%
  hierplane(
    theme = "light",
    width = "auto",
    height = "auto"
  )
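The two fallback rules in the comments above could look roughly like this. It is only a sketch: `resolve_link()` is a hypothetical helper, not a function from hierplane, and it assumes an edge data.frame with from/to columns whose first row is the root:

```r
# Hypothetical helper: returns the column name to use as the link.
resolve_link <- function(df, link, parent_id = "from", child_id = "to") {
  # Rule 1: the parent column is not a valid link.
  if (identical(link, parent_id)) {
    warning("`link` cannot be the parent column; using `", child_id, "` instead.")
    return(child_id)
  }
  # Rule 2: the link must not have missing values outside the root row.
  non_root <- df[-1, , drop = FALSE]  # assumes the root is the first row
  if (anyNA(non_root[[link]])) {
    warning("`link` has missing values; using `", child_id, "` instead.")
    return(child_id)
  }
  link
}

# A small made-up edge list in the spirit of the acme tree.
edges <- data.frame(
  from = c(NA, "Acme Inc.", "IT", "Outsource"),
  to   = c("Acme Inc.", "IT", "Outsource", "India"),
  type = c("company", "department", "program", "project"),
  cost = c(NA, NA, 400000, NA),
  stringsAsFactors = FALSE
)

resolve_link(edges, "type")                    # kept as is
suppressWarnings(resolve_link(edges, "from"))  # falls back to the child column
suppressWarnings(resolve_link(edges, "cost"))  # falls back to the child column
```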
I think you should allow users the freedom to control node type, in case they want to style their nodes.
Also what if users don't want a link label at all?
Also lol@stringsasfactors causing trouble.
ps who's jojo???
I agree, I'll explore more on the node type stuff. I am still waiting for the weekend to dive into hp_datatree some more. And regarding the link, I have been thinking of making the default link " ", so it's just empty.
Corgi JoJo: https://www.supercorgijojo.com/about
😅
I have an idea about resolving multiple children in yaml (or any other tree) inputs. I'll submit a PR to your branch if I can get it to work.
Sounds good to me. I think you mentioned adding an integer to duplicates. So child1, child2, etc. I think that would be a good idea.
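For reference, base R's make.unique() does something close to this out of the box. A small sketch with made-up names; note that it leaves the first occurrence unsuffixed, so the result is child, child1, child2 rather than child1, child2, child3:

```r
children <- c("child", "child", "child", "other")

# sep = "" appends the counter directly to the duplicated name
unique_children <- make.unique(children, sep = "")
unique_children
```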
Decided to take a slightly different approach. We should not modify the from and to columns. Instead we just add a separate column to use as child. Values in child do not have to be unique, and we do not care about the paths when plotting.
This hp is generated using the code in PR #44. Notice how the repeated name "New Labs" does not impact tree generation. Furthermore, node names that contain / are handled, so we don't accidentally remove part of the node name. Let me know what you think.
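One way to get a label column like this without splitting on / is to strip the parent path from the child path. This is a sketch only and not the code from the PR. It assumes from and to hold full path strings, and the "R/D" department is invented to show a name containing a slash:

```r
edges <- data.frame(
  from = c("Acme Inc.", "Acme Inc.", "Acme Inc./Research", "Acme Inc./R/D"),
  to   = c("Acme Inc./Research", "Acme Inc./R/D",
           "Acme Inc./Research/New Labs", "Acme Inc./R/D/New Labs"),
  stringsAsFactors = FALSE
)

# Drop the parent path plus one separator; whatever remains is the node's
# own name, even if that name itself contains a "/".
edges$child <- substring(edges$to, nchar(edges$from) + 2)
edges$child
```

The from and to columns stay untouched and unique, while child repeats "New Labs" and keeps "R/D" intact.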
Awesome, this works really well! As another example, we can have 2 Toulouse's now 😼
devtools::load_all() # load hierplane from this particular branch
library(data.tree)
library(yaml)
"
name: r4fun
tyler:
  name: Tyler
  job: Data Scientist
  species: Human
  toulouse:
    name: Toulouse
    job: Systems Engineer
    species: Cat
    toulouse:
      name: Toulouse
      job: Systems Engineer
      species: Cat
  ollie:
    name: Ollie
    job: Database Administrator
    species: Dog
lucas:
  name: Lucas
  job: R Programmer
  species: Rabbit
" -> yaml
yaml %>%
  yaml.load() %>%
  as.Node() %>%
  hp_datatree(
    title = "r4fun github group",
    link = "species",
    attributes = "job"
  ) %>%
  hierplane(
    theme = "light",
    width = "auto",
    height = "auto"
  )
Here's a start:
The challenge remains that there is no way to set the node types and links per layer.