Closed calummaney closed 2 months ago
Huh. This looks like it's due to plot.psem()
getting the info for the nodes from coefs()
. But, coefs()
here yields
Response Predictor Estimate Std.Error DF Crit.Value P.Value Std.Estimate
1 y x 0.0304 0.1786 97 0.1702 0.8652 -
2 y cat - - 1 1.4748 0.2275 -
3 y cat = 2 9.714 0.2681 97 36.2392 0.0000 - ***
4 y cat = 1 10.1758 0.2681 97 37.9618 0.0000 - ***
I also noticed that even after filtering out that row, a bunch of other issues come up. This was super useful to mention, and now I have implemented several fixes in the function. One of which is the default for plot.psem()
is standardized coefs (show = "std"
), but, you don't have them here! So I made a new warning message.
OK - let's fix it! @jslefche can you update this new plot.psem()
code in whatever branch you have going?
plot.psem <- function (x, return = FALSE,
node_attrs = data.frame(shape = "rectangle",
color = "black",
fillcolor = "white"),
edge_attrs = data.frame(style = "solid",
color = "black"),
ns_dashed = T, alpha = 0.05, show = "std",
digits = 3, add_edge_label_spaces = TRUE, ...)
{
ctab <- coefs(x)
# remove categorical predictors with ANOVA results with no estimate
ctab <- ctab[-which(ctab$Estimate=="-"),]
# std estimate error
if(show == "std" & "-" %in% ctab$Std.Estimate) stop("Std. Estimates requested, but some paths cannot have Std. Estimates calculated.\nSee coefs() of your model and consider setting show to 'unstd'.\n")
# in case categorical variabes were there but are now gone
ctab$Estimate <- as.numeric(ctab$Estimate)
ctab$Response <- as.character(ctab$Response)
ctab$Predictor <- as.character(ctab$Predictor)
unique_nodes <- unique(c(ctab$Response, ctab$Predictor))
nodes <- DiagrammeR::create_node_df(n = length(unique_nodes), nodes = unique_nodes,
type = "lower", label = unique_nodes)
nodes <- cbind(nodes, node_attrs)
nodes[] <- lapply(nodes, as.character)
nodes$id <- as.numeric(nodes$id)
edges <- DiagrammeR::create_edge_df(from = match(ctab$Predictor, unique_nodes),
to = match(ctab$Response, unique_nodes))
edges <- data.frame(edges, edge_attrs)
edges[] <- lapply(edges, as.character)
edges$id <- as.numeric(edges$id)
edges$from <- as.numeric(edges$from)
edges$to <- as.numeric(edges$to)
if (ns_dashed)
edges$style[which(ctab$P.Value > alpha)] <- "dashed"
if (show == "std")
edges$label = round(ctab$Std.Estimate, digits)
if (show == "unstd")
edges$label = round(ctab$Estimate, digits)
if (add_edge_label_spaces)
edges$label = paste0(" ", edges$label, " ")
sem_graph <- DiagrammeR::create_graph(nodes, edges, directed = TRUE)
if (return)
return(sem_graph)
DiagrammeR::render_graph(sem_graph, ...)
}
Now, if you ran you code above, you get this error:
Error in plot.psem(psem(lm(y ~ x + cat, data = df), data = df)) :
Std. Estimates requested, but some paths cannot have Std. Estimates calculated.
See coefs() of your model and consider setting show to 'unstd'
But if we run
df <- data.frame(y = rnorm(n = 100, 10,2), x = rnorm(n=100,5,1),cat = factor(c(1,2)))
psem(
lm(y ~ x + cat, data = df),
data=df) |> plot(show = "unstd")
Victory!
As a further thought for @jslefche
coefs()
- I'm a little puzzled by that. plot.psem()
to remove categorical variables? I ask this as, for myself anyway, if I'm using Econometric Fixed Effects to handle causality problems, sometimes I have a TON of levels and really don't want to see any of it. I suppose this would mean filtering out anything with " = " from the ctab? Or did you implement some other schema for categorical variables? Hrm. Actually. This could get tricky if there were some categorical variables you DID want to see. So, perhaps a better option - and generalizable to more cases, would be a "remove_node" option.Actually - ok, here's a version that does #2. Although it's a bit greedy (i.e., if you removed a node and another node had a part of the variable name in it, that would not work - but this is to get rid of all of the levels..... think about this more, maybe. I just woke up and haven't had my coffee)
plot.psem <- function (x, return = FALSE,
node_attrs = data.frame(shape = "rectangle",
color = "black",
fillcolor = "white"),
edge_attrs = data.frame(style = "solid",
color = "black"),
ns_dashed = T, alpha = 0.05, show = "std",
remove_nodes = NA,
digits = 3, add_edge_label_spaces = TRUE, ...)
{
ctab <- coefs(x)
# remove categorical predictors with ANOVA results with no estimate
ctab <- ctab[-which(ctab$Estimate=="-"),]
# Are there other nodes we don't want?
# add a layer of error checking
if(!is.na(remove_nodes) & !sum(grepl(remove_nodes,
unique(c(ctab$Response, ctab$Predictor))))>0){
warning(paste0(remove_nodes, "not in ",
unique(c(ctab$Response, ctab$Predictor)),
".\nTypo?"))
}
if(!is.na(remove_nodes) & sum(grepl(remove_nodes, ctab$Predictor))>0){
ctab <- ctab[!grepl(remove_nodes, ctab$Predictor),]
}
if(!is.na(remove_nodes) & sum(grepl(remove_nodes, ctab$Response))>0){
ctab <- ctab[-grepl(remove_nodes, ctab$Response),]
}
# std estimate error
if(show == "std" & "-" %in% ctab$Std.Estimate) stop("Std. Estimates requested, but some paths cannot have Std. Estimates calculated.\nSee coefs() of your model and consider setting show to 'unstd'.\n")
# in case categorical variabes were there but are now gone
ctab$Estimate <- as.numeric(ctab$Estimate)
ctab$Response <- as.character(ctab$Response)
ctab$Predictor <- as.character(ctab$Predictor)
# make nodes
unique_nodes <- unique(c(ctab$Response, ctab$Predictor))
nodes <- DiagrammeR::create_node_df(n = length(unique_nodes), nodes = unique_nodes,
type = "lower", label = unique_nodes)
nodes <- cbind(nodes, node_attrs)
nodes[] <- lapply(nodes, as.character)
nodes$id <- as.numeric(nodes$id)
edges <- DiagrammeR::create_edge_df(from = match(ctab$Predictor, unique_nodes),
to = match(ctab$Response, unique_nodes))
edges <- data.frame(edges, edge_attrs)
edges[] <- lapply(edges, as.character)
edges$id <- as.numeric(edges$id)
edges$from <- as.numeric(edges$from)
edges$to <- as.numeric(edges$to)
if (ns_dashed)
edges$style[which(ctab$P.Value > alpha)] <- "dashed"
if (show == "std")
edges$label = round(ctab$Std.Estimate, digits)
if (show == "unstd")
edges$label = round(ctab$Estimate, digits)
if (add_edge_label_spaces)
edges$label = paste0(" ", edges$label, " ")
sem_graph <- DiagrammeR::create_graph(nodes, edges, directed = TRUE)
if (return)
return(sem_graph)
DiagrammeR::render_graph(sem_graph, ...)
}
FYI, with this last bit
psem(
lm(y ~ x + cat, data = df),
data=df) |> plot(show = "unstd", remove_nodes = "cat")
produces
Works pretty well, cheers! Hope this can be incorporated.
Hello,
Many thanks for your continued work on this project - it's amazingly useful for the project I'm working on.
I'm hoping you can help me out with an issue I'm seeing when trying to plot a fitted psem model. I include an exogenous categorical variable but plotting the resulting model doesn't seem to work. The following simple reprex throws an error:
"Error in round(ctab$Std.Estimate, digits) : non-numeric argument to mathematical function"
Please let me know if I am making a mistake with this sort of thing, or what may be causing the errors.
Thanks very much if you are able to help!
Cal