igraph / rigraph

igraph R package
https://r.igraph.org
543 stars 201 forks source link

Inconsistent defaults for the 'mode' parameter #622

Open szhorvat opened 1 year ago

szhorvat commented 1 year ago

Describe the bug

The default value of the common mode parameter differs from function to function. In some functions it's out, in others it's all. This is a usability bug, as it makes igraph unpredictable.

To reproduce

In shortest_paths(), the default is out. In the closely related distances(), the default is all.

You can use a grep line such as rg 'mode *= *c\\(.*?\\)' to see many of the defaults.

Grep output ```r scan.R 106: weighted = FALSE, mode = c("out", "in", "all"), iterators.R 514: .nei <- function(v, mode = c("all", "in", "out", "total")) { 539: .innei <- function(v, mode = c("in", "all", "out", "total")) { 546: .outnei <- function(v, mode = c("out", "all", "in", "total")) { structural.properties.R 200: mode = c("all", "out", "in", "total"), loops = TRUE, 418: mode = c("all", "out", "in"), 506: mode = c("out", "all", "in"), 602: mode = c("out", "all", "in"), 675:subcomponent <- function(graph, v, mode = c("all", "out", "in")) { 1048: mode = c("default", "ratio")) { 1114: mode = c("all", "out", "in"), mindist = 0) { 1204: mode = c("all", "out", "in"), mindist = 0) { 1234: mode = c("all", "out", "in"), mindist = 0) { 1292:coreness <- function(graph, mode = c("all", "out", "in")) { 1342:topo_sort <- function(graph, mode = c("out", "all", "in")) { 1615:bfs <- function(graph, root, mode = c("out", "in", "all", "total"), 1780:dfs <- function(graph, root, mode = c("out", "in", "all", "total"), 1889:components <- function(graph, mode = c("weak", "strong")) { 1950:unfold_tree <- function(graph, mode = c("all", "out", "in", "total"), roots) { paths.R 58: mode = c("out", "in", "all", "total"), interface.R 272:neighbors <- function(graph, v, mode = c("out", "in", "all", "total")) { 314:incident <- function(graph, v, mode = c("all", "out", "in", "total")) { 512: mode = c("out", "in", "all", "total")) { 555: mode = c("out", "in", "all", "total")) { flow.R 546:dominator_tree <- function(graph, root, mode = c("out", "in", "all", "total")) { rewire.R 131:each_edge <- function(prob, loops = FALSE, multiple = FALSE, mode = c("all", "out", "in", "total")) { components.R 27:count_components <- function(graph, mode = c("weak", "strong")) { 101:decompose <- function(graph, mode = c("weak", "strong"), max.comps = NA, layout.R 432: rootlevel = numeric(), mode = c("out", "in", "all"), conversion.R 362:as.undirected <- function(graph, mode = c("collapse", "each", "mutual"), edge.attr.comb = igraph_opt("edge.attr.comb")) { 423: mode = c("all", "out", "in", "total"), 463: mode = c("all", "out", "in", "total"), centrality.R 257: mode = c("out", "in", "all", "total"), weights = NULL, 295:estimate_closeness <- function(graph, vids = V(graph), mode = c("out", "in", "all", "total"), cutoff, weights = NULL, normalized = FALSE) { incidence.R 198: mode = c("all", "out", "in", "total"), centralization.R 150:centr_degree_tmax <- function(graph = NULL, nodes = 0, mode = c("all", "out", "in", "total"), loops = FALSE) { aaa-auto.R 35:graph_from_adj_list <- function(adjlist, mode=c("out", "in", "all", "total"), duplicate=TRUE) { 346:harmonic_centrality <- function(graph, vids=V(graph), mode=c("out", "in", "all", "total"), weights=NULL, normalized=FALSE, cutoff=-1) { 683:knn <- function(graph, vids=V(graph), mode=c("all", "out", "in", "total"), neighbor.degree.mode=c("all", "out", "in", "total"), weights=NULL) { 708:strength <- function(graph, vids=V(graph), mode=c("all", "out", "in", "total"), loops=TRUE, weights=NULL) { 748:centr_degree <- function(graph, mode=c("all", "out", "in", "total"), loops=TRUE, normalized=TRUE) { 791:centr_clo <- function(graph, mode=c("out", "in", "all", "total"), normalized=TRUE) { 805:centr_clo_tmax <- function(graph=NULL, nodes=0, mode=c("out", "in", "all", "total")) { 906:eccentricity <- function(graph, vids=V(graph), mode=c("all", "out", "in", "total")) { 922:radius <- function(graph, mode=c("all", "out", "in", "total")) { 958:random_walk <- function(graph, start, steps, mode=c("out", "in", "all", "total"), stuck=c("return", "error")) { 979:random_edge_walk <- function(graph, start, steps, weights=NULL, mode=c("out", "in", "all", "total"), stuck=c("return", "error")) { 1029:local_efficiency <- function(graph, vids=V(graph), weights=NULL, directed=TRUE, mode=c("all", "out", "in", "total")) { 1054:average_local_efficiency <- function(graph, weights=NULL, directed=TRUE, mode=c("all", "out", "in", "total")) { 1116:is_connected <- function(graph, mode=c("weak", "strong")) { 1336:similarity.jaccard <- function(graph, vids=V(graph), mode=c("all", "out", "in", "total"), loops=FALSE) { 1351:similarity.dice <- function(graph, vids=V(graph), mode=c("all", "out", "in", "total"), loops=FALSE) { 1366:similarity.invlogweighted <- function(graph, vids=V(graph), mode=c("all", "out", "in", "total")) { 1472:as.directed <- function(graph, mode=c("mutual", "arbitrary", "random", "acyclic")) { 1569:dominator_tree <- function(graph, root, mode=c("out", "in", "all", "total")) { 2278:is_tree <- function(graph, mode=c("out", "in", "all", "total"), details=FALSE) { games.R 1049:connect <- function(graph, order, mode = c("all", "out", "in", "total")) { 1285: directed = FALSE, mode = c("out", "in", "all")) { make.R 973:make_star <- function(n, mode = c("in", "out", "mutual", "undirected"), 1168:make_tree <- function(n, children = 2, mode = c("out", "in", "undirected")) { 1523: mode = c("all", "out", "in")) { ```

Notes

In analysis functions which take a graph as input, the most reasonable default is almost always out. This basically means that edge directions are not ignored: directed graphs are treated as directed, undirected ones as undirected. It should go with a default value of TRUE for the directed parameter. (Note that some graph generators also have a directed parameter. That's not what we are discussing here, as those functions create a graph instead of taking one as input.)

There are a very few special cases where ignoring edge directions by default might make sense. For example, one might argue that this is the case with modularity(), as most community detection functions in igraph don't consider edge directions anyway. However, modularity() actually has directed=TRUE as the default. I can't currently think of any other functions where ignoring edge directions by default makes any sense.

Version information

Current dev, to become 1.3.6

ntamas commented 1 year ago

This has been a long-standing issue, but unfortunately it's hard to fix without breaking compatibility with older versions, that's why it's been put aside for years. Changing the defaults would almost surely break some of the reverse dependencies.

I guess the best we can do in the mid-term is to 1) decide for which functions it makes sense to change the default (which would probably always mean switching from all to out), 2) print a warning that the default is about to change for cases when the user did not provide the value of the argument explicitly (if there is a way to detect this), 3) wait a few versions so we can claim to CRAN maintainers that we've given due notice to end users and then 4) change the default after a few months' time.