Closed ntamas closed 2 years ago
Confirmed; also happens on my machine. However, I'm not familiar with the R interface so we'll have to wait for @gaborcsardi to chime in.
It is indeed a bug.
any suggestion please, i have the same problem.
Still present in the 1.3 dev version.
I'm not familiar with R enough to figure out what's going on, but apparently replacing this line:
lapply(res, function(x) V(graph1)[x + 1])
with this line
lapply(res, function(y) V(graph1)[y + 1])
in graph.get.subisomorphisms.vf2
in R/topology.R
"fixes" the problem -- but of course you then have a problem with vertex attributes named y
.
So here's a simpler way to reproduce this:
> g <- graph.full(4)
> V(g)$x <- 1:4
> x <- 3
> V(g)[x + 1]
Error in simple_vs_index(x, ii, na_ok) : Unknown vertex selected
And apparently the problem lies in a feature of V(g)
indexing that I was not aware of before:
> g <- graph.full(4)
> V(g)$x <- 1:4
> V(g)$y <- 4:1
> V(g)[x >= 2 & y >= 2]
In this case, the expression x >= 2 & y >= 2
gets evaluated in a context where x
and y
refer to the corresponding vertex attributes of the graph. The magic happens in this part of R/iterators.R
:
attrs <- vertex_attr(graph)
xvec <- as.vector(x)
for (i in seq_along(attrs)) attrs[[i]] <- attrs[[i]][xvec]
res <- lazy_eval(
args,
data = c(attrs, .nei = .nei, nei = nei, .innei = .innei,
innei = innei, .outnei = .outnei, outnei = outnei, adj = adj,
.inc = .inc, inc = inc, .from = .from, from = from, .to = .to,
to = to)
)
res <- lapply(res, function(ii) {
if (is.null(ii)) return(NULL)
ii <- simple_vs_index(x, ii, na_ok)
attr(ii, "env") <- attr(x, "env")
attr(ii, "graph") <- attr(x, "graph")
class(ii) <- class(x)
ii
})
where we basically retrieve all the vertex attributes of the graph, subset them to the current vertex sequence, and then create a lazy object with an associated data frame that is pre-filled with the vertex attributes and some other stuff (nei
, from
etc). This is nice but it interferes with the evaluation of the expression in the brackets in V(g)[...]
because we first look things up in this data frame and only look at the environment of the lazy expression when the lookup fails. Actually, this is even documented behaviour (source):
When indexing vertex sequences, vertex attributes can be referred to simply by using their names. E.g. if a graph has a name vertex attribute, then V(g)[name == "foo"] is equivalent to V(g)[V(g)$name == "foo"].
A workaround for this is to use a "private" variable name like .x
in the code, e.g.: lapply(res, function(.x) V(graph1)[.x + 1]
, with the implicit assumption that users won't create vertex attribute names starting with a dot. However, this will still be a common pitfall for people who routinely write code like function(x) V(g)[do_something_with_x]
.
The workaround is now used and I think it's good enough for our purposes.
Of course this still fails if there is an attribute called .x
.
Yes, but by convention, names starting with a dot are considered private in R. E.g., ls()
in R does not list variables starting with a .
unless all.names=TRUE
is passed to it. That's why I said that it's "good enough" for our purposes. If you want to break it, surely you can break it :)
This is about attribute names, though, not R variables. Attributes may come from sources outside of R, such as when importing files.
For example: When I was thinking about how to handle compound attributes in GML files, one of the ideas was to use the same notation as in the GML specs. This naturally leads to encoding top-level attributes names to strings that start with dots. I no longer think that this is a good idea, but it shows the problem: at the time I was working with the C core, not thinking about R, and not aware of this convention in R.
Still, we can't change the behaviour of the [
operator now without breaking lots of existing code for users, and most of that code is probably in private R notebooks (and not in library code -- at least the revdep checks would discover those). I believe that whoever knows that attributes can be referred to in the [
operator is happy to use this feature (I find it quite convenient, albeit a bit confusing at first), so I don't think it's a niche thing that we could just remove.
What we can do, though, is to go through the library code of igraph and make sure that we use the [
operator in a way that existing variables do not conflict with attributes at the point of usage.
Commit 4ccd00f7 adds a resolve_attrs=...
option to vertex / edge indexing so you can turn attribute lookup off in problematic cases.
From @barryrowlingson on May 25, 2016 17:48
Make a graph and look for squares:
Add some vertex attributes, all still works:
Unless its called
x
:I'm getting a bit of a deja-vu here, have I reported something like this before?
igraph v 1.0.1, R 3.2.2
Copied from original issue: igraph/igraph#945