Open urbanmatthias opened 4 years ago
Hi @urbanmatthias
Yep, you're right. The union operator does work differently on a Graph than it does on a set, and it does look like it's that way on purpose.
I don't know what is more "correct" here.
I think the idea behind creating a new graph on this operation is to avoid polluting an existing graph. Or the graph a might be read-only, so the most consistent and reliable way of completing the union would be to create a new graph and union into that.
Note, I found in my testing that a += c does do what you'd expect a |= c to do. But I think that is wrong too, because += should add a single triple or a list of triples, whereas |= should union the graphs as it does for a set.
@nicholascar @white-gecko
Do you guys have any opinion on this?
My thoughts for changes in RDFLib v6.0.0 are:
a |= c (where a is a graph and c is a second graph) should modify and write into a, without creating a new graph.
a += (s, p, o) should be the same as a.add((s, p, o)).
a += [(s1, p1, o1), (s2, p2, o2)] should be the same as a.addN([(s1, p1, o1), (s2, p2, o2)]).
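To make the proposed semantics concrete, here is a minimal sketch using a hypothetical ToyGraph class. It is not rdflib's Graph and does not reflect how rdflib implements these operators; it only illustrates what "modify a without creating a new graph" could look like. The class name, its triples attribute, and the use of a plain loop over add() in place of addN() are all assumptions made for illustration.

```python
class ToyGraph:
    """Hypothetical stand-in for a graph of triples; NOT rdflib's Graph."""

    def __init__(self, triples=()):
        self.triples = set(triples)

    def add(self, triple):
        self.triples.add(triple)

    def __iter__(self):
        return iter(self.triples)

    def __len__(self):
        return len(self.triples)

    def __ior__(self, other):
        # a |= c: merge c's triples into a and return the same object.
        self.triples.update(other)
        return self

    def __iadd__(self, other):
        # a += (s, p, o) adds one triple;
        # a += [(s, p, o), ...] adds each triple in the list.
        if isinstance(other, tuple):
            self.add(other)
        else:
            for triple in other:
                self.add(triple)
        return self


a = ToyGraph([("s1", "p1", "o1")])
c = ToyGraph([("s2", "p2", "o2")])
alias = a  # second name for the same object

a |= c
a += ("s3", "p3", "o3")
a += [("s4", "p4", "o4"), ("s5", "p5", "o5")]
```

Because both operators return self, alias and a still refer to the same object afterwards, and c is left unchanged.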
I agree with @ashleysommer's suggestions to change this in v6.
Some more context from the python stdlib:
The <operator>= operators, like += or |=, are called "in place" in Python, and for mutable objects (like sets) this means that the left-hand object is changed. I don't think the Python convention is that "in place" means the left-hand object MUST be mutated (so the current implementation is not wrong), but that it CAN or SHOULD be (for performance reasons).
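One way to see why both behaviours are legal Python: if a class defines only __or__, then x |= y falls back to x = x | y and rebinds the name to a new object. Only when __ior__ is defined can the left-hand object be mutated. The sketch below uses two hypothetical classes (WithoutIor and WithIor are made-up names, unrelated to rdflib) to show the difference.

```python
class WithoutIor:
    """Hypothetical class that defines only __or__."""

    def __init__(self, items):
        self.items = frozenset(items)

    def __or__(self, other):
        # Always builds a new object.
        return WithoutIor(self.items | other.items)


class WithIor(WithoutIor):
    """Hypothetical class that also defines __ior__."""

    def __ior__(self, other):
        # Mutates the left-hand object and returns it.
        self.items = self.items | other.items
        return self


x = WithoutIor({1})
x_alias = x
x |= WithoutIor({2})  # no __ior__: Python evaluates x = x | WithoutIor({2})

y = WithIor({1})
y_alias = y
y |= WithIor({2})     # __ior__ is used: y is modified in place
```

After this runs, x is a new object and x_alias still holds only the original item, while y and y_alias are the same object and both see the merged items.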
I think the idea behind creating a new graph on this operation is to avoid polluting an existing graph.
Since in-place operators do "pollute" objects of the standard types, I don't think this behaviour is expected. If needed, a = a + c can still be used.
Or the graph a might be read-only, so the most consistent and reliable way of completing the union would be to create a new graph and union into that.
The standard library does create new objects for immutable objects, like tuples:
>>> a = (1, 2)
>>> a + (3, 4)
(1, 2, 3, 4)
>>> a += (3, 4)
>>> a
(1, 2, 3, 4)
I think changing this for v6 would be a good idea. I would expect the in place operators to actually work in place.
Actually, I do not understand what the difference between += and |= on graphs could be. I would expect both to behave in the same way, whether left and right are both graphs or left is a graph and right is a triple. Is there a difference for sets between += and |=?
@white-gecko sets do not support +=.
Updating |= to perform an in-place union would be nice. I believe it's doing an update rather than a union if we are going by set's semantics.
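For reference, a short sketch of how the built-in set behaves, using only standard Python. The variable names are arbitrary.

```python
a = {1, 2}
alias = a

a |= {3}                  # in place, equivalent to a.update({3})
same_object = a is alias  # the name still points at the original set

b = a | {4}               # equivalent to a.union({4}); builds a new set

try:
    a += {5}              # set defines neither __add__ nor __iadd__
    plus_supported = True
except TypeError:
    plus_supported = False
```

So for sets, |= corresponds to update (mutating), | corresponds to union (non-mutating), and += raises a TypeError.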
I'd like it if rdflib could keep the current |= behavior but as | and/or Graph.union - this would mirror the behavior of set and would give a migration path for code relying on |='s current behavior.
Hi,
I have a question concerning the |= operator. It seems to me that it behaves differently with rdflib graphs than it does with sets. While |= performs an in-place union when used with sets, rdflib creates a new Graph when used with Graphs. Is this on purpose?
See this minimal example: