Open dsschneidermann opened 1 year ago
Can you give an example of this issue? I am not sure that I understand your explanation of the issue. That said, I think any duplicate edge representation is caused not by the code you referenced but by this portion of the code:
e_in =
Enum.flat_map(v_all, fn v2_id ->
case Map.get(meta, {v2_id, v_id}) do
nil ->
[]
edge_meta when is_map(edge_meta) ->
v2 = Map.get(vs, v2_id)
for {label, weight} <- edge_meta do
Edge.new(v2, v, label: label, weight: weight)
end
end
end)
e_out =
Enum.flat_map(v_all, fn v2_id ->
case Map.get(meta, {v_id, v2_id}) do
nil ->
[]
edge_meta when is_map(edge_meta) ->
v2 = Map.get(vs, v2_id)
for {label, weight} <- edge_meta do
Edge.new(v, v2, label: label, weight: weight)
end
end
end)
e_in ++ e_out
Concatenating the two lists allows for duplicate entries. This makes sense as intended behavior for a directed graph but maybe not for an undirected graph. I suspect if we wanted to avoid duplicates the solution is something like:
Enum.reduce(v_all, MapSet.new(), fn v2_id, es -> get_edges(v2_id, v_id, meta) |> MapSet.union(es) end) |> Enum.to_list()
...
defp get_edges(v2_id, v_id, meta) do
[{v2_id, v_id}, {v_id, v2_id}]
|> Enum.reduce(MapSet.new(), fn {vi_id, vo_id} = vs, edges ->
case Map.get(meta, vs) do
nil ->
edges
edge_meta when is_map(edge_meta) ->
v = Map.get(vs, vi_id)
v2 = Map.get(vs, vo_id)
for {label, weight} <- edge_meta do
Edge.new(v, v2, label: label, weight: weight)
end
|> MapSet.new()
|> MapSet.union(edges)
end
end
end
Haven't tested the above just sort of thinking out loud. Can put a PR together if this approach seems to address a real problem, but I'm not sure that it does for a directed graph.
Hello, yes sorry I can see it is not very clear :)
Here is a reproduction:
defmodule GraphingTest do
use ExUnit.Case, async: true
test "graphlib" do
{graph, edges} =
for metric_key <- ["a", "b"], reduce: {Graph.new(), []} do
acc ->
input_edges =
for input <- ["a"] do
{input, metric_key}
end
{elem(acc, 0) |> Graph.add_vertex(metric_key), input_edges ++ elem(acc, 1)}
end
IO.puts("expected: ")
graph =
for {v1, v2} <- edges, reduce: graph do
acc ->
{v1, v2} |> inspect |> IO.puts()
Graph.add_edge(acc, %Graph.Edge{v1: v1, v2: v2})
end
IO.puts("actual: ")
for edge <- graph |> Graph.edges("a") do
case edge do
%{v1: v1, v2: v2} -> {v1, v2} |> inspect |> IO.puts()
end
end
end
end
Output:
expected:
{"a", "b"}
{"a", "a"}
actual:
{"a", "a"}
{"a", "a"}
{"a", "b"}
Double "a"->"a" is unexpected here. I ended up matching on the source information instead of using the graph, but I can tell from small testing that it only happens when asking specifically for Graph.edges("a")
edges - not when getting all Graph.edges()
I think that is the expected behavior given docs for Graph.edges/2
:
Returns a list of all edges inbound or outbound from vertex
v
.
For an edge with a vertex incident upon itself you would have the duplication, whereas with Graph.edges/1
it is only considering the out-edges and not yielding the duplicate. I still think this makes sense for directed graphs since you need to consider a -> a
as both directions being valid.
I have to disagree with the expectation here. It may be technically correct according to the wording of the docs, but it's a really weird edge case (hah) to actually support. What's the beneficial use of having the same edge returned twice? There simply isn't two edges in a directed graph A -> A
, or in an undirected graph A - A
.
Ultimately it is your choice to make, but I would definitely ask it flagged it in the docs specifically. IMO any reading of the docs will always assume that Graph.edges/1
returns a subset of Graph.edges/0
.
Also, for what its worth, I definitely understand any hesitation on your part to add any fixes as a breaking change, but maybe it could be scheduled as a future change.
Just to be clear, I'm not the maintainer, just an interested user who is happy to contribute pull requests for issues. I think this is an easy enough fix that would not be a breaking change. If there is agreement that A -> A
is always considered a single edge I can try to work on the fix. My real job is a bit crazy at the moment, so I probably wouldn't have time until after 2-3 weeks.
Hi and first of all, thanks for maintaining this library. It has turned out to be essential to me since I started using it.
I'm a big fan of pattern matching over edges in the graph and it's a very intuitive way to write complicated case handling. However I just discovered that I'm getting duplicated edges when I am matching on, eg.
From a glance at the code, a possible cause could be if a self-referencing edge could be described by both
v_in
andv_out
: https://github.com/bitwalker/libgraph/blob/15ff0b9ba8c22a9dfec5bc04096fb7025e58f34b/lib/graph.ex#L451But the description would have to differ, otherwise the MapSet would make it unique.. Do you think this is a correct cause and if so, is it an intended one?
I don't suppose there is any inherent disadvantage in just matching instead on all edges as such:
Looking at the code, it seems to be pretty much the same amount of work being done -- and in my situation, I always need to flat_map anyway in what I am doing.
However, someone else may be tripped up on it and have a situation where getting duplicates could cause confusion down the line.
Best regards, Dennis