Closed Xrayez closed 2 years ago
Is it possible to inherit the classes and get the class instead of the GraphVertex and GraphEdge? RoadIntersection inherits GrahVertex RoadSegment inherits GraphEdge maybe when the graph is being created with _init() maybe I could pass the two classes and use it in graph, is it possible?
Is it possible to inherit the classes and get the class instead of the GraphVertex and GraphEdge?
In theory this is possible, and yeah I think it makes sense to add.
When you call add_edge()
, you could pass your own GraphVertex
rather than letting the method to construct a new vertex pair from values. However, add_edge
would still return base GraphEdge
since it needs to create something.
The current workaround would be to assign your own RoadIntersection
and RoadSegment
(that extend whatever) as a value of GraphVertex.value
and GraphEdge.value
.
The idea here is that a Graph
should take ownership over those objects, even if they are created externally. However, you can also call GraphVertex.free()
and GraphEdge.free()
, and the Graph
will still be able to update the structure, so there's no way to break something and Graph
will cleanup everything upon destruction.
maybe when the graph is being created with _init() maybe I could pass the two classes and use it in graph, is it possible?
There's no way to override _init()
in C++, unfortunately. See some issues in godotengine/godot-proposals#1522.
Since you'd like to have your own versions of GraphVertex
and GraphEdge
, how about extending a Graph
as well? I'm thinking that we could add _create_vertex()
and _create_edge()
as virtual methods that you could override for your own purposes:
extends Graph
func _create_vertex() -> GraphVertex:
return RoadIntersection.new()
func _create_edge() -> GraphEdge:
return RoadSegment.new()
If we look at AStar
, there will likely be other virtual methods to be implemented.
Some performance notes on the implementation. I'm not particularly going to use this, but I figured I'd post here since it's a bit interesting.
I see you use List
in your functions, but why? Linked lists perform a lot of memory allocations and take more space than vectors, and there are so many ways to just use a vector or deque instead that I have yet to find a common use case for them, even though I made a graph class myself (for a DAG, in a node-based tool). If the only reason is because of being able to remove in the middle of iteration cheaply, you can also do that in vectors by swapping with the last element (order is rarely a concern, but if it is I guess you have to shift items, but even with lists you have to iterate items anyways to find those you want, see remove_if). If lists are reaaaally needed here, I would at least suggest using a pool allocator in the future maybe.
The same goes for vertices and edges, they are heap-allocated full-fledged Objects, when I would probably have made them optional (as metadata?), so by default they are lightweight and maybe not heap-allocated (note, Maps already heap-allocate their elements internally even if they are just pointers), but if you want to go for OOP and let users access vertices as objects or even inherit vertex classes with scripts, it might be a better option, though will be less efficient with large graphs.
In other areas like remove_vertex
you also use Set
but I'm not sure why. Is this because you expect edges to be found multiple times? If you have a way to find all edges connected to a vertex then there would be no chance to find twice the same one, which means no need for a Set and a temporary vector can be used (here I only iterate through vectors https://github.com/Zylann/godot_voxel/blob/d1da685faccbeb17a2bb2b2209d68368b83f7b87/generators/graph/program_graph.cpp#L58). There is an ordered-removal from a list of connections, but even if List was used instead it would have had to iterate it either way).
Why use Map
in graph data, instead of HashMap
? HashMap
lookup is faster.
Also, the fact you use a pointer as key means querying a vertex will require such object and even if you delete and recreate, the same key might appear. For example that means you can't have IDs that remain unique after alterations of the data, and serializing the data might need extra work as you'll need to remap pointers.
Note about temporary data structures: if a frequently used algorithm needs a vector temporarily, a cheap trick to speed it up is:
static thread_local std::vector<Type> s_data;
The Godot equivalent would be static thread_local LocalVector s_data
. The idea is that memory is re-used between runs so if the function has to be called often then allocations won't cost anything after a short while.
If the only reason is because of being able to remove in the middle of iteration cheaply, you can also do that in vectors by swapping with the last element
I guess it depends on whether sparse graphs will be used. I'm less experienced with vector tricks... 😢
but if you want to go for OOP and let users access vertices as objects or even inherit vertex classes with scripts, it might be a better option, though will be less efficient with large graphs.
Yeah, flexibility, ease-of-use have higher priority here. It would also make sense to have different kinds of graphs implemented for undirected, directed, multigraph etc. for even better performance given concrete assumptions, but I'd just like to have something usable at first and not to rely on and hack AStar
class for everything in Godot. 🙂
This also makes it easier to use hierarchical/nested graphs.
For large, dense graphs, perhaps we could implement LargeGraph
at some point.
In other areas like
remove_vertex
you also useSet
but I'm not sure why. Is this because you expect edges to be found multiple times? If you have a way to find all edges connected to a vertex then there would be no chance to find twice the same one, which means no need for a Set and a temporary vector can be used
Note that GraphEdge
is an actual object here, not just a relationship. When add_edge(a, b)
gets called, it instantiates a GraphEdge
object, which is inserted into adjacency list of the a
and b
vertices. The same applies with directed edges for get_successors()
and get_predecessors()
to work. So indeed, it's very likely the same edge is going to be present in several lists. I haven't come up how to solve this more efficiently yet.
Why use
Map
in graph data, instead ofHashMap
?HashMap
lookup is faster.
Map
was easier to use and I was already familiar with it. I'll likely replace it with HashMap
eventually.
Also, the fact you use a pointer as key means querying a vertex will require such object and even if you delete and recreate, the same key might appear. For example that means you can't have IDs that remain unique after alterations of the data, and serializing the data might need extra work as you'll need to remap pointers.
Wouldn't this be the case if IDs are used as well? I mean, you must have a key anyway. What do you mean by "same key might appear" problem? Note that even if you attempt to free the vertex/edge object, those will request the graph to erase keys properly and won't remain dangling.
But yeah, I'll likely make Graph
extend Resource
eventually. Importing from other formats like GraphML could also be a possibility maybe...
The Godot equivalent would be
static thread_local LocalVector s_data
Thanks, I'll see how and when this may speed up things.
A lot of this is a learning experience for me, however current implementation seems to work okay-ish so far.
A general-purpose Graph
structure could easily become hairy, however I do realize that certain optimizations could be done without hurting usability. My aim is certainly not to replicate graphs from Boost library (at least for now). That would require heavy metaprogramming which is not compatible with how Godot works, not mentioning my skillset... 😛
Wouldn't this be the case if IDs are used as well? I mean, you must have a key anyway. What do you mean by "same key might appear" problem? Note that even if you attempt to free the vertex/edge object, those will request the graph to erase keys properly and won't remain dangling.
You don't control the value of pointers, but you can control IDs. It can either be auto increment (which will be unique for as long as you need) or even user-defined if that makes some things easier. It is also easier to serialize (to file or network), while serializing a key will be harder if they are objects because the pointer won't carry over to the file or network. Most likely the user will have to maintain a map themselves. About the same key appearing twice, I think it's only a matter in some particular C++ cases because the pointer might come up again later for a different object (unless you never delete them, think for Undo/Redo for example), thats not going to be noticeable in GDScript actually.
About the same key appearing twice, I think it's only a matter in some particular C++ cases because the pointer might come up again later for a different object (unless you never delete them, think for Undo/Redo for example), thats not going to be noticeable in GDScript actually.
Yes, that's why I raised the question, because it's an implementation detail as of now. But as you say, introducing IDs would be needed for serialization purposes, so I'm likely going to refactor the implementation to enable easier serialization.
About the undo/redo part, this marginally reminded me of #162 where I keep the history of draw commands. Perhaps using IDs could help with visualizing/debugging algorithms. 🤔
I've done some major refactor to use HashMap
. I've split the data into vertices
and edges
. I've also figured that neighbors can be kept track of during insertion. Some methods simplified down to one-liners. The code is 50 lines shorter now. I'll need to move get_neighbors()
and family to GraphVertex
class itself.
Still using lists for edges, but I really wonder what kind of performance impact this has. I'll leave this for the final touch and/or when I feel motivated enough to do this sort of optimization with vectors... 😭
Unfortunately, I'm not sure whether I've noticed major performance improvements. But I think the current state is certainly an improvement structurally. I've only referred to AStar
implementation in intellectual crisis, but mostly this is me reinventing the wheel for the sake of learning, and hopefully this resulted in usable API that I wouldn't be able to come up if I were to copy-paste AStar
structures blindlessly. 😄
Again, for reference, I took inspiration from https://networkx.org/ API.
Added a way to create custom scripted graph vertices and edges.
extends Node2D
func _ready():
var road = Road.new()
var a = road.add_intersection(Vector2(0, 0))
var b = road.add_intersection(Vector2(100, 100))
var s = road.add_segment(a, b)
for x in road.get_intersections():
print("Intersection: ", x.value)
assert(a is RoadIntersection)
assert(s is RoadSegment)
print(a.has_lights())
print(s.is_slippery())
As you see, you can completely hide the entire Graph
interface if you really need/want to (for readability).
Example project graph_scripted.zip
So now, how will you differentiate multiple edges? Like if edge_a starts at a and ends at b, and another edge let's say edge_b starts at a and ends at b, how will you differentiate both of them? I have introduced segment_id that's generated like this
func get_id(min_vec: Vector3):
if id != -1:
return id
var from_bit_left_shift = 18
var seg_type_left_shift = 6 + from_bit_left_shift
var custom_left_shift = 12 + seg_type_left_shift
var modder_left_shift = 10 + custom_left_shift
id = int(((modder_id << modder_left_shift)
| (custom_id << custom_left_shift)
| (seg_type << seg_type_left_shift)
| (start_position.intersection.get_id(min_vec) << from_bit_left_shift)
| (end_position.intersection.get_id(min_vec))))
Since edges are actual objects here, they are always unique, even if they are linked to the same vertices. You can fetch all edges with get_edge_list()
:
var edges_between_ab: Array = graph.get_edge_list(a, b)
It's also up to you to define whether you allow multiple edges between vertices, even for undirected edges (because they can have different weights, for instance).
The ids are generated internally for the purpose of computing the hash for storing and probably graph data serialization at some point. I think we could expose the internal ID via get_id()
(both for vertices and edges), but I'm not sure whether it's actually needed with current structure.
Looking at your snippet, I figured that there's actually no function for searching vertices by values in my implementation. I'll add find_vertex(value)
.
Updates so far:
Using LocalVector
over List
for improved performance (up to x20 improvement for dense graphs with multiple edges).
If the only reason is because of being able to remove in the middle of iteration cheaply, you can also do that in vectors by swapping with the last element (order is rarely a concern
@Zylann interestingly, I've noticed that LocalVector
has remove_unordered()
, so this was easier to implement. 🙂
Added connectivity methods to find connected components starting from arbitrary root node. This involves depth-first search.
Added a way to pass and use graph iterators (can be created via script by extending GraphIterator
). Currently, GraphIteratorDFS
is the default implementation used internally. For instance, you could create GraphIteratorBFS
for breadth-first search (I'll likely implement BFS as well, though).
var V = graph.iterator # GraphIteratorDFS
V.initialize(root) # Some arbitrary vertex in an existing Graph
while V.has_next():
var n = V.next()
You could use find_connected_component()
for that purpose as well and iterate the returned array instead of course, but if you have to search for specific value/solution in a large graph, that would be overkill (especially when searched value/solution can be found early). So in that case, you'd use a GraphIterator
.
You can also implement the AStar algorithm if possible. Also where can I download this graph version? Found it haha.
btw, could implement a way so we can pass extra info to the constructor of our own classes. Like: I want to pass the network info of the road intersection when creating it, so when I create a vertex using the graph system.
graph.add_vertex(value, bind_args)
and then the value of the bind args could be passed like this.
func _create_vertex(bind_args):
return RoadIntersection.new(bind_args["position"], bind_args["road_info"])
it could also be a array, just pass in whatever you get in the bind args haha.
Add a way to fetch the internal ids too, would be really helpful! Just wanted to ask, is the ID of the vertex based on the position or based on the index of the vertex list? it'd be nice if it were based on the position instead.
Also how do you differentiate the multi-edged graph with ids (not with objects)?
You can also implement the AStar algorithm if possible.
I think this could be added by implementing shortest_path()
method. But since AStar
already solves this, this is low priority, especially when you can use existing shortest_path_tree()
method to solve a variety of path finding problems (see documentation and the snippet there on how to extract any shortest path. What characterizes A is the presence of heuristic function and explicit target vertex (from, to), while Dijkstra works like breadth-first search (bucket-fill), but allows to find all shortest* paths from source to any other vertex.
pass extra info to the constructor of our own classes.
I think we have some misunderstanding here with virtual callback vs signal callback. The point of those virtual callbacks is to override vertex/edge instantiation. Since you're extending Graph
already and have your own class for GraphVertex
. you can add wanted functions (just as an example):
class_name RoadIntersection extends GraphVertex
func _init(p_position, p_road_net):
value = {}
value.road_net = p_road_net
value.position = p_position
But when you call graph.add_vertex(value)
, it will store the value
in that vertex already.
Notice that you can use value
property to initialize anything: Array
, Dictionary
, Vector2
etc. If you do this, you'll be able to automatically serialize the graph.
From what I can tell, it appears to me that you want to have the road_net
(graph?) to be accessible from any other vertex. In that case, I think it makes sense to add GraphVertex::get_graph()
indeed.
Add a way to fetch the internal ids too, would be really helpful!
What for? Can you show where having access to internal IDs would be helpful?
Just wanted to ask, is the ID of the vertex based on the position or based on the index of the vertex list? it'd be nice if it were based on the position instead.
IDs are generated starting from 1
and incremented on every vertex creation. How do you define position
in this context? If you say that position is Vector2()
, then it's a concrete concept. The Graph
class is agnostic to what kind of data you'd like to use (not necessarily geometric meaning), apart from the weights used by methods such as minimum_spanning_tree()
or shortest_path_tree()
which interpret value
of GraphEdge
s as a float
.
Also how do you differentiate the multi-edged graph with ids (not with objects)?
Is comparing objects impossible in your case? I'm asking because current architecture is object-oriented, and mostly everything was designed that way in mind. I'd just like to understand what you struggle with.
See also my previous post: https://github.com/goostengine/goost/pull/172#issuecomment-1026174538
class_name RoadIntersection extends GraphVertex
func _init(p_position, p_road_net):
value = {}
value.road_net = p_road_net
value.position = p_position
how would you pass the p_road_net and p_position?
From what I can tell, it appears to me that you want to have the
road_net
(graph?) to be accessible from any other vertex. In that case, I think it makes sense to addGraphVertex::get_graph()
indeed.
I'm sorry, it was road_info (RoadNetworkInfo (for rendering system and ai stuff)).
Is comparing objects impossible in your case? I'm asking because current architecture is object-oriented, and mostly everything was designed that way in mind. I'd just like to understand what you struggle with. It is possible, but I think intergers could be much faster, yeah, I understand that the current architecture is object-oriented, but I think it would be nice to use IDs to keep it similar to AStar resource.
My Graph actually uses ID based system, and ids are generated based on the position, there's a get_id
method that returns the id for the specific intersection, as you see my old snippet, I also have a segment_id
generator that does some advanced stuff so I can pass the id
to the shader and the shader could get back all the info it needs. But the main goal for the implementation of my segment_id
stuff was to differentiate two segments/edges.
how would you pass the p_road_net and p_position?
Here's another way:
# Add
var intersection = graph.add_vertex({id=10, position = Vector2(x, y), road_net = info})
# Access
var id = intersection.value.id
var position: Vector2 = intersection.value.position
var road_net: Dictionary = intersection.value.road_net
You can bypass all of this and simply use Graph
as a data structure. You can define your own ID system this way if that's how your current system works. The question is whether you really need to use a general-purpose Graph
here, as stated in https://github.com/godotengine/godot-proposals/issues/3848#issuecomment-1019508463. If you do feel like you'd benefit from using a general-purpose implementation, I think it just takes time to adapt your current architecture.
The fact that the internal implementation uses IDs does not mean that you should rely on it or even assume that it exists.
It is possible, but I think intergers could be much faster, yeah, I understand that the current architecture is object-oriented, but I think it would be nice to use IDs to keep it similar to AStar resource.
I think that actually calling get_id()
twice in GDScript would be slower over comparing objects (which are mostly pointers), and the difference is really negligible (if at all), therefore not worth to be optimizing (in exchange for flexibility). See for instance performance research in godotengine/godot#43343.
I thought about using IDs as well initially, but see https://github.com/godotengine/godot-proposals/issues/3848#issuecomment-1020696105. Otherwise I'd be using AStar
.
So again, the priority here is flexibility due to needed generality. Yet it's fast enough.
I think the core implementation is there. I'd like to merge this to let other people to try it out, and let https://github.com/goostengine/goost-fuzzer find the bugs that perhaps I couldn't find myself. There will likely be some compatibility changes.
See example project at https://github.com/goostengine/goost-examples/tree/gd3/core/graph.
Thanks for feedback!
Could you please add ability to fetch all edges a vertex is connected? Something like GraphVertex.get_neighbouring_edges()
@sairam4123 makes sense, but I'll probably need to refactor how get_successors()
and get_predecessors()
work, because those are not that efficient compared to get_neighbors()
. That said, I'd add:
GraphVertex.get_edges()
(for fetching all neighboring edges, I think that neighboring
is redundant here, since having an edge implies having a connected vertex)GraphVertex.get_output_edges()
(for successors vertices), or get_outward_edges()
?GraphVertex.get_input_edges()
(for predecessor vertices), or get_inward_edges()
?But do note that it's possible to do this via script at the moment.
Thanks! I was thinking about it too, it should be really helpful for me!
get_inward_edges()
and get_outward_edges()
seems good but, I might recommend get_incoming_edges()
and get_outgoing_edges()
. Your choice though! 😄
Closes godotengine/godot-proposals#3848.
Intended to implement a mixed graph data structure. Allows to add both associative (undirected) and directed edges, with multiple edges between a pair of vertices, and self-loops.
(annotations are optional)
To-do