neo4j / graph-data-science

Source code for the Neo4j Graph Data Science library of graph algorithms.
https://neo4j.com/docs/graph-data-science/current/

Can Message object be more than just doubles #162

Closed adamggibbs closed 2 years ago

adamggibbs commented 2 years ago

I'm trying to implement a graph algorithm using the Pregel API, but it requires sending objects between nodes rather than just primitive data types. I see in the ComputeStep.java class that messages are only sent as doubles or as types cast to doubles. Could functionality be added to send more types as messages?

Also, if I'm incorrect and this functionality already exists please let me know. Thanks!

s1ck commented 2 years ago

Your observation is correct, messages are limited to primitive double. This is mainly to avoid the overhead added by using objects / boxed types. We thought about supporting generic messages, but haven't had requests for it yet. We'll discuss the topic when we plan upcoming releases.

That being said, you might work around the problem by dividing your pregel computation into phases, where each phase sends a part of the message. This is typically done in computations where the computation changes depending on the phase you are in (a phase is basically superstep mod no_of_phases). Also, the node state can be more than one value (the computation schema), which allows you to store intermediate message states at a node. I'm aware that this would result in more iterations, but if it helps get the job done for now, it might be worth trying.
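To illustrate the phase idea, here is a minimal plain-Java sketch. It does not use the GDS Pregel API; the class name, NUM_PHASES, and all method names are hypothetical. It also ignores the one-superstep delivery delay of a real Pregel run. It only shows how a multi-part message could be sent one double per superstep and reassembled in a multi-value node state.

```java
import java.util.Arrays;

// Plain-Java illustration only; it does not use the GDS Pregel API.
// All names here are hypothetical and chosen for this sketch.
class PhasedMessageSketch {
    // Assumption: one logical message consists of three double-valued parts.
    static final int NUM_PHASES = 3;

    // A phase is the superstep modulo the number of phases.
    static int phaseOf(int superstep) {
        return superstep % NUM_PHASES;
    }

    // Sender side: in each superstep, send only the part for the current phase.
    static double partToSend(double[] payload, int superstep) {
        return payload[phaseOf(superstep)];
    }

    // Receiver side: store the received part in the node-state slot of the current phase.
    static void receive(double[] nodeState, int superstep, double message) {
        nodeState[phaseOf(superstep)] = message;
    }

    // Simulates one sender and one receiver for NUM_PHASES supersteps.
    // Simplification: a message is delivered in the same superstep it is sent.
    static double[] transfer(double[] payload) {
        double[] nodeState = new double[NUM_PHASES];
        for (int superstep = 0; superstep < NUM_PHASES; superstep++) {
            receive(nodeState, superstep, partToSend(payload, superstep));
        }
        return nodeState;
    }

    public static void main(String[] args) {
        double[] received = transfer(new double[] {1.5, 2.5, 3.5});
        System.out.println(Arrays.toString(received));
    }
}
```

The cost is the one mentioned above: a message with k parts needs k supersteps instead of one.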

Another option is to use the double's bits to encode an arbitrary message, e.g. by writing 2 ints into it.
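A rough sketch of that bit-packing idea, using only the standard Double.longBitsToDouble and Double.doubleToRawLongBits methods. The class and method names are hypothetical. One caveat, which is an assumption about how the resulting double gets handled: some int pairs produce a NaN bit pattern, the Java documentation notes that NaN bit patterns are not guaranteed to round-trip on every platform, and the packed value is only meaningful if nothing does arithmetic on it along the way.

```java
// Hypothetical helper for packing two ints into the 64 bits of a double.
class PackedMessage {
    // Put `high` in the upper 32 bits and `low` in the lower 32 bits.
    static double pack(int high, int low) {
        long bits = ((long) high << 32) | (low & 0xFFFFFFFFL);
        return Double.longBitsToDouble(bits);
    }

    // Recover the upper 32 bits. The raw variant avoids collapsing NaN values to one canonical pattern.
    static int unpackHigh(double message) {
        return (int) (Double.doubleToRawLongBits(message) >>> 32);
    }

    // Recover the lower 32 bits.
    static int unpackLow(double message) {
        return (int) Double.doubleToRawLongBits(message);
    }

    public static void main(String[] args) {
        double message = pack(42, 7);
        System.out.println(unpackHigh(message) + ", " + unpackLow(message));
    }
}
```

The packed double must be treated as an opaque value: summing or comparing such messages numerically would not give meaningful results.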

rionda commented 2 years ago

+1 on this feature.

Even just understanding what is expected from the types of the messages (e.g., what interface(s) they must implement) would be good. I assume that in some way this would boil down to the Writable types used in, e.g., Hadoop.

s1ck commented 2 years ago

Maybe, except that we don't necessarily require serializing the messages the same way as Hadoop / Giraph / etc. do it.

s1ck commented 2 years ago

We will consider this feature in one of the upcoming releases. I will close the card now and we'll track this internally. Thanks for the request @adamggibbs and @rionda

rionda commented 2 years ago

Thanks @s1ck for considering it!

laeg commented 1 year ago

@rionda @adamggibbs what types are you after and how are you planning to use them?

rionda commented 1 year ago

@laeg It'll take me a few days to get back to this with a clear answer, but I think it was essentially lists of edges, or something like that.