Closed nihalsid closed 1 year ago
DiGress makes predictions for all pairs of nodes, so it makes sense to use all pairs of nodes to perform the message-passing as well. For small molecules, considering all pairs is very common, and actually yields the best performance.
If you work with larger graphs, you might want to replace the transformer layer to reduce GPU memory usage, but I don't know how that would affect performance.
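For illustration, here is a minimal sketch of attention over all node pairs with invalid (padding) nodes masked out, in the spirit of the layer discussed above. This is not the actual DiGress implementation; the function name and shapes are hypothetical.

```python
import torch
import torch.nn.functional as F

def full_pair_attention(x, node_mask):
    """Hypothetical sketch: self-attention over all node pairs.

    x:         (batch, n_nodes, d) node features
    node_mask: (batch, n_nodes) boolean, True for valid nodes
    """
    d = x.size(-1)
    # Attention scores between every pair of nodes (full connectivity)
    scores = torch.einsum('bid,bjd->bij', x, x) / d ** 0.5
    # Mask out attention *to* invalid (padding) nodes
    scores = scores.masked_fill(~node_mask.unsqueeze(1), float('-inf'))
    attn = F.softmax(scores, dim=-1)
    out = torch.einsum('bij,bjd->bid', attn, x)
    # Zero the outputs *of* invalid nodes as well
    return out * node_mask.unsqueeze(-1)
```

Note the quadratic cost in the number of nodes, which is why full-pair attention is fine for small molecules but becomes the bottleneck on larger graphs.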
Thanks for the quick response!
Hi, thanks for the great work!
Looking at the NodeEdgeBlock layer, it seems that attention is applied assuming full connectivity in the graph (except for invalid nodes, which are masked out by node_mask). Is this the case? If so, is there a reason why you didn't go with a graph attention layer implementation that takes the connectivity into account?
Thanks! Yawar