Describe the bug
Using GraphFormatConvertor to convert a graph from nx to pyg doubles the number of edges in edge_index (each undirected nx edge becomes two directed pyg edges, one per direction). However, not all edge features are doubled to match.
"kind" in particular, which indicates the bond type, is not doubled (there seems to be some filtering code specifically targeting this feature). One other feature I've tested, bond_length, is correctly doubled along with edge_index.
I am wondering if this is intended?
It seems to me that this may cause features to be matched to the wrong edges, since the doubled edges are interleaved in edge_index rather than having the reversed edges appended at the end of the matrix, e.g.:
[[0, 1, 1, 2], [1, 0, 2, 1]]
and not
[[0, 1, 1, 2], [1, 2, 0, 1]]
The kind feature tensor would then point to different edges, since it assumes the same edge order as before conversion. Though there may be some matching methods I'm unaware of that deal with this later?
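To make the concern concrete, here is a minimal pure-Python sketch that makes no Graphein or PyG calls. The toy edges, the label strings and the helper names are all made up for illustration. It assumes the interleaved ordering described above rather than confirming what the converter actually does.

```python
# Hypothetical toy data: two undirected edges with one "kind" label each.
undirected_edges = [(0, 1), (1, 2)]
kind = ["peptide_bond", "hbond"]


def interleaved(edges):
    """Directed edge list where each edge is immediately followed by its reverse."""
    out = []
    for u, v in edges:
        out.extend([(u, v), (v, u)])
    return out


def appended(edges):
    """Directed edge list with all original edges first, then all reversed edges."""
    return list(edges) + [(v, u) for u, v in edges]


def true_label(edge, edges, labels):
    """Label of a directed edge, found via its unordered endpoints."""
    lookup = {frozenset(e): lab for e, lab in zip(edges, labels)}
    return lookup[frozenset(edge)]


def naive_mismatches(directed, edges, labels):
    """Count positions i where labels[i] is not the true label of directed[i].

    This mimics indexing the un-doubled label list with the first
    len(labels) columns of the doubled edge list.
    """
    return sum(
        labels[i] != true_label(directed[i], edges, labels)
        for i in range(len(labels))
    )


print(naive_mismatches(interleaved(undirected_edges), undirected_edges, kind))
print(naive_mismatches(appended(undirected_edges), undirected_edges, kind))
```

With the interleaved layout, the second label lands on the reversed copy of the first edge. With the appended layout, the first half of the edge list still lines up with the original labels.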
To Reproduce
I'm constructing a protein graph using the following ProteinGraphConfig:
And then converting it using
convertor = GraphFormatConvertor(src_format="nx", dst_format="pyg", verbose="all_info", columns=self.columns)
And checking the numbers:
Expected behavior
Either all features are doubled to match edge_index, or neither edge_index nor the other features are doubled and it is up to the user to add directed edges in the other direction (or the converter offers some options to customise this).
Alternatively, leave the doubling behaviour as is, but add the reverse-direction edges at the end of the edge_index tensor so that the 'kind' tensor can simply be applied to both halves.
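As a rough illustration of the first option (doubling every feature to match the edge list), here is a pure-Python sketch of a user-side workaround. `expand_edge_labels` is a hypothetical helper, not part of Graphein, and it assumes the original undirected edges and their labels are still available in their pre-conversion order.

```python
def expand_edge_labels(directed_edges, undirected_edges, labels):
    """Return one label per directed edge, whatever the edge ordering.

    Labels are looked up by the unordered endpoint pair, so (u, v) and
    (v, u) both receive the label of the undirected edge {u, v}.
    """
    lookup = {frozenset(e): lab for e, lab in zip(undirected_edges, labels)}
    return [lookup[frozenset(e)] for e in directed_edges]


# Hypothetical toy data.
undirected = [(0, 1), (1, 2)]
kinds = ["peptide_bond", "hbond"]

# Interleaved layout: each edge is followed by its reverse.
interleaved_edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
print(expand_edge_labels(interleaved_edges, undirected, kinds))

# Appended layout: reversed edges at the end, so repeating the label list is enough.
appended_edges = undirected + [(v, u) for u, v in undirected]
print(expand_edge_labels(appended_edges, undirected, kinds) == kinds * 2)
```

The lookup by endpoint pair works for either ordering. The second check shows why the appended layout would be convenient: the existing label list only needs to be repeated once.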
Desktop (please complete the following information):
OS:
Ubuntu
Python Version
3.11.8
Graphein Version [e.g. 22] & how it was installed
1.7.6 installed using pip install graphein[all]