manonreau closed this 1 year ago
SonarCloud Quality Gate failed.
0 Bugs
0 Vulnerabilities
0 Security Hotspots
1 Code Smell
No Coverage information
0.0% Duplication
Thanks for the PR @manonreau!! I'll check this out tomorrow.
Do you think you'd be able to add an appropriate unit test?
Hi @manonreau, could you provide the code for g = format_node_edge_features(g) so I can write a test & get this merged in? Thanks!!
Changes added in #220
Hi @a-r-j, thank you very much for considering my PRs. I just removed the g = format_node_edge_features(g) call,
since it was just a function to add node-level descriptors. It does not change anything in the structure of the graph object.
You should be able to write a test now.
@manonreau I see. Would you be willing to share it anyway? It could be useful :)
And thanks for the contributions!!
Sure, here it is:
import numpy as np
import torch


def onehot(idx, size):
    """One-hot encoder.

    idx may be a single index or a list of indices (multi-hot).
    """
    encoding = torch.zeros(size)
    # Fill the encoded vector with 1 at the corresponding idx
    encoding[idx] = 1
    return np.array(encoding)


def format_node_edge_features(g):
    """Format the node and edge features computed with Graphein.

    Args:
        g (object): graph

    Returns:
        object: updated graph
    """
    # Lookup tables for the one-hot encodings
    residue_names = {'CYS': 0, 'HIS': 1, 'ASN': 2, 'GLN': 3, 'SER': 4, 'THR': 5, 'TYR': 6, 'TRP': 7,
                     'ALA': 8, 'PHE': 9, 'GLY': 10, 'ILE': 11, 'VAL': 12, 'MET': 13, 'PRO': 14, 'LEU': 15,
                     'GLU': 16, 'ASP': 17, 'LYS': 18, 'ARG': 19}
    edge_type_encoding = {
        'peptide_bond': 0, 'aromatic': 1, 'disulfide': 2, 'ionic': 3,
        'aromatic_sulphur': 4, 'cation_pi': 5, 'distance_threshold': 6, 'hbond': 7}

    # Convert node information
    resname_onehot = []
    for res in g.residue_name:
        # One-hot encoding of the residue name
        resname_onehot.append(onehot(residue_names[res], len(residue_names)))
    g["residue"] = resname_onehot

    # Convert edge information
    edge_onehot = []
    for res in g.kind:
        # Encoding of the edge type(s); an edge can carry several kinds
        edge_onehot.append(onehot([edge_type_encoding[x] for x in res], len(edge_type_encoding)))
    g["edge_attr"] = edge_onehot
    return g
I later noticed that the onehot encoding is already provided by Graphein :)
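For readers without torch installed, the multi-hot behaviour of the edge encoding in the snippet above can be sketched in plain Python. This is only an illustration and is not Graphein's own one-hot utility; the helper name `multi_hot` is hypothetical, and the edge-type table is copied from the snippet.

```python
# Edge-type table copied from the snippet in this thread.
EDGE_TYPE_ENCODING = {
    "peptide_bond": 0, "aromatic": 1, "disulfide": 2, "ionic": 3,
    "aromatic_sulphur": 4, "cation_pi": 5, "distance_threshold": 6, "hbond": 7,
}


def multi_hot(kinds, encoding=EDGE_TYPE_ENCODING):
    """Hypothetical helper: return a 0/1 list with a 1 for every kind present."""
    vector = [0] * len(encoding)
    for kind in kinds:
        vector[encoding[kind]] = 1
    return vector


# An edge that is both a peptide bond and a hydrogen bond sets two positions.
example = multi_hot({"peptide_bond", "hbond"})
print(example)  # [1, 0, 0, 0, 0, 0, 0, 1]
```

Because an edge can carry several kinds at once, the result is a multi-hot vector rather than a strict one-hot vector, which matches what `onehot` does when it receives a list of indices.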
Reference Issues/PRs
Fixes #217
What does this implement/fix? Explain your changes
During the conversion from a NetworkX object to a PyG object, the edge features are now given as a list of lists instead of a list of strings.
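A rough sketch of the idea, assuming each edge stores its types in a `kind` attribute holding a set of strings (as the loop over `g.kind` in the thread suggests). Plain tuples stand in for a NetworkX edge view so the example has no dependencies; the node labels and the function name `collect_edge_kinds` are hypothetical, and this is not the code merged in the PR.

```python
# Stand-in for G.edges(data=True): (u, v, attribute dict) triples.
edges = [
    ("A:CYS:1", "A:HIS:2", {"kind": {"peptide_bond"}}),
    ("A:CYS:1", "A:CYS:7", {"kind": {"disulfide", "distance_threshold"}}),
]


def collect_edge_kinds(edge_triples):
    """Hypothetical helper: return one list of kind strings per edge."""
    # sorted() turns each set into a list with a deterministic order
    return [sorted(attrs["kind"]) for _, _, attrs in edge_triples]


kinds = collect_edge_kinds(edges)
print(kinds)
```

Keeping one inner list per edge preserves every kind an edge carries, so a later step can encode each edge independently.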
What testing did you do to verify the changes in this PR?
Pull Request Checklist
./CHANGELOG.md file (if applicable)
./graphein/tests/* directories (if applicable)
./notebooks/ (if applicable)
python -m py.test tests/ and make sure that all unit tests pass (for small modifications, it might be sufficient to only run the specific test file, e.g., python -m py.test tests/protein/test_graphs.py)
black . and isort .