a-r-j / graphein

Protein Graph Library
https://graphein.ai/
MIT License

Fix bug in `add_k_nn_edges` and add minor extension #229

Closed anton-bushuiev closed 1 year ago

anton-bushuiev commented 1 year ago

Reference Issues/PRs

No related issues/PRs

What does this implement/fix? Explain your changes

This pull request fixes a bug in `add_k_nn_edges`. Currently, the function calls `kneighbors_graph(X=dist_mat, ...)`, which produces wrong results that can nevertheless look plausible. According to the scikit-learn docs, `kneighbors_graph` expects `X` to be sample data (feature vectors), not a distance matrix.
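To illustrate the failure mode, here is a minimal sketch (not the code from this PR) on hand-picked 2D points that are purely illustrative. With the default metric, `kneighbors_graph` treats each row of `X` as a feature vector, so a distance matrix passed as `X` gets compared row against row instead of being read as distances. One route scikit-learn does support for distance matrices is `NearestNeighbors(metric="precomputed")`; whether the PR uses that route is an assumption, not something stated above.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors, kneighbors_graph

# Illustrative coordinates (assumption: any point set with this layout would do).
coords = np.array(
    [[0, 0], [-20, 0], [22, 0], [11, 30], [11, -30], [11, 40], [11, -40]],
    dtype=float,
)

# Pairwise Euclidean distance matrix.
diff = coords[:, None, :] - coords[None, :, :]
dist_mat = np.sqrt((diff ** 2).sum(axis=-1))

# Reference: k-NN computed directly from the coordinates.
graph_coords = kneighbors_graph(coords, n_neighbors=1)

# Buggy pattern: rows of the distance matrix are treated as feature vectors.
graph_naive = kneighbors_graph(dist_mat, n_neighbors=1)

# Supported pattern: declare that the input is a precomputed distance matrix.
nn = NearestNeighbors(n_neighbors=1, metric="precomputed").fit(dist_mat)
graph_precomputed = nn.kneighbors_graph()

# Nearest neighbour of point 0 under each approach.
nn_from_coords = int(graph_coords.toarray()[0].argmax())
nn_naive = int(graph_naive.toarray()[0].argmax())
nn_precomputed = int(graph_precomputed.toarray()[0].argmax())

# Point 1 is the true nearest neighbour of point 0 (distance 20 vs 22 for point 2).
print(nn_from_coords, nn_precomputed, nn_naive)
```

In this layout the naive call picks point 2 as the neighbour of point 0, while the other two approaches agree on point 1. The wrong answer is still a nearby point, which is why results of the buggy call can look correct at a glance.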

Secondly, this PR adds `filter_distmat` to generalise self-loop filtering. It can filter out inter- or intra-chain connections between nodes of one or more chains. It may be useful to incorporate it into other edge-construction functions. Running the following code produces the visualized graphs.

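from functools import partial

from graphein.protein.config import ProteinGraphConfig
from graphein.protein.edges.distance import add_k_nn_edges
from graphein.protein.graphs import construct_graph

# Build one graph with inter-chain edges excluded and one with intra-chain edges excluded.
for val in ['inter', 'intra']:
    edge_funcs = {
        'edge_construction_functions': [
            partial(add_k_nn_edges, k=2, long_interaction_threshold=0, exclude_edges=[val])
        ]
    }
    config = ProteinGraphConfig(**edge_funcs)
    g = construct_graph(config=config, pdb_code='10gs')
...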

[Two images: visualizations of the resulting graphs]
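The filtering idea described above can be sketched with plain NumPy. This is a hypothetical illustration, not the PR's `filter_distmat`: the helper name `mask_chain_pairs`, its signature, and the choice to mark excluded pairs with `inf` are all assumptions made for the example.

```python
import numpy as np


def mask_chain_pairs(dist_mat, chain_ids, exclude):
    """Hypothetical helper: set excluded node pairs to inf in a copy of dist_mat.

    exclude="intra" drops pairs within the same chain,
    exclude="inter" drops pairs spanning different chains.
    Self-pairs (the diagonal) are always dropped.
    """
    chain_ids = np.asarray(chain_ids)
    same_chain = chain_ids[:, None] == chain_ids[None, :]
    masked = np.array(dist_mat, dtype=float, copy=True)
    if exclude == "intra":
        masked[same_chain] = np.inf
    elif exclude == "inter":
        masked[~same_chain] = np.inf
    else:
        raise ValueError("exclude must be 'inter' or 'intra'")
    np.fill_diagonal(masked, np.inf)  # never connect a node to itself
    return masked


# Four nodes: two in chain A, two in chain B (illustrative distances).
dist = np.array(
    [[0, 1, 5, 6],
     [1, 0, 4, 5],
     [5, 4, 0, 2],
     [6, 5, 2, 0]],
    dtype=float,
)
chains = ["A", "A", "B", "B"]

no_intra = mask_chain_pairs(dist, chains, exclude="intra")  # only A-B pairs stay finite
no_inter = mask_chain_pairs(dist, chains, exclude="inter")  # only A-A and B-B pairs stay finite
```

A masked matrix like this could then be handed to any neighbour search that accepts precomputed distances, since pairs at infinite distance are never selected while finite candidates remain.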

What testing did you do to verify the changes in this PR?

Pull Request Checklist

a-r-j commented 1 year ago

Thanks for this @anton-bushuiev. Good spot!

LGTM! Would you be able to add a couple tests?

anton-bushuiev commented 1 year ago

Yeah, I'll add them tomorrow.

anton-bushuiev commented 1 year ago

Hi @a-r-j! I've added the tests. I also had to fix a new bug introduced by the distance matrix filtering (see this) 😄.

sonarcloud[bot] commented 1 year ago

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
0 Code Smells (rating A)

No Coverage information
No Duplication information

codecov-commenter commented 1 year ago

Codecov Report

Base: 40.27% // Head: 47.86% // Increases project coverage by +7.59% :tada:

Coverage data is based on head (fd1b36b) compared to base (8123f42). Patch coverage: 52.20% of modified lines in pull request are covered.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #229      +/-   ##
==========================================
+ Coverage   40.27%   47.86%   +7.59%
==========================================
  Files          48       85      +37
  Lines        2811     5398    +2587
==========================================
+ Hits         1132     2584    +1452
- Misses       1679     2814    +1135
```

| [Impacted Files](https://codecov.io/gh/a-r-j/graphein/pull/229?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Arian+Jamasb) | Coverage Δ | |
|---|---|---|
| [graphein/ml/diffusion.py](https://codecov.io/gh/a-r-j/graphein/pull/229/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Arian+Jamasb#diff-Z3JhcGhlaW4vbWwvZGlmZnVzaW9uLnB5) | `0.00% <0.00%> (ø)` | |
| [graphein/ppi/graph\_metadata.py](https://codecov.io/gh/a-r-j/graphein/pull/229/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Arian+Jamasb#diff-Z3JhcGhlaW4vcHBpL2dyYXBoX21ldGFkYXRhLnB5) | `0.00% <0.00%> (ø)` | |
| [graphein/ppi/visualisation.py](https://codecov.io/gh/a-r-j/graphein/pull/229/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Arian+Jamasb#diff-Z3JhcGhlaW4vcHBpL3Zpc3VhbGlzYXRpb24ucHk=) | `0.00% <0.00%> (ø)` | |
| [graphein/protein/analysis.py](https://codecov.io/gh/a-r-j/graphein/pull/229/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Arian+Jamasb#diff-Z3JhcGhlaW4vcHJvdGVpbi9hbmFseXNpcy5weQ==) | `0.00% <0.00%> (ø)` | |
| [graphein/protein/features/sequence/utils.py](https://codecov.io/gh/a-r-j/graphein/pull/229/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Arian+Jamasb#diff-Z3JhcGhlaW4vcHJvdGVpbi9mZWF0dXJlcy9zZXF1ZW5jZS91dGlscy5weQ==) | `28.00% <0.00%> (+3.00%)` | :arrow_up: |
| [graphein/protein/features/utils.py](https://codecov.io/gh/a-r-j/graphein/pull/229/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Arian+Jamasb#diff-Z3JhcGhlaW4vcHJvdGVpbi9mZWF0dXJlcy91dGlscy5weQ==) | `27.77% <0.00%> (ø)` | |
| [graphein/rna/utils.py](https://codecov.io/gh/a-r-j/graphein/pull/229/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Arian+Jamasb#diff-Z3JhcGhlaW4vcm5hL3V0aWxzLnB5) | `38.46% <ø> (ø)` | |
| [graphein/rna/visualisation.py](https://codecov.io/gh/a-r-j/graphein/pull/229/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Arian+Jamasb#diff-Z3JhcGhlaW4vcm5hL3Zpc3VhbGlzYXRpb24ucHk=) | `28.57% <ø> (+28.57%)` | :arrow_up: |
| [graphein/utils/config.py](https://codecov.io/gh/a-r-j/graphein/pull/229/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Arian+Jamasb#diff-Z3JhcGhlaW4vdXRpbHMvY29uZmlnLnB5) | `100.00% <ø> (+100.00%)` | :arrow_up: |
| [graphein/utils/config\_parser.py](https://codecov.io/gh/a-r-j/graphein/pull/229/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Arian+Jamasb#diff-Z3JhcGhlaW4vdXRpbHMvY29uZmlnX3BhcnNlci5weQ==) | `100.00% <ø> (ø)` | |
| ... and [81 more](https://codecov.io/gh/a-r-j/graphein/pull/229/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Arian+Jamasb) | | |


a-r-j commented 1 year ago

Thanks for the contribution @anton-bushuiev!