a-r-j / graphein

Protein Graph Library
https://graphein.ai/
MIT License
998 stars 125 forks

`PDBManager` - Bug fixes, adding necessary changes to export only first PDB model, and merging-in latest updates from `master` #309

Closed: amorehead closed this 1 year ago

amorehead commented 1 year ago

Reference Issues/PRs

What does this implement/fix? Explain your changes

What testing did you do to verify the changes in this PR?

Pull Request Checklist

amorehead commented 1 year ago

@a-r-j, I noticed that our new default function for exporting collated groups of PDB chains does not yet restrict the export to the first model of each PDB complex file. This PR adds the changes needed to do so. It also adds a workaround that is required when exporting (i.e., calling `to_pdb()` on) `PandasPdb` objects that have had a `model_id` column added to them. In the long term, I think it would be good to have a fix merged into the `master` branch of BioPandas that types the `model_id` column as a str -> object column, but the workaround proposed here should work for now.
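To illustrate what "export only the first model" means at the file level, here is a minimal sketch in plain Python. It is not the code in this PR and does not use BioPandas or graphein; the function name `first_model_lines` is hypothetical. It assumes the standard PDB layout, where each model's coordinates sit between a `MODEL` record and an `ENDMDL` record.

```python
from typing import List


def first_model_lines(pdb_text: str) -> List[str]:
    """Keep only the first MODEL block of a PDB file (hypothetical helper).

    Records outside any MODEL block (e.g. HEADER, END) are always kept,
    so single-model files without MODEL records pass through unchanged.
    """
    kept: List[str] = []
    models_seen = 0
    in_model = False
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # PDB record name occupies columns 1-6
        if record == "MODEL":
            models_seen += 1
            in_model = True
            if models_seen == 1:
                kept.append(line)
        elif record == "ENDMDL":
            if models_seen == 1:
                kept.append(line)
            in_model = False
        elif not in_model or models_seen == 1:
            kept.append(line)
    return kept


# Small two-model example; the coordinates are made up for illustration.
sample = "\n".join(
    [
        "HEADER    TEST",
        "MODEL        1",
        "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N",
        "ENDMDL",
        "MODEL        2",
        "ATOM      1  N   ALA A   1      12.000   7.000  -7.000  1.00  0.00           N",
        "ENDMDL",
        "END",
    ]
)
print("\n".join(first_model_lines(sample)))
```

Filtering at the text level sidesteps any column-typing questions; the actual change in this PR operates on `PandasPdb` data frames instead.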

amorehead commented 1 year ago

@a-r-j, note that this PR also includes a fix for a dataset-splitting edge case that arises when you select only one splitting strategy (e.g., time-based splitting) and then try to download and export your selected PDBs. Currently, if you try to export such PDBs, the download and export functions throw an error saying that the selected PDBs are not correctly labeled with the split to which they are assigned. The fix is simple: track split names per PDB entry more thoroughly by assigning them within the individual splitting functions (e.g., `split_df_into_time_frames`). If you are happy with these changes, please feel free to merge them into `master` after testing them on your end.
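The idea of labeling entries inside the splitting function itself can be sketched as follows. This is an illustrative stand-in, not graphein's implementation: plain dictionaries take the place of DataFrame rows, and `split_by_time`, `export_paths`, the `deposition_date` key, and the cutoff values are all hypothetical.

```python
from datetime import date
from typing import Dict, List


def split_by_time(entries: List[dict], cutoffs: Dict[str, date]) -> List[dict]:
    """Tag each entry with the earliest split whose cutoff it does not exceed.

    The split name is written onto the entry here, inside the splitting
    function, so later steps never see an unlabeled entry. Entries dated
    after the last cutoff are dropped.
    """
    ordered = sorted(cutoffs.items(), key=lambda item: item[1])
    labeled = []
    for entry in entries:
        for name, cutoff in ordered:
            if entry["deposition_date"] <= cutoff:
                labeled.append({**entry, "split": name})
                break
    return labeled


def export_paths(entries: List[dict]) -> List[str]:
    """Build per-split output paths, failing loudly on unlabeled entries."""
    missing = [e["id"] for e in entries if "split" not in e]
    if missing:
        raise ValueError(f"Entries without a split label: {missing}")
    return [f"{e['split']}/{e['id']}.pdb" for e in entries]


# Made-up entries and cutoffs for illustration only.
entries = [
    {"id": "1abc", "deposition_date": date(2019, 6, 1)},
    {"id": "2xyz", "deposition_date": date(2020, 6, 1)},
    {"id": "3def", "deposition_date": date(2021, 6, 1)},
    {"id": "4ghi", "deposition_date": date(2023, 6, 1)},
]
cutoffs = {
    "train": date(2020, 1, 1),
    "val": date(2021, 1, 1),
    "test": date(2022, 1, 1),
}
labeled = split_by_time(entries, cutoffs)
print(export_paths(labeled))
```

Because the label is attached where the split is computed, using a single splitting strategy still yields entries that the export step can place without extra bookkeeping.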

codecov-commenter commented 1 year ago

Codecov Report

No coverage uploaded for pull request base (`pdb_manager@1d2bb0b`). Patch has no changes to coverable lines.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             pdb_manager    #309   +/-   ##
==============================================
  Coverage               ?   43.98%
==============================================
  Files                  ?      113
  Lines                  ?     7773
  Branches               ?        0
==============================================
  Hits                   ?     3419
  Misses                 ?     4354
  Partials               ?        0
```

sonarcloud[bot] commented 1 year ago

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 0 (rating A)

No Coverage information
No Duplication information