datamol-io / datamol

Molecular Processing Made Easy.
https://docs.datamol.io
Apache License 2.0

New align + new descriptors + some cleaning #93

Closed. hadim closed this 2 years ago

hadim commented 2 years ago

I still need to add the new align code.

codecov[bot] commented 2 years ago

Codecov Report

Merging #93 (88fd48d) into main (85e36ba) will increase coverage by 1.21%. The diff coverage is 98.46%.

@@            Coverage Diff             @@
##             main      #93      +/-   ##
==========================================
+ Coverage   82.49%   83.71%   +1.21%     
==========================================
  Files          46       48       +2     
  Lines        3176     3420     +244     
==========================================
+ Hits         2620     2863     +243     
- Misses        556      557       +1     
Flag        Coverage Δ
unittests   83.71% <98.46%> (+1.21%) :arrow_up:

Impacted Files                        Coverage Δ
datamol/fp.py                         84.50% <ø> (ø)
datamol/utils/jobs.py                 100.00% <ø> (ø)
datamol/viz/utils.py                  80.00% <ø> (-6.67%) :arrow_down:
datamol/io.py                         88.88% <75.00%> (+0.13%) :arrow_up:
datamol/data.py                       92.59% <90.90%> (-7.41%) :arrow_down:
datamol/descriptors/descriptors.py    96.77% <91.30%> (-3.23%) :arrow_down:
datamol/convert.py                    92.65% <94.11%> (+0.08%) :arrow_up:
datamol/__init__.py                   100.00% <100.00%> (ø)
datamol/_version.py                   100.00% <100.00%> (ø)
datamol/align.py                      100.00% <100.00%> (ø)
... and 6 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 37a26e0...88fd48d.

hadim commented 2 years ago

It looks like a long PR, but most of it is adding Optional[] annotations and cleaning up types. The genuinely new parts are the descriptors and the new align module, plus their tests.
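
For readers unfamiliar with this kind of typing cleanup, here is a minimal sketch of what adding Optional[] usually looks like. The function `take_first` and its parameters are hypothetical and are not taken from this PR; the point is only that an argument defaulting to None gets annotated as `Optional[...]` instead of the bare type.

```python
from typing import List, Optional


# Hypothetical helper, for illustration only (not part of datamol).
# Before the cleanup the signature might have read `n: int = None`,
# which type checkers reject because None is not an int.
def take_first(items: List[str], n: Optional[int] = None) -> List[str]:
    """Return the first n items, or a copy of all items when n is None."""
    if n is None:
        return list(items)
    return items[:n]
```

Calling `take_first(["a", "b", "c"])` returns every item, while `take_first(["a", "b", "c"], 2)` returns only the first two. Runtime behavior is unchanged by the annotation; it only makes the "None means no limit" contract explicit to type checkers and readers.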

hadim commented 2 years ago

@MichelML adding you as a reviewer, FYI. This PR contains code for aligning molecules. The API is essentially the same as Manu's gist. Feel free to use it on the kernel side.

hadim commented 2 years ago

In short you can simply do:

import datamol as dm

# get a list of molecules
mols = dm.cdk2(as_df=False)

# align them if needed
if dm.align.should_align(mols):
    aligned_mols = dm.align.auto_align_many(mols)

MichelML commented 2 years ago

Very nice. Once this is released, I'll replace my copy-paste of Manu's code too. I'm currently working on alignment on upload.

(I like the API)

hadim commented 2 years ago

Thanks!