AstraZeneca / chemicalx

A PyTorch and TorchDrug based deep learning library for drug pair scoring. (KDD 2022)
https://chemicalx.readthedocs.io
Apache License 2.0

Add unified training and evaluation pipeline #32

Closed: cthoyt closed this 2 years ago

cthoyt commented 2 years ago

Summary

This pull request combines the examples from DeepSynergy and EPGCNDS into a unified training and evaluation pipeline.

The differences between these pipelines (besides superficial hyperparameters of the model, optimizer, and batching):

The new chemicalx.pipeline function handles all of the shared code and adds a flexible way to specify the differences.
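
A minimal sketch of the idea, written in plain Python so it runs without PyTorch. Everything in it is hypothetical: the `pipeline` signature, the `ToyModel` class, and the metric name are illustrative assumptions, not the actual `chemicalx.pipeline` API. The point is only that the shared train/evaluate loop lives in one function while the per-example differences (model, batch size, learning rate, metrics) arrive as arguments.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Example = Tuple[float, float]  # (feature, label); a stand-in for a real batch item


@dataclass
class ToyModel:
    """Hypothetical one-parameter model: prediction = weight * feature."""

    weight: float = 0.0

    def predict(self, x: float) -> float:
        return self.weight * x

    def step(self, batch: Sequence[Example], lr: float) -> None:
        # One gradient-descent step on the mean squared error of this batch.
        grad = sum(2 * (self.predict(x) - y) * x for x, y in batch) / len(batch)
        self.weight -= lr * grad


def mean_squared_error(y_true: List[float], y_pred: List[float]) -> float:
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pipeline(
    model: ToyModel,
    train: Sequence[Example],
    test: Sequence[Example],
    *,
    epochs: int,
    batch_size: int,
    lr: float,
    metrics: Dict[str, Callable[[List[float], List[float]], float]],
) -> Dict[str, float]:
    """Shared loop: train in mini-batches, then score every requested metric."""
    for _ in range(epochs):
        for start in range(0, len(train), batch_size):
            model.step(train[start:start + batch_size], lr)
    y_true = [y for _, y in test]
    y_pred = [model.predict(x) for x, _ in test]
    return {name: fn(y_true, y_pred) for name, fn in metrics.items()}


# Usage: the data follow y = 3x, so the learned weight should approach 3.
data = [(i / 10, 3.0 * i / 10) for i in range(1, 11)]
model = ToyModel()
results = pipeline(
    model, data[:8], data[8:],
    epochs=200, batch_size=2, lr=0.5,
    metrics={"mse": mean_squared_error},
)
```

Two examples that differ only in model and hyperparameters would then differ only in the arguments they pass, which is the kind of consolidation the summary describes.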

Changes

Potential future changes:

cthoyt commented 2 years ago

@benedekrozemberczki would you please test that the two examples still work as expected?

codecov-commenter commented 2 years ago

Codecov Report

Merging #32 (ae2bede) into main (b4e7c52) will decrease coverage by 0.82%. The diff coverage is 93.06%.

@@            Coverage Diff             @@
##             main      #32      +/-   ##
==========================================
- Coverage   99.48%   98.65%   -0.83%     
==========================================
  Files          27       28       +1     
  Lines         578      671      +93     
==========================================
+ Hits          575      662      +87     
- Misses          3        9       +6     
| Impacted Files | Coverage Δ | |
|---|---|---|
| chemicalx/pipeline.py | 89.70% <89.70%> (ø) | |
| chemicalx/\_\_init\_\_.py | 100.00% <100.00%> (ø) | |
| chemicalx/data/\_\_init\_\_.py | 100.00% <100.00%> (ø) | |
| chemicalx/data/datasetloader.py | 98.03% <100.00%> (+1.96%) | :arrow_up: |
| chemicalx/models/\_\_init\_\_.py | 100.00% <100.00%> (ø) | |
| chemicalx/models/base.py | 100.00% <100.00%> (ø) | |
| chemicalx/models/deepsynergy.py | 100.00% <100.00%> (ø) | |
| chemicalx/models/epgcnds.py | 100.00% <100.00%> (ø) | |
| tests/unit/test_models.py | 100.00% <100.00%> (ø) | |
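
The headline figure in this report (0.82%) and the coverage diff (-0.83%) differ by a hundredth of a percent. One reading that reproduces both numbers from the reported hits and lines is sketched below. It is an assumption, not a description of Codecov's internals: the per-commit percentages are treated as truncated to two decimals, while the headline is computed from the unrounded ratios. The helper name `coverage_basis_points` is made up for this example.

```python
def coverage_basis_points(hits: int, lines: int) -> int:
    """Coverage in hundredths of a percent, truncated via integer division."""
    return hits * 10_000 // lines


base = coverage_basis_points(575, 578)  # main: hits / lines from the diff
head = coverage_basis_points(662, 671)  # #32: hits / lines from the diff

# Difference of the truncated percentages, as shown in the diff table.
table_delta = (base - head) / 100

# Difference of the unrounded ratios, as quoted in the headline sentence.
exact_delta = 100 * (575 / 578 - 662 / 671)

print(f"main={base / 100:.2f}% head={head / 100:.2f}% "
      f"table_delta={table_delta:.2f}% exact_delta={exact_delta:.2f}%")
```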

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Powered by Codecov. Last update b4e7c52...ae2bede.

cthoyt commented 2 years ago

~I need to better understand how the different settings in the batch generator are chosen. It might be better to unify the classes so that each takes a full DrugPairBatch instance instead of unpacking it differently in each class; the current approach might become unsustainable in the long term.~

This was addressed in 2dbc873
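
The design question raised in the struck-through note can be sketched as follows. This is a hypothetical, dependency-free illustration: the `DrugPairBatch` fields, the `unpack` method, and both model classes are assumptions made for the example. They are not the library's actual definitions, and not necessarily what 2dbc873 implemented. Each model receives the whole batch and declares which parts it needs, so the calling loop never has to know.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DrugPairBatch:
    """Hypothetical batch container; the field names are illustrative only."""

    context_features: List[float]
    drug_features_left: List[float]
    drug_features_right: List[float]
    labels: List[float]


class Model:
    """Base class: subclasses say what they need from a batch via `unpack`."""

    def unpack(self, batch: DrugPairBatch) -> Tuple[List[float], ...]:
        raise NotImplementedError

    def forward(self, *args: List[float]) -> List[float]:
        raise NotImplementedError

    def __call__(self, batch: DrugPairBatch) -> List[float]:
        # The caller always passes the full batch; unpacking stays in the model.
        return self.forward(*self.unpack(batch))


class ContextAwareModel(Model):
    """Toy model that uses context plus both drugs."""

    def unpack(self, batch):
        return (batch.context_features,
                batch.drug_features_left,
                batch.drug_features_right)

    def forward(self, context, left, right):
        return [c + l + r for c, l, r in zip(context, left, right)]


class PairOnlyModel(Model):
    """Toy model that ignores context and uses only the two drugs."""

    def unpack(self, batch):
        return (batch.drug_features_left, batch.drug_features_right)

    def forward(self, left, right):
        return [l * r for l, r in zip(left, right)]


batch = DrugPairBatch([1.0, 2.0], [0.5, 0.5], [2.0, 4.0], [0.0, 1.0])
predictions = {
    type(m).__name__: m(batch)
    for m in (ContextAwareModel(), PairOnlyModel())
}
```

With this shape, adding a model that needs a different slice of the batch means writing a new `unpack`, not adding another branch to the training loop.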

benedekrozemberczki commented 2 years ago

Thank you @cthoyt!