openforcefield / openff-qcsubmit

Automated tools for submitting molecules to QCFractal
https://openff-qcsubmit.readthedocs.io/en/latest/index.html
MIT License

Add initial result filter API #109

Closed SimonBoothroyd closed 3 years ago

SimonBoothroyd commented 3 years ago

Description

This PR exposes the initial API for filtering the results collections introduced in #106. It allows collections to be filtered directly:

filtered_collection = basic_result_collection.filter(
    # Exclude specific SMILES from the set, e.g. a training set.
    SMILESFilter(smiles_to_exclude=[...]),
    # Retain only records computed for molecules which match a set of
    # SMARTS, e.g. a set of SMARTS associated with parameters that will be trained.
    SMARTSFilter(smarts_to_include=[...]),
)

where each filter is applied sequentially, in the order given. Alternatively, individual filters can be applied directly:

# Named `smiles_filter` to avoid shadowing the Python built-in `filter`.
smiles_filter = SMILESFilter(smiles_to_include=[...])
filtered_collection = smiles_filter.apply(basic_result_collection)

Provenance about the applied filters will be stored in a new result collection provenance field:

>>> filtered_collection.provenance["applied-filters"]

{
    'SMILESFilter-0': {'smiles_to_exclude': [...]},
    'SMARTSFilter-1': {'smarts_to_include': [...]},
}
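
To make the sequential application and the provenance bookkeeping concrete, here is a minimal, self-contained sketch of how the pattern could work. It is not the code in this PR: `ToyCollection` and `ToySMILESFilter` are hypothetical stand-ins, the collection holds nothing but SMILES strings, and matching is by exact string comparison, which is an assumption made only to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ToyCollection:
    """Hypothetical stand-in for a result collection: SMILES plus provenance."""

    smiles: Tuple[str, ...]
    provenance: Dict[str, dict] = field(default_factory=dict)

    def filter(self, *filters: "ToySMILESFilter") -> "ToyCollection":
        # Apply each filter in the order given; each step returns a new collection.
        collection = self
        for current_filter in filters:
            collection = current_filter.apply(collection)
        return collection


@dataclass
class ToySMILESFilter:
    """Hypothetical filter that drops entries whose SMILES string is excluded."""

    smiles_to_exclude: Tuple[str, ...]

    def apply(self, collection: ToyCollection) -> ToyCollection:
        # Assumption: exact string comparison, no canonicalisation of SMILES.
        kept = tuple(s for s in collection.smiles if s not in self.smiles_to_exclude)

        # Copy the existing record so the input collection is left untouched,
        # then append an entry keyed by filter name and application index.
        applied = dict(collection.provenance.get("applied-filters", {}))
        applied[f"{type(self).__name__}-{len(applied)}"] = {
            "smiles_to_exclude": list(self.smiles_to_exclude)
        }
        return ToyCollection(kept, {**collection.provenance, "applied-filters": applied})


if __name__ == "__main__":
    original = ToyCollection(smiles=("CO", "CCO", "CCC"))
    filtered = original.filter(ToySMILESFilter(("CO",)), ToySMILESFilter(("CCC",)))
    print(filtered.smiles)
    print(filtered.provenance["applied-filters"])
```

Returning a new collection from each `apply` call, rather than mutating in place, keeps the unfiltered collection available for comparison and makes the provenance record an ordered history of what was applied.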

Status

codecov[bot] commented 3 years ago

Codecov Report

Merging #109 (c884d67) into master (a873491) will decrease coverage by 0.05%. The diff coverage is 89.65%.