openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io
http://openforcefield.org
MIT License

Allow Molecule.from_file to read only a subset of molecules/conformers in a file. #282

Open andrrizzi opened 5 years ago

andrrizzi commented 5 years ago

For files containing many molecules (e.g. data/molecules/MiniDrugBank.sdf), this should help cut the time needed to run the toolkit.

jchodera commented 5 years ago

What would the API look like for extending Molecule.from_file()?

Perhaps

# Read only the first molecule
one_molecule = Molecule.from_file('data/molecules/MiniDrugBank.sdf', molecule_indices=0)
# Read specific molecules
molecules_from_indices = Molecule.from_file('data/molecules/MiniDrugBank.sdf', molecule_indices=[0,3,17])
# Read a range of molecules
molecules_from_range = Molecule.from_file('data/molecules/MiniDrugBank.sdf', molecule_indices=range(10,20))
# Read all molecules
all_molecules = Molecule.from_file('data/molecules/MiniDrugBank.sdf')

andrrizzi commented 5 years ago

That's exactly what I had in mind.
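
To make the proposed semantics concrete, here is a minimal sketch of index-based selection over an SDF file. It is not toolkit code: `iter_sdf_records` and `read_sdf_subset` are hypothetical helpers, and `molecule_indices` is the argument proposed above, not an existing parameter of Molecule.from_file. The sketch assumes records end with a `$$$$` line, always returns a list in file order (even for a single integer index), does not handle negative indices, and stops reading once the highest requested index has been seen, which is where the time saving would come from.

```python
import io
from typing import Iterable, Iterator, List, Optional, TextIO, Union


def iter_sdf_records(handle: TextIO) -> Iterator[str]:
    """Yield one SDF record at a time, each ending with its '$$$$' line."""
    lines: List[str] = []
    for line in handle:
        lines.append(line)
        if line.strip() == "$$$$":
            yield "".join(lines)
            lines = []
    # Trailing text without a closing '$$$$' is ignored in this sketch.


def read_sdf_subset(
    handle: TextIO,
    molecule_indices: Optional[Union[int, Iterable[int]]] = None,
) -> List[str]:
    """Return the raw text of the selected records (hypothetical helper)."""
    if molecule_indices is None:
        # No selection given: read everything.
        return list(iter_sdf_records(handle))

    if isinstance(molecule_indices, int):
        wanted = {molecule_indices}
    else:
        wanted = set(molecule_indices)
    if not wanted:
        return []

    last = max(wanted)
    selected: List[str] = []
    for index, record in enumerate(iter_sdf_records(handle)):
        if index in wanted:
            selected.append(record)
        if index >= last:
            # Nothing further is needed, so stop before parsing the rest.
            break
    return selected


# Usage with a small in-memory stand-in for a multi-molecule file.
demo = "".join(f"mol{i}\n  placeholder\n\nM  END\n$$$$\n" for i in range(5))
subset = read_sdf_subset(io.StringIO(demo), molecule_indices=[0, 3])
names = [record.splitlines()[0] for record in subset]
```

Each returned record could then be handed to whatever parser builds the molecule object, so only the requested entries pay the parsing cost.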

mattwthompson commented 1 year ago

Related https://github.com/openforcefield/openff-toolkit/issues/282