MolSSI / QCFractal

A distributed compute and database platform for quantum chemistry.
https://molssi.github.io/QCFractal/
BSD 3-Clause "New" or "Revised" License

Improve compile_values and related functions #765

Closed (janash closed this 9 months ago)

janash commented 9 months ago

This PR revises the compile_values method and adds a new dataset method, get_properties_df.

The following changes are made to compile_values:

Example use:

import qcportal as ptl

client = ptl.PortalClient("https://qcademo.molssi.org")
dataset = client.get_dataset_by_id(1)

# Case 1 - Single value returned from callable.
df1 = dataset.compile_values(lambda x: x.properties["scf_total_energy"], "scf_total_energy")

# Case 2 - Tuple of two values returned from callable, with unpack=True.
df2 = dataset.compile_values(lambda x: (x.properties["scf_total_energy"], x.properties["scf_iterations"]), unpack=True)
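
To make the difference between the two cases concrete without a server connection, here is a minimal pure-Python sketch. It is not QCPortal's implementation: FakeRecord, compile_values_sketch, the specification names, and all numbers are hypothetical, and a nested dictionary keyed by (specification, value name) stands in for the multi-level dataframe. The sketch also assumes one name is supplied per unpacked value, which may not match the real method.

```python
class FakeRecord:
    """Hypothetical stand-in for a QCPortal record; only `properties` is modelled."""

    def __init__(self, properties):
        self.properties = properties


def compile_values_sketch(records, value_call, value_names="value", unpack=False):
    """Build {entry: {(specification, value_name): value}} from record triples.

    `records` is an iterable of (entry_name, specification_name, record).
    Simplified model for illustration, not the library's code.
    """
    if isinstance(value_names, str):
        value_names = [value_names]

    table = {}
    for entry, spec, record in records:
        result = value_call(record)
        # Without unpack the whole return value is one cell; with unpack each
        # element of the returned tuple gets its own named column.
        values = tuple(result) if unpack else (result,)
        if len(values) != len(value_names):
            raise ValueError("need exactly one name per value")
        row = table.setdefault(entry, {})
        for name, value in zip(value_names, values):
            row[(spec, name)] = value
    return table


# Made-up numbers, for illustration only.
records = [
    ("water", "spec_1", FakeRecord({"scf_total_energy": -76.0, "scf_iterations": 9})),
    ("water", "spec_2", FakeRecord({"scf_total_energy": -76.3, "scf_iterations": 11})),
]

# Case 1: one value per record, so one column per specification.
single = compile_values_sketch(
    records, lambda r: r.properties["scf_total_energy"], "scf_total_energy"
)

# Case 2: a tuple per record, unpacked into two columns per specification.
pair = compile_values_sketch(
    records,
    lambda r: (r.properties["scf_total_energy"], r.properties["scf_iterations"]),
    value_names=["scf_total_energy", "scf_iterations"],
    unpack=True,
)
```

In this model, Case 1 yields one (specification, "scf_total_energy") column per specification, while Case 2 yields two columns per specification, one for each element of the returned tuple.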

The get_properties_df method is added to make it easier to compile record properties into a dataframe. It takes a list of property names and returns a dataframe with a multi-level index, similar to the one returned by compile_values. Under the hood, it calls compile_values and reads each property with .get, so a property that does not exist for a particular specification yields a missing value instead of raising an error. Before the dataframe is returned to the user, any columns that contain only NaN are dropped.

Example use (continuing from the example above):

properties = dataset.get_properties_df(["scf_total_energy", "scf dipole", "mp2 dipole"])
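
The two behaviours described for get_properties_df (a .get lookup for missing properties, then dropping columns that are entirely empty) can be sketched in plain Python. This is only an illustration under assumptions: properties_table_sketch, the entry and specification names, and the numbers are hypothetical, None stands in for NaN, and a nested dictionary stands in for the dataframe.

```python
def properties_table_sketch(records, property_names):
    """Build {entry: {(specification, property): value}}, dropping empty columns.

    `records` is an iterable of (entry_name, specification_name, properties_dict).
    Simplified model for illustration, not the library's code.
    """
    table = {}
    for entry, spec, properties in records:
        row = table.setdefault(entry, {})
        for name in property_names:
            # dict.get returns None instead of raising KeyError when absent.
            row[(spec, name)] = properties.get(name)

    # Keep only columns that hold at least one non-missing value in some row.
    keep = {
        column
        for row in table.values()
        for column, value in row.items()
        if value is not None
    }
    return {
        entry: {column: value for column, value in row.items() if column in keep}
        for entry, row in table.items()
    }


# Made-up data: no record has "mp2 dipole"; only water has "scf dipole".
records = [
    ("water", "spec_1", {"scf_total_energy": -76.0, "scf dipole": [0.0, 0.0, 0.8]}),
    ("ammonia", "spec_1", {"scf_total_energy": -56.2}),
]

result = properties_table_sketch(
    records, ["scf_total_energy", "scf dipole", "mp2 dipole"]
)
```

Here the "mp2 dipole" column disappears because no record provides it, while "scf dipole" is kept with a missing value for ammonia, since a column is dropped only when every entry is missing.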

codecov[bot] commented 9 months ago

Codecov Report

Merging #765 (0325110) into main (23c4103) will decrease coverage by 0.08%. Report is 22 commits behind head on main. The diff coverage is 5.00%.

Note: current head 0325110 differs from the pull request's most recent head, fee87f3. Consider uploading reports for commit fee87f3 to get more accurate results.

bennybp commented 9 months ago

Looks good! Thanks!