Closed: zulissimeta closed this pull request 4 days ago
Can one of the admins verify this patch?
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 99.07%. Comparing base (6a5d2c8) to head (31e2a26).
:exclamation: Current head 31e2a26 differs from pull request most recent head 4cbae69. Please upload reports for the commit 4cbae69 to get more accurate results.
:umbrella: View full report in Codecov by Sentry.
Thanks, @zulissimeta! This is an interesting PR.
If I understand correctly, it seems like this is logic that would need to be put within the `@flow`s themselves, right? So, the ideal use case here is someone importing a pre-made `@job` (e.g. `from quacc.recipes.mlp.core import relax_job`) and making a custom `@flow` for themselves from that, right? It would perhaps be a bit difficult to justify when/where to add this logic within pre-made `@flow`s in quacc itself (i.e. in `quacc.recipes`).

Some (non-substantial) comments below while I await your reply.
Yes, exactly! If you want a flow that runs many jobs in parallel (for example, a bulk_to_adsorbates_flow that generates hundreds of possible adsorbate configurations and uses ML potentials to make the relaxations fast), this would be helpful.

Partitioning and batching would also help when doing lots of inference: you could partition a list and apply a function that takes batches and runs ML inference quickly, rather than running each MLP relaxation as a separate job.
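The batching idea above can be sketched in plain Python. This is a hypothetical illustration of the pattern only, not the quacc API; `batched` and `fake_mlp_energy` are made-up names standing in for a batching helper and a batched ML-potential call:

```python
def batched(items, batch_size):
    """Group a flat list into consecutive batches of at most `batch_size`."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

def fake_mlp_energy(structures):
    """Stand-in for a batched ML-potential call; returns one value per structure."""
    return [len(s) * -1.0 for s in structures]

inputs = ["H2O", "CO2", "CH4", "NH3", "O2"]
# 5 structures in batches of 2 -> 3 inference calls instead of 5
energies = [e for batch in batched(inputs, 2) for e in fake_mlp_energy(batch)]
```

The per-item results come back in the original order, so downstream steps see the same output they would have seen from one job per structure.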
Got it, thanks! Since it is pretty independent from existing recipe logic, I don't have much reservation about this. We will just want to add a brief section to the documentation somewhere highlighting how it can be used since it's somewhat of an "advanced" (but useful!) feature. I am happy to take care of the docs though.
Thanks! Setting this to auto-merge now.
Summary of Changes
This draft PR adds job patterns that are common in high-throughput workflows. When running many jobs in the same flow (say, `result = map(my_job, range(1000))`), many workflow managers struggle (network/database/load issues) with that many jobs. This PR is inspired by the dask bag partitions and map operator.

For example:
should yield:
but run in only two jobs (instead of 5).
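The partition-and-map pattern can be sketched in plain Python (a hypothetical illustration, not the quacc API; `partition` and `map_partitioned` are made-up names):

```python
def partition(items, num_partitions):
    """Split `items` into `num_partitions` roughly equal consecutive chunks."""
    k, r = divmod(len(items), num_partitions)
    chunks, start = [], 0
    for i in range(num_partitions):
        end = start + k + (1 if i < r else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def map_partitioned(func, items, num_partitions):
    """Apply `func` to every item, but schedule one job per partition.

    In a real workflow manager, each chunk's loop body would run as a
    single job, so 5 items with 2 partitions means 2 jobs, not 5.
    """
    return [func(item) for chunk in partition(items, num_partitions)
            for item in chunk]

results = map_partitioned(lambda x: x * x, list(range(5)), 2)
```

The flattened results match what per-item jobs would have produced, so the change is invisible to downstream steps.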
In addition, by specifying the number of partitions up front, flows can move from one step to the next without waiting for any intermediate results.
while
is fine, since it is clear there is a first job, a second partition job that yields 10 tasks, and a final mapping step that runs 10 jobs (one per partition).