SPIKE: consider async query execution for athena

smart-on-fhir / cumulus-library

https://docs.smarthealthit.org/cumulus/library/

Apache License 2.0


SPIKE: consider async query execution for athena #217

Closed. dogversioning closed this issue 1 week ago

dogversioning commented 2 months ago

pyathena provides an async cursor that does not follow the DB API 2.0 spec (PEP 249). Used correctly, it could greatly reduce runtime. It would be A Project to get it in, but it would allow us to do things like queue several denormalized tables to run at once.
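To illustrate the "queue several tables at once" idea, here is a minimal sketch using only the standard library. It is not cumulus-library code and does not call pyathena: `run_query` is a hypothetical stand-in for a blocking database call, and the SQL strings are made up. The pattern (submit everything up front, then collect futures) is roughly what a futures-based async cursor would enable.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_query(sql: str) -> str:
    """Hypothetical stand-in for a blocking call to the database."""
    time.sleep(0.1)  # simulate waiting on the query engine
    return f"done: {sql}"


def run_all(queries: list[str], max_workers: int = 4) -> dict[str, str]:
    """Submit every query up front, then collect results as they finish."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the query that produced it.
        futures = {pool.submit(run_query, q): q for q in queries}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results


if __name__ == "__main__":
    # Illustrative, made-up table names.
    tables = [f"CREATE TABLE denorm_{i} AS SELECT 1" for i in range(4)]
    start = time.perf_counter()
    out = run_all(tables)
    elapsed = round(time.perf_counter() - start, 2)
    print(f"{len(out)} queries finished in {elapsed}s")
```

With four workers the four simulated queries overlap, so the wall time should be close to one query's latency rather than four. This only helps for queries that do not depend on each other.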

Some things to consider before going down this path:

How do we mark queries as parallelizable vs dependent?
How does this impact other databases?
How big of an overhaul to the existing codebase would this be?
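One possible answer to the first question, sketched under assumptions: each query declares the names it depends on, and a scheduler runs whatever is currently unblocked in parallel. This uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The function name, the `deps` mapping, and the table names are all hypothetical and are not part of cumulus-library.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_in_dependency_order(deps, run, max_workers=4):
    """Run every query in `deps`, parallelizing where dependencies allow.

    deps: maps each query name to the set of names it depends on.
    run:  callable that executes one query by name (backend-specific).
    Returns the names in the order they finished.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    finished = []
    pending = {}  # future -> query name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit everything whose dependencies are already done.
            for name in sorter.get_ready():
                pending[pool.submit(run, name)] = name
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any query failure
                name = pending.pop(future)
                finished.append(name)
                sorter.done(name)  # unblocks queries waiting on this one
    return finished


if __name__ == "__main__":
    # Hypothetical tables: the count table needs both base tables first.
    deps = {
        "patient": set(),
        "encounter": set(),
        "count_by_patient": {"patient", "encounter"},
    }
    print(run_in_dependency_order(deps, run=lambda name: None))
```

Because `run` is just a callable, a database without async support could pass `max_workers=1` and get plain sequential execution in dependency order. The cost is that every study would have to declare its dependencies somewhere, which is the kind of low-level detail the questions above are weighing.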

dogversioning commented 1 week ago

This would probably introduce too much low-level logic into studies, so closing as won't do.