emmastephenson closed 1 year ago
Sprint Goals
Update 4/14:
Auth between Synapse and storage, as well as between Synapse and the Postgres MPI, has been solved.
Work to get code for seeding MPI with LAC extract running in Synapse is happening now.
Sprint goal for next sprint: Everything other than creating the TSVs that go to LAC is complete. Updates and joins on the eCR data store/MPI/MCI are complete, so all the data is available.
The sprint after that - going from that data to LAC's custom TSV spec.
Mid-sprint checkin: All Synapse jobs are working; just one clarification question on dates. Next step is to make sure the Synapse jobs are run automatically. Dummy delta lake is merged.
End of next sprint: All the Synapse jobs are running and working as intended.
Marcelle, Brandon, Robert, Nick C, Dan, and Kenneth will be working on this effort. (Marcelle, Robert, Kenneth, and probably Nick C for only half the sprint)
Work to create TSV extracts for IRIS is in progress.
Work to filter by COVID labs is also in progress.
Goal for final sprint:
We think it's done!
Why are we doing this work?
For the LAC pilot to be successful, we need to read LAC patient and case data, and write extracts of eCR data that has passed through our pipeline. This is best accomplished through Azure Synapse, as it will allow us to read and transform data on a regular, scheduled basis using SQL/Python notebooks.
These tasks are likely to require more compute/memory resources than are available in our Azure Container Apps, hence the need to use Synapse (roughly the Azure equivalent of Databricks).
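As a rough illustration of the read/transform/extract flow described above, here is a minimal sketch in plain Python. It is not the pipeline's actual code: the field names (`patient_id`, `person_id`, `test_name`, `result`), the sample records, the name-based COVID filter, and the output columns are all assumptions made for the example, and LAC's real TSV spec is not reproduced here. In Synapse this logic would more likely live in a scheduled notebook; the standard library is used here only to keep the sketch self-contained.

```python
import csv
import io

# Hypothetical eCR lab records; the field names are illustrative, not the real schema.
ecr_records = [
    {"patient_id": "p1", "test_name": "SARS-CoV-2 RNA", "result": "Detected"},
    {"patient_id": "p2", "test_name": "Influenza A", "result": "Not detected"},
    {"patient_id": "p3", "test_name": "SARS-CoV-2 RNA", "result": "Not detected"},
]

# Hypothetical MPI lookup keyed by patient_id (stands in for the seeded Postgres MPI).
mpi_records = {
    "p1": {"person_id": "person-001"},
    "p3": {"person_id": "person-003"},
}


def is_covid_lab(record):
    """Assumed filter: match on the test name. A real filter would likely use lab codes."""
    return "sars-cov-2" in record["test_name"].lower()


def join_with_mpi(records, mpi):
    """Inner join: keep only records whose patient_id has an MPI entry."""
    for record in records:
        match = mpi.get(record["patient_id"])
        if match is not None:
            yield {**record, **match}


def to_tsv(rows, columns):
    """Serialize rows to tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        delimiter="\t",
        lineterminator="\n",
        extrasaction="ignore",  # drop fields that are not in the output columns
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


covid_rows = [row for row in join_with_mpi(ecr_records, mpi_records) if is_covid_lab(row)]
tsv_text = to_tsv(covid_rows, ["person_id", "test_name", "result"])
print(tsv_text)
```

Keeping the join, the filter, and the serialization as separate functions means the output columns can be changed to match a custom spec without touching the upstream steps.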
Background and strategic fit
This work is necessary for:
How does the user interact with this service?
Post-pilot, LAC systems engineers will interact directly with Azure Synapse to make any required tweaks or adjustments.
Acceptance Criteria (Requirements)
Once this epic is completed, the final step of our pipeline will be in place: the MPI and MCI are seeded from IRIS, and TSV extracts are sent from our data stores to populate IRIS eCR forms.
Solution Design Doc/Implementation Plan
Link: https://drive.google.com/file/d/1bl5OkgZz-XyeRgN2A7SEAnJ5nIPobTvQ/view?usp=sharing