CDCgov / data-exchange-hl7

Enterprise Data Exchange (DEX) is a new cloud-native centralized data ingestion, validation, and observation service scoped for common data types (HL7, FHIR, CDA, XML, CSV) sent to the CDC. It helps public health stakeholders who send data to the CDC while reducing the maintenance efforts, complexity, and duplication of ingestion points to CDC.
Apache License 2.0

Business requirements for provenance and routing #93

Open rmharrison opened 2 years ago

rmharrison commented 2 years ago

Business Requirements

  1. Track who is sending which data stream to us (state vs. data type/source, plus any other useful info).
  2. Track where that data stream lands and which validation line it went through, e.g., HL7 case data going to these 16 locations, based on condition code/MMG.
  3. Routing information/config: if we use a push paradigm (as discussed today), we need all the details on how we are pushing/routing that data to the program. How do we confirm delivery receipt?

rmharrison commented 2 years ago

Two concepts are required to address these requirements: (A) metadata and (B) routing.

Metadata

Current state: For Arbo, we record very limited metadata about the processing.

Considerations for future state: Some portion will be "registered" in the EDC (data catalog). The rest will have to be in our metadata.

There is a fuzzy line between EDCs (Enterprise Data Catalogs) and MDMs (Metadata Management solutions).

Decision: EDC vs. DEX Metadata

Decision: Within DEX Metadata, when to migrate FROM storage in self-managed tables (e.g., Delta Lake tables) TO an MDM solution
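To make the metadata discussion concrete, here is a minimal sketch of what one DEX-side provenance row might hold, covering business requirements 1 and 2 (who sent what, and where it landed). Every class and field name below is an assumption for illustration; none of it comes from the existing DEX code or from any EDC/MDM product.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ProvenanceRecord:
    """Hypothetical per-stream metadata row; all field names are illustrative."""

    sender: str                            # who sent the data, e.g. a state
    data_type: str                         # e.g. "HL7"
    data_source: str                       # e.g. "Case"
    condition_code: Optional[str] = None   # routing key for HL7 case data
    landing_locations: List[str] = field(default_factory=list)
    received_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def record_landing(self, location: str) -> None:
        """Note a location the data landed in, ignoring repeats."""
        if location not in self.landing_locations:
            self.landing_locations.append(location)

    def to_row(self) -> dict:
        """Flatten to a plain dict, e.g. for a self-managed metadata table."""
        return asdict(self)
```

A plain row like this could live in self-managed tables at first and move to an MDM solution later, which is the migration decision listed above.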

Routing

Current state: The data goes to an EDAV-hosted DEX Storage Account.

Considerations for future state: Avoid complex routing for as long as we can by using the structure of our storage account. For example:

-EDAV DEX Storage Account
--HL7
---Case
----Arbo
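A rough sketch of how the folder structure alone could act as the routing rule: the landing path is derived from the same three levels shown in the example (HL7, Case, Arbo). The function name, the level names (`data_type`, `category`, `program`), and the file name are assumptions for illustration, not an existing DEX API.

```python
from pathlib import PurePosixPath


def landing_path(data_type: str, category: str, program: str, filename: str) -> str:
    """Build a storage path from the folder hierarchy (hypothetical helper).

    Each level maps to one folder, so no separate routing table is needed.
    """
    parts = [data_type, category, program, filename]
    for part in parts:
        # Reject empty or nested segments so one level is always one folder.
        if not part or "/" in part:
            raise ValueError(f"invalid path segment: {part!r}")
    return str(PurePosixPath(*parts))


# Example with the hierarchy from this comment; the file name is made up.
print(landing_path("HL7", "Case", "Arbo", "msg-001.hl7"))
# prints: HL7/Case/Arbo/msg-001.hl7
```

This only holds while each data stream has a single destination. Requirement 2 mentions HL7 case data going to 16 locations by condition code/MMG, which would need a lookup on top of this.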