CDCgov / data-exchange-hl7

Enterprise Data Exchange (DEX) is a new cloud-native, centralized data ingestion, validation, and observation service scoped for common data types (HL7, FHIR, CDA, XML, CSV) sent to the CDC. It helps public health stakeholders who send data to the CDC while reducing the maintenance effort, complexity, and duplication of ingestion points at CDC.
Apache License 2.0
10 stars 14 forks

Implement the HL7 data feed from PHIN MS into DEX #499

Closed lmcnabb closed 1 year ago

lmcnabb commented 1 year ago

@ssk2cdcgov we need to bring over all the metadata from PHINMS. The payloads pulled from the inqueue table should be processed in DEX in the order in which the messages were received.
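A minimal sketch of what the ordered pull could look like from Databricks, assuming hypothetical table and column names (`DEX_inq`, `recordId`, `payload`, `receivedTime`); the actual PHINMS inqueue schema isn't spelled out in this thread:

```python
# Hedged sketch: pull PHINMS payloads from the inqueue table in received order.
# Table, column, and connection names are assumptions for illustration only.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

inqueue_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://<on-prem-sql-host>;databaseName=<phinms-db>")  # placeholder
    .option("dbtable", "dbo.DEX_inq")          # assumed inqueue table name
    .option("user", "<service-account>")       # service account requested in this thread
    .option("password", "<secret>")            # would come from a secret scope in practice
    .load()
)

# Preserve the order in which the messages were received before handing off to DEX.
ordered = inqueue_df.orderBy(F.col("receivedTime").asc())
```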

ssk2cdcgov commented 1 year ago

I am able to connect to Databricks. I have also had someone explain how to decrypt the PHINMS messages in the 'DEX_inq' table. I have applied for the Service Account needed to access this table and the encryption process. I contacted Boris and he will arrange access to the rest of DEX's resources. I have access to 'tfedemessagestoragedev' but can't see the 'Storage Browser'.

ssk2cdcgov commented 1 year ago

I have the basic operation modeled in Databricks using a table from PHLIP. This Databricks notebook pulls rows from a table and persists them as a flat file. I have a method to track the rows copied, modeled largely after PHLIP, and will finish the initial tracking mechanism today. I have not heard from Boris yet, but will check with him today. I am waiting for access to the on-premise SQL Server via Databricks, and for access to the 'Storage Browser' (i.e., the blob storage). I also don't think I have access to the DEX Databricks environment yet, but am using PHLIP Dev to model this.
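A hedged sketch of that pull-and-track step; the tracking table, source table, column names, and output path are all illustrative assumptions rather than the actual notebook:

```python
# Hedged sketch of the "pull rows and persist them as flat files" step, with a
# simple tracking table so already-copied rows are skipped on the next run.
# Table names, column names (recordId, payload, receivedTime), and paths are assumed.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

TRACKING_TABLE = "dex_staging.copied_records"        # assumed tracking table

already_copied = spark.table(TRACKING_TABLE).select("recordId")

to_copy = (
    spark.table("dex_staging.inqueue_snapshot")       # assumed snapshot of the PHINMS inqueue
    .join(already_copied, on="recordId", how="left_anti")  # keep only rows not yet copied
    .orderBy("receivedTime")
)

for row in to_copy.toLocalIterator():
    # One flat file per payload; the path layout is an assumption.
    path = f"/dbfs/mnt/dex/incoming/{row['recordId']}.hl7"
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(row["payload"])

# Record what was copied so the next run can resume where this one left off.
to_copy.select("recordId", F.current_timestamp().alias("copiedAt")) \
    .write.mode("append").saveAsTable(TRACKING_TABLE)
```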

ssk2cdcgov commented 1 year ago

I emailed John Stevens about the Service Account, and it should be available soon.

ssk2cdcgov commented 1 year ago

The ADF environment is now working for 'edav-prd-dex-factory'. The blob storage, 'DEX_STG_ADLS_LS', is working and I have access to it. The ADF environment needs to have permissions added, but this has been communicated.

ssk2cdcgov commented 1 year ago

I have begun the process of moving over the initial efforts from the other ADF environment. I have also had a discussion with Marcelo regarding the final format in which the records will be added to storage, including a method for naming the file for each record pulled. There will be two parts to the pull: one will pull the HL7 record and persist it to a file; the other will pull the metadata around each record and persist it as well.
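For illustration only, a minimal sketch of that two-part persistence: one file for the HL7 payload and a sidecar file for its metadata. The naming scheme and field names here are assumptions, not the format agreed with Marcelo:

```python
# Hedged sketch: persist each record as <name>.hl7 (payload) plus <name>.meta.json
# (surrounding PHINMS metadata). Field names and naming scheme are illustrative.
import json
from datetime import datetime, timezone

def persist_record(record: dict, out_dir: str) -> None:
    """Write the HL7 payload and its metadata sidecar for one pulled record."""
    received = record["receivedTime"]                 # assumed to be a datetime
    name = f"{record['recordId']}_{received.strftime('%Y%m%dT%H%M%S')}"

    with open(f"{out_dir}/{name}.hl7", "w", encoding="utf-8") as fh:
        fh.write(record["payload"])

    metadata = {k: v for k, v in record.items() if k != "payload"}
    with open(f"{out_dir}/{name}.meta.json", "w", encoding="utf-8") as fh:
        json.dump(metadata, fh, default=str, indent=2)

# Example usage with a fabricated record:
persist_record(
    {
        "recordId": "12345",
        "receivedTime": datetime(2023, 1, 10, 14, 30, tzinfo=timezone.utc),
        "sender": "example-phinms-sender",
        "payload": "MSH|^~\\&|...",
    },
    out_dir="/tmp",
)
```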

ssk2cdcgov commented 1 year ago

There was a rights issue with the Blob storage, which was fixed on Friday. There is still one last issue: I cannot save an ADF pipeline. But I can work in debug mode for now.

ssk2cdcgov commented 1 year ago

The rights issue was worked out yesterday afternoon. I can now persist the data to files and see them in the final 'landing space' in the blob storage. I am still waiting for final permissions to place the decrypted files in the blob storage.
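A small, hedged sketch of what landing a decrypted payload in blob storage could look like with the azure-storage-blob SDK; the container and path names are placeholders, since the actual landing container isn't named in this thread:

```python
# Hedged sketch: upload one decrypted HL7 payload to an assumed landing container.
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient.from_connection_string("<storage-connection-string>")
container = service.get_container_client("hl7-landing")    # assumed container name

def land_decrypted_file(record_id: str, decrypted_payload: bytes) -> None:
    """Write one decrypted payload as a blob in the landing container."""
    blob_name = f"phinms/{record_id}.hl7"                   # assumed path layout
    container.upload_blob(name=blob_name, data=decrypted_payload, overwrite=True)
```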

mscaldas2012 commented 1 year ago

Holding for ATO.

ssk2cdcgov commented 1 year ago

The path for the data is now down to one step, as opposed to the two steps we had earlier. I have rewritten my ADF pipeline to accommodate this, and am waiting for a meeting today for a final review of the encryption.

ssk2cdcgov commented 1 year ago

Made a first pass at documenting the process, cleaned up and renamed the subordinate processes of the ADF, and restructured some parts for the Prod version.

jmann2817 commented 1 year ago

Done, not closed.

ssk2cdcgov commented 1 year ago

This should be done and running. There are other modifications such as moving part of the process to Databricks. These are tracked in other tickets.