google / fhir-data-pipes

A collection of tools for extracting FHIR resources and analytics services on top of that data.
https://google.github.io/fhir-data-pipes/
Apache License 2.0
142 stars 82 forks source link

Added FHIR Sink Config Properties to the Pipeline Controller #947

Closed mozzy11 closed 1 month ago

mozzy11 commented 5 months ago

Description of what I changed

Fixes https://github.com/google/fhir-data-pipes/issues/825

Added Fhir Sink Config Properties to the Pipeline Controller to enable Synching data from a FHIR server to another FHIR server

  fhirSinkPath: ""
  sinkUserName: "hapi"
  sinkPassword: "hapi123"

E2E test

Adede e2e tests for synching from a hapi fhir sever to another using the pipeline controller for both FULL and INCREMENTAL modes

TESTED:

Successfully fetched data form OpenMRS and a HAPI FHIR server to another HAPI FHIR server for both batch and incremental pipeleines

Checklist: I completed these to help reviewers :)

mozzy11 commented 5 months ago

@bashir2 , Requesting for your review here. Fixed the e2e tests

mozzy11 commented 5 months ago

hello @chandrashekar-s , After pulling in your changes from https://github.com/google/fhir-data-pipes/pull/935 Am now getting an error when running the e2e tests "Run E2E Test for Dockerized Controller and Spark Thriftserver": java.io.IOException: Could not read footer: java.lang.RuntimeException: file:/workspace/e2e-tests/controller-spark/dwh/controller_DWH_TIMESTAMP_2024_02_13T19_50_30_899562839Z/Encounter/ConvertResourceFn_Encounter_output-parquet-th-517-ts-1707853844285-r-760691.parquet is not a Parquet file (too small length: 4) .

All I do in this PR is to sync data to both a FHIR sink and Parquet dwh at the same time. The e2e tests initially ran fine before pulling in your changes for fixing High memory usage

chandrashekar-s commented 5 months ago

Hi @mozzy11, usually any errors that occur in the pipeline run are not printed in the logs since the docker is ran in the detached mode. You can use this code which was recently added by @bashir2 to debug the errors, I haven't tried it myself yet though.

mozzy11 commented 5 months ago

@chandrashekar-s increasing the rowGroupSizeForParquetFiles to 100mb seems to have solved my error

chandrashekar-s commented 5 months ago

@mozzy11 Thanks for the update. Currently, the build fails sometimes due to some issues which are mentioned in this ticket. So, the build fails sometimes and retriggering usually works.

Regarding the parameter rowGroupSizeForParquetFiles increasing to 100mb, there is a follow up PR which needs to be merged for it to fully take effect. So, this parameter should not affect the build.

Do you mind undoing the changes for this parameter and retriggering a build.

mozzy11 commented 5 months ago

@chandrashekar-s , youre right , The build passes even with reverting the rowGroupSizeForParquetFilesproperty

mozzy11 commented 5 months ago

@bashir2 do you have any comments for this PR ??

bashir2 commented 5 months ago

@bashir2 do you have any comments for this PR ??

Yeah I was just reviewing this (after the e2e-tests passed); sorry that debugging our e2e-test failures is not trivial.

codecov-commenter commented 4 months ago

Codecov Report

Attention: Patch coverage is 42.85714% with 4 lines in your changes are missing coverage. Please review.

Project coverage is 50.49%. Comparing base (1b7199b) to head (68fb9ec).

Files Patch % Lines
...java/com/google/fhir/analytics/DataProperties.java 42.85% 3 Missing and 1 partial :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #947 +/- ## ============================================ - Coverage 50.56% 50.49% -0.07% + Complexity 674 673 -1 ============================================ Files 91 91 Lines 5512 5519 +7 Branches 708 709 +1 ============================================ Hits 2787 2787 - Misses 2464 2470 +6 - Partials 261 262 +1 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

mozzy11 commented 3 months ago

Hello @bashir2 , Generally ive noticed the tests consistently fail when i run controller pipelines to sync data to a parquest DWH and a FHIR sink simultaneously , basically because it consumes too much reaources and the CI build fails.

Ive just adedd empty params for the fhir sink so that data is not synced to fhir sink to a by default . Ill work on another PR to to enable syncing data to a FHIR sink independently wothout dependening on the Parquet DWH

bashir2 commented 3 months ago

Hello @bashir2 , Generally ive notoced the tests consistently fail when i run controller pipelines to sync data to a parquest DWH and a FHIR sink simultaneously , basically because it consumes too much reaources and the CI build fails.

Thanks @mozzy11 for working on this. What evidence do you have that the root cause of controller failures is a resource starvation issue? Asking because I took a look at some of your failed e2e runs and it was not obvious to me that is the reason (for example this one from last week). Yes, it seems that the controller fails but I am not sure if it is due to resource issues. Can you please reproduce that scenario then following the instructions in this comment to see the logs from the controller?

Ive just adedd empty params for the fhir sink so that data is not synced to fhir sink to a by default . Ill work on another PR to to enable syncing data to a FHIR sink independently wothout dependening on the Parquet DWH

I prefer if we have e2e tests that cover this feature, especially because it is a little bit different than usual scenarios we test (for analytics queries); otherwise we may break this in the future without noticing.

BTW, please also respond to the comments from the last review and resolve those that are addressed.

mozzy11 commented 2 months ago

@bashir2 i investigated the cause of the failure and fixed it.

mozzy11 commented 2 months ago

CC @bashir2

bashir2 commented 2 months ago

Thanks @mozzy11 for debugging and the changes; I'll take a look soon. In the dev. call tomorrow, do you mind discussing this and the FHIR-server to FHIR-server sync scenario in general? I just want to make sure we are on the same page regarding the shortcomings of the current approach and your plans for how to use this in production.

mozzy11 commented 2 months ago

Thanks @bashir2 . i will discuss