google / fhir-data-pipes

A collection of tools for extracting FHIR resources and analytics services on top of that data.
https://google.github.io/fhir-data-pipes/
Apache License 2.0

Added S3 dep and some version updates #1066

Closed: bashir2 closed this 1 month ago

bashir2 commented 1 month ago

Description of what I changed

Fixes #1011. It also updates some dependencies and versions.

E2E test

TESTED:

Did a sample run with an S3 bucket as the output location and verified that the output files were created (note that this required AWS_REGION, AWS_SECRET_ACCESS_KEY, and AWS_ACCESS_KEY_ID to be set):

java -cp ./pipelines/batch/target/batch-bundled.jar com.google.fhir.analytics.FhirEtl --sourceJsonFilePattern=[PATH]/synthea-hiv/768_patients/* --resourceList="Patient" --parallelism=30 --outputParquetPath=s3://[BUCKET]/parquet_from-json_768_flink --runner=FlinkRunner
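Since the run above depends on three environment variables, a small preflight check can catch a missing one before the pipeline starts. This is only a sketch: `check_aws_env` is a hypothetical helper name, not part of fhir-data-pipes, and it only checks that the variables are non-empty, not that the credentials are valid.

```shell
# Hypothetical helper (an assumption, not part of fhir-data-pipes):
# returns non-zero if any variable named in the note above is unset or empty.
# Written for POSIX sh, so `missing`, `name`, and `value` are plain globals.
check_aws_env() {
  missing=0
  for name in AWS_REGION AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

It could be chained in front of the command above, for example `check_aws_env && java -cp ...`, so that the pipeline only starts when all three variables are present.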

Checklist: I completed these to help reviewers :)

codecov-commenter commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 50.60%. Comparing base (5eb81a2) to head (2fe1dff).

Additional details and impacted files

```diff
@@            Coverage Diff            @@
##             master    #1066   +/-   ##
=========================================
  Coverage     50.60%   50.60%
  Complexity      674      674
=========================================
  Files            91       91
  Lines          5511     5511
  Branches        707      707
=========================================
  Hits           2789     2789
  Misses         2461     2461
  Partials        261      261
```
