Description of what I changed
This is the last piece that fixes #916. Besides adding the incremental view update option to the controller, this also implements a fast view recreation option. This is done by reading the whole DWH snapshot (the Parquet files), converting the records back to HAPI objects, and applying the new view logic to them. This can be an order of magnitude (or more) faster than refetching resources from the FHIR server. For a fairly large DWH (which originally took ~200 minutes to create by reading from the FHIR server), recreating views took only ~13 to 18 minutes (depending on the size/presence of old views). Of this time, less than 2 minutes was spent converting Avro to HAPI.
Note that we never used the Avro to HAPI feature of Bunsen in our production code; this first usage uncovered some new bugs, which are fixed in this PR as well.
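As a quick sanity check on the "order of magnitude" claim, the speedup can be worked out from the timings quoted above. This is only a back-of-the-envelope sketch: the variable names are illustrative (they are not from the codebase) and the inputs are the approximate figures from this description.

```python
# Approximate timings quoted in this description, in minutes.
full_fetch_min = 200          # creating the DWH by reading from the FHIR server
view_recreate_min = (13, 18)  # recreating views from the Parquet snapshot
avro_to_hapi_max_min = 2      # upper bound on the Avro-to-HAPI conversion time

# Speedup range: the slowest recreation gives the lower bound,
# the fastest recreation gives the upper bound.
speedup_low = full_fetch_min / max(view_recreate_min)
speedup_high = full_fetch_min / min(view_recreate_min)

# Upper bound on the share of recreation time spent in conversion,
# taken against the fastest recreation run.
conversion_share_max = avro_to_hapi_max_min / min(view_recreate_min)

print(f"speedup: {speedup_low:.1f}x to {speedup_high:.1f}x")
print(f"conversion share: at most {conversion_share_max:.0%}")
```

With these inputs the speedup comes out at roughly 11x to 15x, and the Avro-to-HAPI conversion accounts for at most about 15% of the view recreation time.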
E2E test
TESTED:
Ran the controller and tested the FULL and VIEWS modes. See above for performance testing.
Checklist: I completed these to help reviewers :)
[x] I have read and will follow the review process.
[x] I am familiar with Google Style Guides for the language I have coded in.
No? Please take some time and review Java and Python style guides.
[x] My IDE is configured to follow the Google code styles.
No? Unsure? -> configure your IDE.
[x] I have added tests to cover my changes. (If you refactored existing code that was well tested you do not have to add tests)
[x] I ran mvn clean package right before creating this pull request and added all formatting changes to my commit.
[x] All new and existing tests passed.
[x] My pull request is based on the latest changes of the master branch.
No? Unsure? -> execute command: git pull --rebase upstream master