Pull Request

Description

Adds tooling and code optimizations needed to analyze performance bottlenecks in the record linkage API.

Related Issues

closes #71

Additional Notes

For more context, see the accompanying videos on how the optimizations were found.

seed data

A new script, seed_db.sh, was added to preload the database with existing records before running the performance tests. The performance tests can load these records themselves, but a seed file greatly shortens the run time when you want to test how the API performs with existing records in the MPI.

pgbadger

pgbadger was added to help analyze query locking and performance during the tests. To facilitate this analysis, some tooling was added and the postgres configuration was changed to capture the necessary data.

code optimizations

Changes were made to dal.py, mpi.py and link.py (behind an env variable flag) to test optimizations on potential bottlenecks. Additionally, the analyze_trace_timings.sh script was added to analyze the results of performance test traces exported from Jaeger. The changes have also been put into a phdi PR for the DIBBs team to review.
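
For reviewers unfamiliar with trace exports, here is a minimal Python sketch of the kind of per-operation summary a trace timing analysis can produce. It is not the logic of analyze_trace_timings.sh. It assumes the JSON layout of a Jaeger UI trace download (a top-level "data" list of traces, each with "spans" carrying "operationName" and a "duration" in microseconds). The function name and the operation names in the sample are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarize_spans(trace_export):
    """Aggregate span durations per operation name.

    Assumes Jaeger's JSON export layout: {"data": [{"spans": [...]}]},
    where each span has "operationName" and "duration" (microseconds).
    Returns timings converted to milliseconds.
    """
    durations = defaultdict(list)
    for trace in trace_export.get("data", []):
        for span in trace.get("spans", []):
            durations[span["operationName"]].append(span["duration"])
    return {
        name: {
            "count": len(values),
            "total_ms": sum(values) / 1000,
            "mean_ms": mean(values) / 1000,
            "max_ms": max(values) / 1000,
        }
        for name, values in durations.items()
    }


# Hypothetical sample data; real exports contain many more fields per span.
sample_export = {
    "data": [
        {
            "traceID": "abc",
            "spans": [
                {"operationName": "link_record", "duration": 2000},
                {"operationName": "block_data", "duration": 1000},
            ],
        },
        {
            "traceID": "def",
            "spans": [{"operationName": "link_record", "duration": 4000}],
        },
    ]
}

summary = summarize_spans(sample_export)
```

Sorting the summary by total_ms is usually the quickest way to see which operation dominates a test run.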

custom API health check

Added api_health_check.sh script to reduce the number of GET requests made to the API during the test.
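
As an illustration only, one common way to keep health-check traffic low is to poll with a growing delay and stop at the first success. The Python sketch below shows that idea. It is not the contents of api_health_check.sh; the function name, parameters and defaults are all assumptions.

```python
import time


def wait_until_healthy(check, max_attempts=5, initial_delay=1.0,
                       backoff=2.0, sleep=time.sleep):
    """Call check() until it returns True, backing off between attempts.

    check is any zero-argument callable that performs one health request
    and returns True when the API is healthy (hypothetical interface).
    Returns the number of requests made; raises TimeoutError otherwise.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        if check():
            return attempt
        if attempt < max_attempts:
            sleep(delay)       # wait before the next request
            delay *= backoff   # each wait is longer than the last
    raise TimeoutError(f"API not healthy after {max_attempts} attempts")


# Usage with a stand-in check that succeeds on the third call.
responses = iter([False, False, True])
recorded_sleeps = []
requests_made = wait_until_healthy(
    check=lambda: next(responses),
    sleep=recorded_sleeps.append,  # record delays instead of sleeping
)
```

Because polling stops at the first healthy response, no further GET requests reach the API once the test is underway.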

synthea split option

Added an optional parameter for splitting Synthea encounters into multiple files. This lets us send multiple API requests for a patient when Synthea generates more than one encounter.
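
To make the splitting idea concrete, here is a rough Python sketch, not the implementation in this PR. It assumes Synthea's FHIR bundle output, where each entry has a "fullUrl" and resources tied to an encounter carry an "encounter" reference to that URL. The function name is hypothetical, and resources that link to encounters in other ways (or not at all) are simply left out.

```python
import copy


def split_bundle_by_encounter(bundle):
    """Split one FHIR bundle into one bundle per Encounter.

    Each output bundle holds the Patient entries, a single Encounter, and
    the entries whose resource.encounter.reference matches that Encounter's
    fullUrl. Bundles with zero or one Encounter are returned unchanged.
    """
    entries = bundle.get("entry", [])
    encounters = [e for e in entries
                  if e["resource"]["resourceType"] == "Encounter"]
    if len(encounters) <= 1:
        return [bundle]

    patients = [e for e in entries
                if e["resource"]["resourceType"] == "Patient"]
    bundles = []
    for encounter in encounters:
        ref = encounter.get("fullUrl")
        related = [
            e for e in entries
            if ref is not None
            and e["resource"].get("encounter", {}).get("reference") == ref
        ]
        # Copy bundle-level fields (resourceType, type, ...) but not entries.
        new_bundle = {k: v for k, v in bundle.items() if k != "entry"}
        new_bundle["entry"] = copy.deepcopy(patients + [encounter] + related)
        bundles.append(new_bundle)
    return bundles


# Tiny hypothetical bundle: one patient, two encounters, one observation each.
example_bundle = {
    "resourceType": "Bundle",
    "type": "transaction",
    "entry": [
        {"fullUrl": "urn:uuid:p1", "resource": {"resourceType": "Patient"}},
        {"fullUrl": "urn:uuid:e1", "resource": {"resourceType": "Encounter"}},
        {"fullUrl": "urn:uuid:e2", "resource": {"resourceType": "Encounter"}},
        {"fullUrl": "urn:uuid:o1", "resource": {
            "resourceType": "Observation",
            "encounter": {"reference": "urn:uuid:e1"}}},
        {"fullUrl": "urn:uuid:o2", "resource": {
            "resourceType": "Observation",
            "encounter": {"reference": "urn:uuid:e2"}}},
    ],
}

split_bundles = split_bundle_by_encounter(example_bundle)
```

Each resulting bundle could then be written to its own file and sent as a separate API request for the same patient.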

Checklist

Please review and complete the following checklist before submitting your pull request:

[x] I have ensured that the pull request is of a manageable size, allowing it to be reviewed within a single session.
[x] I have reviewed my changes to ensure they are clear, concise, and well-documented.
[x] I have updated the documentation, if applicable.
[x] I have added or updated test cases to cover my changes, if applicable.
[x] I have minimized the number of reviewers to include only those essential for the review.
[ ] I have notified teammates in the review thread to build awareness.

Checklist for Reviewers

Please review and complete the following checklist during the review process:

[ ] The code follows best practices and conventions.
[ ] The changes implement the desired functionality or fix the reported issue.
[ ] The tests cover the new changes and pass successfully.
[ ] Any potential edge cases or error scenarios have been considered.

CDCgov / IDWA

Ericbuckley/idwa 71 analyze rl performance #83