[Task]: Spike for DB Planning

Context

We've started work on #48. In this issue, we want to perform a technical spike to identify shortcomings and must-have requirements that will inform the design of the new data model. The goal is to gather information on the problem landscape of the current DB, its requirements, and its use cases. The previous team implemented several DB performance improvements, which should also be considered.

There are also a few open questions that need to be answered. You can find them in the milestone doc introduced by #104.

Outcomes

The spike will provide a comprehensive understanding of the challenges and requirements related to the current database. It will help in scoping the DB work effectively and guide the design of a new data model that addresses identified shortcomings and aligns with stakeholder and user needs.

Acceptance criteria

[ ] Spike document created that provides information to support the design of the new data model
[ ] Questions have been reviewed and answered with support of HHS staff
[ ] The answers to all of the questions have been documented in the milestone document
- [ ] Will this database serve as an OLTP system, OLAP system, or both? Will this database be used for analytics, and is there a need for a second database to be created with data pipelines?
- [ ] What are the current datastores that support the legacy Grants.gov system?
- [ ] Is the existing Grants.gov database heavy on read or write transactions?
- [ ] How many external sources does the existing database have?

Notes

Decreased application process time by over 90% (from 2+ minutes to under 12 seconds)
Reduced query run times by an average of 90% by tuning the top 25 database queries
Reduced login time by 66% (from 3 seconds to under 1 second)
Reduced time to extract information from the SAM directory by 96.5% (from over 22 hours each month to under 45 minutes)
Moved 60% of database cron jobs to scheduler jobs, improving monitoring and success rates

HHS / simpler-grants-gov