fedspendingtransparency / usaspending-api

Server application to serve U.S. federal spending data via a RESTful API
https://www.usaspending.gov
Creative Commons Zero v1.0 Universal

[DEV-10453] Check for modified awards and transactions before FY2008 #4034

Closed aguest-kc closed 8 months ago

aguest-kc commented 9 months ago

Description: Adds a new check to the elasticsearch_indexer and elasticsearch_indexer_for_spark commands when using the --process-deletes flag. The check looks for transactions/awards that have been modified in the past 3 days and that have an action_date before FY2008. If any such transactions/awards are found, we attempt to delete those records from Elasticsearch.

This covers the scenario where a transaction/award is created with an action_date on or after 2007-10-01, is added to Elasticsearch, and then later has its action_date changed to a date before 2007-10-01. In that case the transaction/award record in Elasticsearch keeps the old data and is never updated. Without this check, the only fix is a full reindex of the affected index.
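To make the scenario concrete, here is a minimal sketch of the date logic involved. It is not code from this PR: `FY2008_START`, `fiscal_year`, and `becomes_stale` are hypothetical names, and it assumes the Elasticsearch indexes only cover records with an action_date on or after 2007-10-01, which is what the description above implies.

```python
from datetime import date

# U.S. federal fiscal year 2008 began on 2007-10-01.
FY2008_START = date(2007, 10, 1)


def fiscal_year(d: date) -> int:
    """Return the U.S. federal fiscal year containing d (each FY starts October 1)."""
    return d.year + 1 if d.month >= 10 else d.year


def becomes_stale(old_action_date: date, new_action_date: date) -> bool:
    """Hypothetical helper: True when a record that was eligible for indexing
    under its old action_date moves to a date before FY2008."""
    return old_action_date >= FY2008_START and new_action_date < FY2008_START
```

A record whose action_date moves from 2008-01-15 to 2007-09-30 would be flagged by `becomes_stale`, which is the case the new check is meant to clean up.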

Technical details: Adds a new check to the elasticsearch_indexer and elasticsearch_indexer_for_spark commands when using the --process-deletes flag. The check looks for transactions/awards that have been modified in the past 3 days and that have an action_date before FY2008. If any such transactions/awards are found, we attempt to delete those records from Elasticsearch.
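The selection step described above could look roughly like the following. This is an illustrative sketch only, written against plain in-memory objects rather than the project's database or Elasticsearch client; the `Record` class, its field names (`record_id`, `update_date`), and `ids_to_delete` are assumptions and do not come from the PR.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta
from typing import Iterable, List

# U.S. federal fiscal year 2008 began on 2007-10-01.
FY2008_START = date(2007, 10, 1)


@dataclass
class Record:
    """Hypothetical stand-in for a transaction or award row."""
    record_id: int
    action_date: date
    update_date: datetime


def ids_to_delete(records: Iterable[Record], now: datetime, lookback_days: int = 3) -> List[int]:
    """Return IDs of records modified within the lookback window whose
    action_date now falls before FY2008."""
    cutoff = now - timedelta(days=lookback_days)
    return [
        r.record_id
        for r in records
        if r.update_date >= cutoff and r.action_date < FY2008_START
    ]
```

The returned IDs would then be handed to whatever delete routine the indexer already runs under --process-deletes. Deleting an ID that is not in the index should be harmless, which is why the description says the command only "attempts" the delete.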

Requirements for PR merge:

  1. [x] Unit & integration tests updated
  2. [ ] Necessary PR reviewers:
    • [ ] Backend
  3. [x] Data validation completed
  4. [ ] Appropriate Operations ticket(s) created
  5. [x] Jira Ticket DEV-10453:
    • [x] Link to this Pull-Request
    • [x] Performance evaluation of affected (API | Script | Download)
    • [x] Before / After data comparison

Area for explaining above N/A when needed:

2. API documentation updated
No API documentation is affected by this change.

4. Matview impact assessment completed
Matviews are not affected by this change.

5. Frontend impact assessment completed
The frontend is not impacted by this change.