clearlydefined / operations

Operational documentation and tools

add tool to check for out-of-sync data between DB and production-definitions blob store #84

Open elrayle opened 2 months ago

elrayle commented 2 months ago

Co-authored-by: @ajhenry

Description

The new tool, analyze_data_synchronization, checks for out-of-sync data between the database and the production-definitions blob store. The tool can be run for multiple months, one month at a time, or for a custom date range. It outputs a JSON file containing summary statistics and the invalid data. The tool is controlled through a .env file, which can be customized to specify the start and end dates, the maximum number of documents to process, and the output file name.

See README for examples.
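For readers who want a rough picture of this kind of check, here is a minimal sketch in Python. It is not the tool's actual code: the function name `find_out_of_sync`, the in-memory dictionaries standing in for the database and the blob store, the "updated timestamp" comparison rule, and the report field names are all assumptions made for illustration. The parameters mirror the settings described above (start and end dates, maximum number of documents).

```python
import json
from datetime import datetime


def find_out_of_sync(db_docs, blob_docs, start, end, max_docs=None):
    """Compare two id -> ISO-8601 timestamp mappings (hypothetical data model).

    A DB document counts as out of sync when the blob store has no entry
    for its id, or when the two timestamps differ. Only DB documents
    updated in the half-open range [start, end) are checked.
    """
    start_dt = datetime.fromisoformat(start)
    end_dt = datetime.fromisoformat(end)
    checked = 0
    invalid = []

    for doc_id in sorted(db_docs):
        db_updated = db_docs[doc_id]
        # Skip documents outside the requested date range.
        if not (start_dt <= datetime.fromisoformat(db_updated) < end_dt):
            continue
        # Stop once the optional document limit is reached.
        if max_docs is not None and checked >= max_docs:
            break
        checked += 1

        blob_updated = blob_docs.get(doc_id)
        if blob_updated is None:
            invalid.append({"id": doc_id, "reason": "missing_in_blob",
                            "db_updated": db_updated})
        elif blob_updated != db_updated:
            invalid.append({"id": doc_id, "reason": "timestamp_mismatch",
                            "db_updated": db_updated,
                            "blob_updated": blob_updated})

    # Summary statistics plus the invalid entries, ready for json.dump.
    return {
        "range": {"start": start, "end": end},
        "checked": checked,
        "in_sync": checked - len(invalid),
        "out_of_sync": len(invalid),
        "invalid": invalid,
    }


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    db = {
        "npm/npmjs/-/a/1.0.0": "2024-01-05T00:00:00",
        "npm/npmjs/-/b/1.0.0": "2024-01-10T00:00:00",
    }
    blob = {"npm/npmjs/-/a/1.0.0": "2024-01-05T00:00:00"}
    print(json.dumps(find_out_of_sync(db, blob, "2024-01-01", "2024-02-01"), indent=2))
```

Running one month at a time would amount to calling the function once per month with consecutive start and end dates; the real tool reads these values from its .env file, as the README describes.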

Minor fix

The README includes a fix to rename production-snapshots to changes-notifications. The switch to changes-notifications has been in production use since January 2024.

ajhenry commented 2 months ago

Nice, this looks great! Thanks a ton for cleaning this script up.