wri / gfw-datapump

Nightly batch process to generate summary statistics for user AOIs
Add unique constraints to geotrellis tables to prevent duplicate rows #140

Closed jterry64 closed 2 months ago

jterry64 commented 8 months ago

Pull request checklist

Please check if your PR fulfills the following requirements:

Pull request type

Please check the type of change your PR introduces:

- [ ] Bugfix
- [X] Feature
- [ ] Code style update (formatting, renaming)
- [ ] Refactoring (no functional changes, no api changes)
- [ ] Build related changes
- [ ] Documentation content changes
- [ ] Other (please describe):

## What is the current behavior?

Occasionally, certain tables get duplicate rows when syncing, which causes incorrect results. This happens most often for custom user areas.

Issue Number: GTC-2484

## What is the new behavior?

Add a unique constraint on the rows so that no combination of IDs, filters, and categories is duplicated. This enforces data consistency.

The constraint covers quite a few columns, almost 30 for TCL analysis, because there are so many categorical or boolean columns. Writes might slow down as a result, but they happen in the background, so production systems shouldn't be affected.

## Does this introduce a breaking change?

- [ ] Yes
- [X] No
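
For readers unfamiliar with the approach, the unique constraint described under "What is the new behavior?" can be sketched roughly as follows. This is an illustration only, not the code in this PR: it uses SQLite from Python's standard library so that it runs anywhere, and the table name `tcl_summary`, its columns, and the `sync_rows` helper are all hypothetical. The real tables have far more key columns.

```python
import sqlite3


def create_table(conn):
    # Hypothetical table: three key columns stand in for the ~30
    # ID/filter/category columns mentioned in the PR description.
    conn.execute(
        """
        CREATE TABLE tcl_summary (
            geostore_id TEXT NOT NULL,
            loss_year INTEGER NOT NULL,
            is_protected INTEGER NOT NULL,
            area_ha REAL NOT NULL,
            UNIQUE (geostore_id, loss_year, is_protected)
        )
        """
    )


def sync_rows(conn, rows):
    """Insert rows, skipping any whose key combination already exists.

    Returns the number of rows actually inserted.
    """
    inserted = 0
    for row in rows:
        cur = conn.execute(
            "INSERT OR IGNORE INTO tcl_summary VALUES (?, ?, ?, ?)", row
        )
        inserted += cur.rowcount  # 0 when the row was ignored as a duplicate
    return inserted


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_table(conn)
    rows = [
        ("abc", 2020, 1, 10.5),
        ("abc", 2020, 1, 10.5),  # same key combination, skipped
        ("abc", 2021, 0, 3.0),
    ]
    print(sync_rows(conn, rows))
    try:
        # A plain INSERT of a duplicate key is rejected by the constraint.
        conn.execute(
            "INSERT INTO tcl_summary VALUES (?, ?, ?, ?)", ("abc", 2020, 1, 99.0)
        )
    except sqlite3.IntegrityError as exc:
        print("rejected:", exc)
```

One caveat with this kind of constraint: SQLite, and PostgreSQL by default, treat NULLs as distinct from one another in unique constraints, so rows that differ only by a NULL in a key column would not be caught as duplicates. The sketch avoids this by declaring every key column `NOT NULL`.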