Closed by maxkadel 1 month ago
I think the change of 9,000 rows was a typo in the previous ticket.
The original data contained a lot of duplicates. When I repeated the deduplication using last name, first name, year, pubfile, and academicfile, I got exactly the same result as the previous deduplication. It seems that some duplicates were added over the years in the old database. No unique data has been lost.
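For anyone who wants to re-run this check, here is a minimal sketch of the kind of deduplication described above. It is not the script that was actually used. The column names in `KEY_FIELDS` and the sample rows are hypothetical stand-ins for the real headers and data in the mudd-dbs source table, and it assumes exact matching on the five fields with no normalization.

```python
# Hypothetical column names; the real headers in the source table may differ.
KEY_FIELDS = ("last_name", "first_name", "year", "pubfile", "academicfile")


def dedupe(rows, key_fields=KEY_FIELDS):
    """Keep the first row seen for each combination of the key fields."""
    seen = set()
    unique = []
    for row in rows:
        # Exact match on the key fields (an assumption; no trimming or case folding).
        key = tuple(row.get(field) for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Made-up sample rows, for illustration only.
rows = [
    {"last_name": "Smith", "first_name": "Ann", "year": "1950", "pubfile": "yes", "academicfile": "no"},
    {"last_name": "Smith", "first_name": "Ann", "year": "1950", "pubfile": "yes", "academicfile": "no"},
    {"last_name": "Smith", "first_name": "Ann", "year": "1951", "pubfile": "yes", "academicfile": "no"},
]

unique = dedupe(rows)
print(f"{len(rows)} rows in, {len(unique)} rows out, {len(rows) - len(unique)} removed")
# prints "3 rows in, 2 rows out, 1 removed"
```

Comparing the "removed" count from a run like this against the number reported in the earlier ticket is one way to confirm whether the 9,000 figure was a typo.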
What maintenance needs to be done?
Investigate record number discrepancy. From stakeholder:
Level of urgency
Why is this maintenance needed?
Acceptance criteria
Implementation notes, if any
There was a ticket to remove duplicate rows from this index, described in https://github.com/pulibrary/static-tables/issues/142
We noticed a discrepancy between the number of rows removed in https://github.com/pulibrary/static-tables/issues/142 and the discrepancy that @LynnDurgin references. Check the number of rows in the source table from mudd-dbs and figure out why we're seeing 5,000 fewer records instead of 9,000 fewer.