Description:
TDP should detect the following types of duplicate records, not store them in the db, and return an error message in the feedback report:
Exact duplicate record: this is defined as multiple non-header/non-trailer records in a file that are exactly the same
Partial duplicate record: this is defined as multiple non-header/non-trailer records that share the same values for some key data elements. (this definition will vary by record type)
Logic with examples by section and record type can be found here
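The two definitions above can be illustrated with a minimal sketch. It assumes a parsed record is a plain dict and that the key data elements for a record type are given as a tuple of field names; `classify_duplicate`, `KEY_FIELDS`, and the `cash_amount` field are hypothetical names used only for illustration, not part of the TDP codebase.

```python
from typing import Optional

# Hypothetical key data elements for one record type (assumption for illustration).
KEY_FIELDS = ("record_type", "rpt_month_year", "case_number")


def classify_duplicate(first: dict, second: dict, key_fields: tuple) -> Optional[str]:
    """Return "exact", "partial", or None for a pair of parsed records."""
    # Exact duplicate: every field of the two records matches.
    if first == second:
        return "exact"
    # Partial duplicate: only the key data elements match.
    if all(first.get(name) == second.get(name) for name in key_fields):
        return "partial"
    return None


record_a = {"record_type": "T1", "rpt_month_year": "202401",
            "case_number": "123", "cash_amount": "100"}
record_b = dict(record_a)                        # identical copy
record_c = dict(record_a, cash_amount="250")     # same key, different payload
record_d = dict(record_a, case_number="456")     # different key

exact_result = classify_duplicate(record_a, record_b, KEY_FIELDS)
partial_result = classify_duplicate(record_a, record_c, KEY_FIELDS)
distinct_result = classify_duplicate(record_a, record_d, KEY_FIELDS)
```

Because the key data elements vary by record type, a real implementation would likely look up `key_fields` per record type rather than use a single constant.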
In section 1 files, exact and partial duplicate records are detected, not stored in db, and error message included in the feedback report.
[ ] In section 2 files, exact and partial duplicate records are detected, not stored in db, and error message included in the feedback report.
[ ] In section 3 files, partial duplicate T6 records are detected, not stored in db, and error message included in the feedback report.
[ ] In section 4 files, partial duplicate T7 records are detected, not stored in db, and error message included in the feedback report.
[ ] Duplicate detection logic is applied consistently to TANF, SSP, and Tribal TANF files.
[ ] Testing Checklist has been run and all tests pass
[ ] README is updated, if necessary
Acceptance Criteria: Create a list of functional outcomes that must be achieved to complete this issue
[ ] Spike - investigate an approach, decide how to split remaining work into multiple tickets
Tasks: Create a list of granular, specific work items that must be completed to deliver the desired outcomes of this issue
[ ] Data structure to store all records (rpt_month_year/case_number) during parsing
potential memory risk
possibly store a hash of record_type/rpt_month_year/case_number instead of the full record
[ ] Check each record against the data structure for duplicates
possible second task to revert any unwanted records
[ ] include performance testing (?)
[ ] Run Testing Checklist and confirm all tests pass
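As a rough sketch of the hash idea in the tasks above: instead of holding full records in memory, the parser could keep a set of short digests of record_type/rpt_month_year/case_number. `SeenKeys` and `check_and_add` are hypothetical names, and the 8-byte digest size is an assumption chosen to show the memory trade-off, not a recommendation.

```python
import hashlib


class SeenKeys:
    """Tracks which (record_type, rpt_month_year, case_number) keys were seen."""

    def __init__(self) -> None:
        self._digests = set()

    @staticmethod
    def _digest(record_type: str, rpt_month_year: str, case_number: str) -> bytes:
        # Join with a separator so ("T1", "2024", "011") and ("T1", "20240", "11")
        # do not produce the same input bytes.
        raw = "|".join((record_type, rpt_month_year, case_number)).encode("utf-8")
        # 8 bytes per key keeps memory bounded; collisions are unlikely but possible.
        return hashlib.blake2b(raw, digest_size=8).digest()

    def check_and_add(self, record_type: str, rpt_month_year: str, case_number: str) -> bool:
        """Return True if this key was already seen, otherwise remember it."""
        digest = self._digest(record_type, rpt_month_year, case_number)
        if digest in self._digests:
            return True
        self._digests.add(digest)
        return False

    def __len__(self) -> int:
        return len(self._digests)


seen = SeenKeys()
first_seen = seen.check_and_add("T1", "202401", "123")
second_seen = seen.check_and_add("T1", "202401", "123")
other_seen = seen.check_and_add("T2", "202401", "123")
```

A truncated digest trades a small false-positive risk for lower memory use, which is one of the things the proposed performance testing could measure against storing the full key tuple.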
Notes: Add additional useful information, such as related issues and functionality that isn't covered by this specific issue, and other considerations that will be helpful for anyone reading this
relevant to TANF, Tribal TANF, and SSP section 1 and 2 files
May consider a separate cache class, similar to cat4, but that acts before records are stored/added to the bulk_create obj
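One possible shape for such a cache class is sketched below. Everything here is hypothetical: `PendingRecordCache`, its method names, and the error wording are illustrative only, and the choice to also drop the first occurrence of a duplicated key is an assumption about what "revert any unwanted records" might mean, not a settled requirement.

```python
from dataclasses import dataclass, field


@dataclass
class PendingRecordCache:
    """Holds parsed records until they are safe to hand to a bulk insert."""

    key_fields: tuple
    pending: dict = field(default_factory=dict)    # key -> (line_number, record)
    rejected: set = field(default_factory=set)     # keys already flagged as duplicated
    errors: list = field(default_factory=list)     # messages for the feedback report

    def add(self, record: dict, line_number: int) -> bool:
        """Queue a record; return False if it duplicates an earlier one."""
        key = tuple(record[name] for name in self.key_fields)
        if key in self.rejected:
            self.errors.append(f"Line {line_number}: duplicate record for key {key}")
            return False
        if key in self.pending:
            # Assumption: the earlier record is reverted as well, so neither is stored.
            first_line, _ = self.pending.pop(key)
            self.rejected.add(key)
            self.errors.append(f"Line {first_line}: duplicate record for key {key}")
            self.errors.append(f"Line {line_number}: duplicate record for key {key}")
            return False
        self.pending[key] = (line_number, record)
        return True

    def records_to_store(self) -> list:
        """Records that survived duplicate checks, in insertion order."""
        return [record for _, record in self.pending.values()]


cache = PendingRecordCache(key_fields=("rpt_month_year", "case_number"))
cache.add({"rpt_month_year": "202401", "case_number": "123", "amount": "10"}, 2)
cache.add({"rpt_month_year": "202401", "case_number": "456", "amount": "20"}, 3)
cache.add({"rpt_month_year": "202401", "case_number": "123", "amount": "30"}, 4)
cache.add({"rpt_month_year": "202401", "case_number": "123", "amount": "40"}, 5)
stored = cache.records_to_store()
```

In this sketch only the list returned by `records_to_store()` would be passed on to the bulk insert, so duplicates never reach the database and no separate cleanup query is needed.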
Supporting Documentation: Please include any relevant log snippets/files/screen shots
Doc 1
Doc 2
Open Questions: Please include any questions or decisions that must be made before beginning work or to confidently call this issue complete