Closed. elipe17 closed this pull request 3 months ago.
Attention: Patch coverage is 60.50955% with 62 lines in your changes missing coverage. Please review.
Project coverage is 91.07%. Comparing base (1166030) to head (2226a3e). Report is 1 commit behind head on develop.
I get a duplicate key error and lose all record data when trying to replicate the "double button click" (sequential execution) for which you removed the handling. Just want to highlight the risk there:
[2024-08-09 14:22:13,973: ERROR/ForkPoolWorker-12] Encountered Database exception in parser_task.py:
web-1 | duplicate key value violates unique constraint "parsers_datafilesummary_datafile_id_880a2f4d_uniq"
web-1 | DETAIL: Key (datafile_id)=(4) already exists.
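For reference, a minimal sketch of one defensive pattern that avoids this IntegrityError, assuming a DataFileSummary model whose datafile FK is unique (which the constraint name suggests); the import path is an assumption, not the PR's actual code:

from tdpservice.parsers.models import DataFileSummary  # assumed import path

def summary_for(datafile):
    # get_or_create fetches the existing row on a second (double-click)
    # invocation instead of issuing a second INSERT, so the unique
    # constraint on datafile_id is never violated.
    summary, _created = DataFileSummary.objects.get_or_create(datafile=datafile)
    return summary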
@elipe17 I am still not clear why we need a many-to-many relationship, and I would like to avoid them if possible. There are many things that can go wrong with them, and they can leave junk data in the DB. Maybe if you could elaborate more on why many-to-many is needed I can be convinced!
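For the sake of the discussion, here is roughly what the shape in question looks like in Django; everything except the reparse_meta_models field name (which appears in the notes further down) is an assumption, not the PR's actual models:

from django.db import models

class ReparseMeta(models.Model):
    created_at = models.DateTimeField(auto_now_add=True)
    timeout_at = models.DateTimeField(null=True, blank=True)

class DataFile(models.Model):
    # Many-to-many: one file can be swept up by many reparse runs and each
    # run covers many files. The hidden join table is where orphaned "junk"
    # rows can accumulate if either side is deleted carelessly.
    reparse_meta_models = models.ManyToManyField(
        ReparseMeta, related_name="datafiles", blank=True
    )

A plain ForeignKey would only work if a file could belong to at most one reparse run, which is presumably the trade-off being debated here.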
Both locally and on a11y, I am not getting a finished reparse run. I initially just used -a to gather the dozen or so files I had uploaded, then tried breaking it up by year, with the same results. I waited for logs to indicate no more parsing was happening before initiating the next run. Used task clean prior to building so it's completely fresh. a11y also has a fresh DB due to some issues. Will retry against the raft env.
2024-08-15 11:55:21 [2024-08-15 15:55:21,346: INFO/ForkPoolWorker-52] DataFile parsing started for file ADS.E2J.NDM1.TS01
2024-08-15 11:55:21 2024-08-15 15:55:21,426 DEBUG fields.py::parse_value:L47 : Field: 'tribe_code' at position: [14, 17) is empty.
2024-08-15 11:55:21 2024-08-15 15:55:21,426 DEBUG fields.py::parse_value:L47 : Field: 'tribe_code' at position: [14, 17) is empty.
2024-08-15 11:55:21 2024-08-15 15:55:21,426 DEBUG parse.py::parse_datafile:L46 : Datafile has encrypted fields: True.
2024-08-15 11:55:21 2024-08-15 15:55:21,426 DEBUG parse.py::parse_datafile:L47 : Datafile: {id: 21, filename: ADS.E2J.FTP1.TS06, STT: Alabama (01), S3 location: data_files/2023/Q1/1/Active Case Data/ADS.E2J.FTP1.TS06}, is Tribal: False.
2024-08-15 11:55:21 2024-08-15 15:55:21,426 DEBUG parse.py::parse_datafile:L51 : Program type: TAN, Section: A.
2024-08-15 11:55:21 2024-08-15 15:55:21,427 INFO parse.py::parse_datafile:L95 : Preparser Error -> Rpt Month Year is not valid: Submitted reporting year:2020, quarter:Q4 doesn't match file reporting year:2023, quarter:Q1.
2024-08-15 11:55:21 2024-08-15 15:55:21,427 DEBUG parse.py::bulk_create_errors:L155 : Bulk creating ParserErrors.
2024-08-15 11:55:21 2024-08-15 15:55:21,429 INFO parse.py::bulk_create_errors:L158 : Created 1/1 ParserErrors.
2024-08-15 11:55:21 2024-08-15 15:55:21,439 INFO parser_task.py::parse:L41 : Parsing finished for file -> {id: 21, filename: ADS.E2J.FTP1.TS06, STT: Alabama (01), S3 location: data_files/2023/Q1/1/Active Case Data/ADS.E2J.FTP1.TS06} with status Rejected and 1 errors.
2024-08-15 11:55:21 [2024-08-15 15:55:21,439: INFO/ForkPoolWorker-52] Parsing finished for file -> {id: 21, filename: ADS.E2J.FTP1.TS06, STT: Alabama (01), S3 location: data_files/2023/Q1/1/Active Case Data/ADS.E2J.FTP1.TS06} with status Rejected and 1 errors.
2024-08-15 11:55:21 [2024-08-15 15:55:21,441: INFO/ForkPoolWorker-52] Task tdpservice.scheduling.parser_task.parse[22642997-1f01-4f61-bc74-0cef780d0247] succeeded in 0.10376212500000292s: None
@andrew-jameson the code that handles tracking failed files (think S3 exception we don't catch) or files that exit parsing early due to cat1 errors is in the follow-on PR, since it is required for sequential execution and not general metadata tracking.
Usability change for sysadmins and developers: the Data Files page can now be filtered by a ReparseMeta model object.
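A sketch of how such a filter could be wired into the Django admin, assuming the reparse_meta_models field name from the notes below; the actual admin code in the PR may differ:

from django.contrib import admin
from tdpservice.data_files.models import DataFile  # assumed import path

@admin.register(DataFile)
class DataFileAdmin(admin.ModelAdmin):
    # Exposes a sidebar filter so admins can narrow the Data Files list to
    # the files touched by a single ReparseMeta (i.e. one reparse run).
    list_filter = ("reparse_meta_models",)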
for my own notes (run in a Django shell):
meta7 = ReparseMeta.objects.get(id=7)
datafiles = DataFile.objects.filter(reparse_meta_models=meta7)
# roughly equivalent via the reverse accessor, e.g. meta7.datafiles.all()
# (exact name depends on the M2M's related_name)
for d in datafiles:
    print("{}:{}".format(d.stt, d.fiscal_year))
# files associated with meta7:
Arkansas (05):2024 - Q1 (Oct - Dec)
Arkansas (05):2024 - Q1 (Oct - Dec)
Alabama (01):2024 - Q1 (Oct - Dec)
Chippewa-Cree Indians of the Rocky Boy's Reservation (043):2024 - Q1 (Oct - Dec)
Chippewa-Cree Indians of the Rocky Boy's Reservation (043):2024 - Q1 (Oct - Dec)
Chippewa-Cree Indians of the Rocky Boy's Reservation (043):2024 - Q1 (Oct - Dec)
Chippewa-Cree Indians of the Rocky Boy's Reservation (043):2024 - Q1 (Oct - Dec)
Florida (12):2024 - Q1 (Oct - Dec)
Florida (12):2024 - Q1 (Oct - Dec)
Alabama (01):2024 - Q1 (Oct - Dec)
Arkansas (05):2024 - Q1 (Oct - Dec)
Alabama (01):2024 - Q1 (Oct - Dec)
Per standup today, #3064 and #3065 work is reflected in this PR. I started testing this morning.
@elipe17 @andrew-jameson @jtimpe @raftmsohani I'm currently blocked on testing this PR in the qasp environment. I attempted to reparse this morning for FY2023 Q1 and the operation was killed after the backup was completed. Evidence below ⬇️
2024-08-24 13:54:44,539 INFO clean_and_reparse.py::__backup:L49 : Backup complete! Commencing clean and reparse.
Backup complete! Commencing clean and reparse.
Killed
I then tried another quarter, FY2023 Q2, and couldn't proceed:
vcap@fc474368-25ed-4bfd-51b7-c201:~$ python manage.py clean_and_reparse -y 2023 -q Q2
You have selected to reparse datafiles for FY 2023 and Q2. The reparsed files will NOT be stored in new indices and the old indices
These options will delete and reparse (20) datafiles.
Continue [y/n]? y
The latest ReparseMeta model's (ID: 2) timeout_at field is None. Cannot safely execute reparse, please fix manually.
Worth noting that FY23Q1 has a couple of large files that should generate a lot of errors, so I'd like to see how this operation performs before this runs in prod.
@ADPennington I updated the meta model in qasp so that you can continue testing. The Killed console output indicates to me that the process was killed for some reason. I can't go far enough back in the logs to determine exactly what happened.
@elipe17 latest test notes/questions below ⬇️ I didn't observe anything that needs to be addressed in this ticket; this is mostly for my SA.
- Is there a way to know which source file ID(s) are associated with the difference between the deleted/created counts? It looks like the record count is different after reparsing, which will sometimes be the case when validation is updated, but I imagine we'd also want to be able to investigate files to check whether something went wrong (see below):
- What's the difference between total # of records initial and # of records created? Is one capturing the number of records in the files vs. the number of records in the DB after reparsing?
- Are we replacing the records in the DB or adding new records? After reparsing FY23Q3, I see 6628 TANF T4s and 3314 "new" TANF T4s. I was expecting "new" TANF T4s == "all" TANF T4s for this fiscal period. I'm assuming this is because more than one version of the FY23Q3 file was subject to reparsing? (see below). If true, this is another good justification for why we want to control which versions get reparsed (i.e. most recent 😄)
- Mentioned this async too, so this is just for reference: it would be helpful for admins to know how to "fix manually" when they observe log entries like the following:
The latest ReparseMeta model's (ID: 2) timeout_at field is None. Cannot safely execute reparse, please fix manually.
@ADPennington, see my responses below :).
The record count is/can be different for files that have not been cat4 validated. Since records with cat4 errors don't get serialized to the DB, we can expect the "num records" fields to not always be a one-to-one match, since cat4 is relatively new. We can write a spike ticket to investigate the feasibility of tracking before and after record counts for files; this ticket might be a way for us to get that information. In the interim, I have also written this ticket, which adds some more useful fields. Specifically, tracking cat4 errors before and after the reparse will help illuminate whether it makes sense that the record counts have diverged.
The two fields Total num records initial and Total num records post indicate the total number of records in the DB before and after the reparse event. The Total num records deleted field indicates how many records this reparse event deleted from the DB, and Total num records created indicates how many records were re-created during the reparse event.
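Assuming field names that mirror those labels (an assumption on my part), the four counters should reconcile on a quiet system as follows:

# Illustrative sanity check only; field names are guessed from the labels
# above, and the identity assumes nothing else wrote records mid-reparse.
meta = ReparseMeta.objects.get(id=7)
assert meta.total_num_records_post == (
    meta.total_num_records_initial
    - meta.total_num_records_deleted
    + meta.total_num_records_created
)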
For reparsing, all records associated with the selected files are deleted from the DB and then recreated. No record duplication should be occurring.
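In sketch form, that flow is delete-then-recreate rather than upsert; the records related name is an assumption, though the tdpservice.scheduling.parser_task.parse task path does appear in the logs earlier in this thread:

from tdpservice.scheduling import parser_task  # per the task path in the logs

def clean_and_reparse(datafiles):
    for df in datafiles:
        # Drop every record previously parsed out of this file version...
        df.records.all().delete()
        # ...then re-run the Celery parse task against the stored file so
        # the rows are recreated from scratch, never duplicated.
        parser_task.parse.delay(df.pk)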
See the steps below to "fix manually":
from django.utils import timezone

latest = ReparseMeta.get_latest()
latest.timeout_at = timezone.now()
latest.save()
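These statements assume a Django shell session (python manage.py shell) with ReparseMeta already imported; setting timeout_at to the current time should let the sequential-execution check pass so a new reparse can proceed.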
Per an async conversation with @elipe17, #3096 is the ticket intended to capture more details about cat1 and cat4 errors in data file summaries. Linking just for reference to related ideas 😄
Summary of Changes
-a enforces new indices and that is the only time they are recreated.
How to Test
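For reference, the invocations exercised in this thread (flag semantics as described in the comments above):

python manage.py clean_and_reparse -a             # reparse all datafiles; -a enforces new indices
python manage.py clean_and_reparse -y 2023 -q Q2  # reparse one fiscal year/quarter into the existing indices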
Deliverables
More details on how deliverables herein are assessed are included here.
Deliverable 1: Accepted Features
Checklist of ACs:
- clean_and_reparse command
Deliverable 2: Tested Code
- Frontend coverage: (see CodeCov Report comment in PR)
- Backend coverage: (see CodeCov Report comment in PR)
Deliverable 3: Properly Styled Code
Deliverable 4: Accessible
- Did a review by iamjolly and ttran-hub using Accessibility Insights reveal any errors introduced in this PR?
Deliverable 5: Deployed
Deliverable 6: Documented
Deliverable 7: Secure
Deliverable 8: User Research
Research product(s) clearly articulate(s):