GSA-TTS / FAC

GSA's Federal Audit Clearinghouse

[ADR] Handling Incorrect Audit Reports During Migration #3809

Closed sambodeme closed 3 months ago

sambodeme commented 4 months ago

Areas of impact

Related documents/links

Context

As we migrate audit reports from Census table files to the GSA/FAC database, we encounter a subset of reports with data validation issues that prevent their easy curation during the migration process. Because these issues are complex and varied, we need a strategy in which these reports are migrated "as is", without validation or correction at first. This preserves all records intact for subsequent analysis and possible rectification.

Decision

We propose to introduce two new Django models to handle these specific cases effectively:

  1. InvalidAuditRecord:

    • This model acts similarly to the existing MigrationInspectionRecord, but instead of holding change records, it holds records that have been migrated as is, without validation or changes.
    • The model will record the basic metadata of each report and the specific reasons it was migrated without validation.
  2. IssueDescriptionRecord:

    • This model will capture detailed descriptions of the issues associated with each record, explaining why the reports could not be validated or corrected during the initial migration.
    • Each InvalidAuditRecord can have one or more associated IssueDescriptionRecord entries, establishing a parent-child relationship.
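The parent-child relationship described above can be pictured with a plain-Python sketch that does not depend on Django. The `issues` attribute and the `add_issue` helper are hypothetical and are not part of the proposed models, and the sample values are made up for illustration. In Django the link would more likely be expressed with a relational field such as a ForeignKey, which this ADR does not specify.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IssueDescriptionRecord:
    # Field names follow the proposed IssueDescriptionRecord model.
    issue_detail: str
    issue_tag: str
    skipped_validation_method: str


@dataclass
class InvalidAuditRecord:
    # Basic metadata, named after the proposed model's text fields.
    audit_year: str
    dbkey: str
    report_id: str
    # Hypothetical container for child records; not in the ADR's model.
    issues: List[IssueDescriptionRecord] = field(default_factory=list)

    def add_issue(self, issue: IssueDescriptionRecord) -> None:
        """Attach one issue description to this record."""
        self.issues.append(issue)


# Illustrative values only; none of these come from real Census data.
record = InvalidAuditRecord(audit_year="2022", dbkey="12345", report_id="example-report-id")
record.add_issue(
    IssueDescriptionRecord(
        issue_detail="Example: a value could not be validated.",
        issue_tag="example_tag",
        skipped_validation_method="example_validation_method",
    )
)
```

One parent record can hold any number of issue descriptions, which matches the "one or more" wording above.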

Proposed Model Structures

InvalidAuditRecord

from django.db import models
from django.utils import timezone


class InvalidAuditRecord(models.Model):
    # Basic metadata identifying the migrated report
    audit_year = models.TextField(blank=True, null=True)
    dbkey = models.TextField(blank=True, null=True)
    report_id = models.TextField(blank=True, null=True)
    run_datetime = models.DateTimeField(default=timezone.now)
    finding_text = models.JSONField(blank=True, null=True)  # Described below
    additional_uei = models.JSONField(blank=True, null=True)
    additional_ein = models.JSONField(blank=True, null=True)
    finding = models.JSONField(blank=True, null=True)
    federal_award = models.JSONField(blank=True, null=True)
    cap_text = models.JSONField(blank=True, null=True)
    note = models.JSONField(blank=True, null=True)
    passthrough = models.JSONField(blank=True, null=True)
    general = models.JSONField(blank=True, null=True)
    secondary_auditor = models.JSONField(blank=True, null=True)

IssueDescriptionRecord

class IssueDescriptionRecord(models.Model):
    issue_detail = models.TextField()
    issue_tag = models.TextField()
    skipped_validation_method = models.TextField()

JSONField Content Structure for InvalidAuditRecord

{
    "census_data": [{
            "value": "some value",
            "column": "some column name"
        }
    ],
    "issue_tag": "some_tag_name"
}
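As a rough illustration of how a value with this shape might be assembled before being stored in one of the JSONField columns, here is a minimal sketch. The `build_field_payload` helper and the `EXAMPLE_COLUMN` name are hypothetical and do not come from the FAC codebase; only the `census_data`, `value`, `column`, and `issue_tag` keys are taken from the structure above.

```python
import json
from typing import Any, Dict, List, Tuple


def build_field_payload(pairs: List[Tuple[str, str]], issue_tag: str) -> Dict[str, Any]:
    """Return a dict shaped like the JSONField content shown above.

    `pairs` is a list of (column name, raw value) tuples taken as is
    from the source data, with no validation applied.
    """
    return {
        "census_data": [
            {"value": value, "column": column} for column, value in pairs
        ],
        "issue_tag": issue_tag,
    }


# Hypothetical input; the column name is a placeholder.
payload = build_field_payload([("EXAMPLE_COLUMN", "some value")], "some_tag_name")
serialized = json.dumps(payload, indent=4)
```

Because the payload contains only strings, lists, and dicts, it serializes to JSON without any custom encoder.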

Consequences

By adopting this two-model approach, we gain a structured and transparent method for managing the migration of problematic audit reports, ensuring accountability and traceability throughout the process.

phildominguez-gsa commented 3 months ago

Thoughts on changing UnvalidatedAuditRecord to InvalidAuditRecord? It seems more clear to me that validations were run, but it failed to validate.

danswick commented 3 months ago

Discussed during dev huddle. Let's consider this accepted. Please feel free to open a PR now, @sambodeme!