CDCgov / prime-reportstream

ReportStream is a public intermediary tool for delivery of data between different parts of the healthcare ecosystem.
https://reportstream.cdc.gov
Creative Commons Zero v1.0 Universal

Zip Code Validation #4157

Closed loripusey closed 2 years ago

loripusey commented 2 years ago

Problem Statement

High Level Issue: The CSV uploader returns confusing error messages that do not help the user understand the problem with their file.

Issue scope: All-in-One Health Sender in Staging using the all-in-one-health-ca-covid-19.schema

Example: The all-in-one-health-ca-covid-19 schema uses the mapper function zipCodeToCounty to create the element patient_county. For each record, the mapper passes the zip code value supplied in the CSV file to the zip-code-data.csv table and looks up the county that corresponds to that zip code. The cardinality of patient_county in the COVID-19 base schema is set to One, meaning that the field is required. The cardinality of patient_zip_code is Zero_Or_One, meaning that it is not required and will not produce errors or warnings even if the value is blank. Because patient_zip_code is used to look up patient_county (a required field), patient_zip_code indirectly becomes a required field.
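The interaction between the two cardinality settings can be sketched in a few lines. This is a minimal illustration in Python, not ReportStream's actual code: the function names, the in-memory table standing in for zip-code-data.csv, and the sample rows are all assumptions made for the example.

```python
# Hypothetical stand-in for the zip-code-data.csv lookup table (sample rows only).
ZIP_TO_COUNTY = {"94110": "San Francisco", "01002": "Hampshire"}


def zip_code_to_county(zip_code):
    """Return the county for a zip code, or None when the lookup fails."""
    return ZIP_TO_COUNTY.get((zip_code or "").strip())


def validate_row(row):
    """Collect errors for one row using only the cardinality rules described above."""
    errors = []
    # patient_zip_code is Zero_Or_One: a blank value raises no error by itself.
    county = zip_code_to_county(row.get("patient_zip_code"))
    # patient_county is One (required): a failed lookup surfaces here instead,
    # so the user sees a county error even though the zip code was the cause.
    if not county:
        errors.append("Blank value for element")
    return errors
```

In this sketch a blank zip code and a zip code missing from the table produce the same message, which is the behavior the issue describes.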

As a result, when a zip code is blank or invalid, the error returned to the user is a MissingFieldMessage for the patient_county field. The error message “Blank Value for Element” is returned for the rows that have an invalid zip code under the following identified conditions:

When data is entered or modified manually in a spreadsheet editor like Excel, the only way to edit zip codes starting with “0” is to change the format of the value to text, or to put an apostrophe (') before the zip code. Otherwise, Excel treats a zip code starting with “0” as a number and removes the leading “0” from the value. This is particularly problematic in states and territories where all or most zip codes start with “0.”

These include Maine, Vermont, New Hampshire, Massachusetts, Rhode Island, Connecticut, New Jersey, and Puerto Rico.
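For illustration, the leading-zero loss can be undone before a lookup by padding short numeric values back to five digits. The sketch below is a hypothetical pre-processing step in Python, not something ReportStream is known to do; the function name and the choice to accept three to five digits (with an optional ZIP+4 suffix) are assumptions.

```python
import re


def restore_leading_zeros(zip_code):
    """Pad a zip code that lost its leading zeros back to five digits.

    Values that do not look like a zip code are returned unchanged
    (apart from trimming) so that later validation can still reject them.
    """
    # Drop surrounding whitespace and any apostrophe typed to force text format.
    value = zip_code.strip().lstrip("'")
    if re.fullmatch(r"\d{3,5}", value):
        return value.zfill(5)
    # ZIP+4 form, e.g. 1002-1234 becomes 01002-1234.
    match = re.fullmatch(r"(\d{3,5})-(\d{4})", value)
    if match:
        return match.group(1).zfill(5) + "-" + match.group(2)
    return value
```

For example, `restore_leading_zeros("1002")` returns `"01002"`, and a Puerto Rico value reduced to `"601"` is restored to `"00601"`.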

Detailed Issues Identified:

If the zip code value of a single record in a CSV file is blank, is not in the zip-code-data.csv table, or is in text format, the user receives the message “Error: File not Accepted.” In the requested edits, the user receives the message “Blank value for element” for the corresponding row(s) where there was a zip code issue.

It appears to be impossible to get ReportStream to accept a file in which the user has had to manually modify a zip code starting with “0” in a spreadsheet editor.

The message “Blank value for element” does not tell the user which element is blank.

When a zip code is invalid or is in text format, the message returned is “Blank value for element,” even though there is no blank value in the CSV file.

A single record with a blank or invalid zip code causes the entire file to fail. This would be frustrating to the user, especially if it is unclear what changes the user needs to make. Their only resolution may be to delete the affected row(s) in order to get the file to submit.
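To make the gap concrete, here is one way messages could name the element and separate a blank zip code from one that is not in the table. This is only a sketch of the idea in Python: the message wording, the function names, and the small set standing in for zip-code-data.csv are hypothetical and do not reflect ReportStream's actual messages or code.

```python
# Hypothetical stand-in for the zip codes present in zip-code-data.csv.
KNOWN_ZIPS = {"94110", "01002"}


def describe_zip_problem(row_number, zip_code):
    """Return a message naming the row and element, or None if the zip is usable."""
    value = (zip_code or "").strip()
    if not value:
        return (f"Row {row_number}: patient_zip_code is blank, "
                "so patient_county cannot be looked up")
    if value not in KNOWN_ZIPS:
        return (f"Row {row_number}: patient_zip_code '{value}' "
                "was not found in the zip code table")
    return None


def check_rows(zip_codes):
    """Check every row and return all problems instead of stopping at the first."""
    problems = []
    for row_number, zip_code in enumerate(zip_codes, start=1):
        message = describe_zip_problem(row_number, zip_code)
        if message:
            problems.append(message)
    return problems
```

With messages like these, a user could tell from the output alone whether to fill in a missing value or to correct a zip code that lost its leading zero.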

Refer to Mike's document: https://cdc.sharepoint.com/:w:/r/teams/USDSatCDC/_layouts/15/doc.aspx?sourcedoc=%7B87bf3d7b-1b87-4ee2-ad6c-61d262666153%7D&action=edit

Refer to Rick Hood's email for additional info: https://app.zenhub.com/files/304423150/8a41a04f-3481-4aea-b987-5949cfb768c3/download

To Do

loripusey commented 2 years ago

@rachelhanster I created an Epic just to handle zip code validation and moved the SPIKE ticket here; we probably need to evaluate @TomNUSDS's results from the SPIKE and decide what work we need to do to move us forward

rachelhanster commented 2 years ago

Closing this epic; I moved a couple of tickets to https://app.zenhub.com/workspaces/experience-607d9d5e68b95200150fec37/issues/cdcgov/prime-reportstream/5065