cfpb / sbl-project

Project management repo for the SBL project
Creative Commons Zero v1.0 Universal

Data Validator Parsing "Bug" #198

Closed sthomas93 closed 6 months ago

sthomas93 commented 6 months ago

While running the Phase 1 & 2 files through the validator, we noticed that its syntax checks fail numeric columns when the values are entered as floats or contain commas.

Float values

Commas

Additional Parsing

jcadam14 commented 6 months ago

The Float Values case is similar to the "Additional Parsing" one. Because the FIG calls out that values of 1, 988, 999, etc. are expected, passing in a value of 1.00 for that field is not allowed. The two cases are related in that we have very strictly defined what is allowed for certain fields: 1 and 1.00 are the same numerically, but under an enumerated list of acceptable values they are not the same entry.
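To illustrate the distinction, here is a minimal sketch contrasting an enumerated check with a numeric one. The set of codes and both function names are hypothetical and only for illustration; the real lists live in the FIG, and this is not the validator's actual code.

```python
# Hypothetical set of allowed codes; the real lists are defined in the FIG.
ALLOWED_CODES = {"1", "988", "999"}


def is_allowed_code(raw: str) -> bool:
    """Enumerated check: the submitted text must be exactly one of the codes."""
    return raw in ALLOWED_CODES


def is_allowed_numerically(raw: str) -> bool:
    """Looser alternative: accept anything numerically equal to a code."""
    try:
        value = float(raw)
    except ValueError:
        return False
    return any(value == float(code) for code in ALLOWED_CODES)


for entry in ["1", "1.00", "988", "2"]:
    print(entry, is_allowed_code(entry), is_allowed_numerically(entry))
```

With the enumerated check, "1.00" is rejected even though it is numerically equal to 1, which matches the behavior described above.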

jcadam14 commented 6 months ago

For Commas, it's easy enough to remove the commas when checking whether something is a number or is gt, lt, or gte an allowed value. However, that approach would allow someone to enter '9,99,9999,9.00', and simply stripping out the commas would give 99999999.00, even though the original value should of course not be accepted as valid. Rejecting such input could be achieved easily enough by beefing up the regex, which currently just looks for digits optionally followed by a decimal point and two digits.

Regex should look something like: ^\d{1,3}(,\d{3})*(\.\d{2})?$
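As a rough sketch of how that pattern behaves compared with stripping commas, assuming Python's `re` module (the function names here are made up for illustration and are not part of the validator):

```python
import re

# The pattern proposed above: groups of three digits separated by commas,
# optionally followed by a decimal point and exactly two digits.
GROUPED_NUMBER = re.compile(r"^\d{1,3}(,\d{3})*(\.\d{2})?$")


def naive_is_number(raw: str) -> bool:
    """Strip every comma, then try to parse what is left as a float."""
    try:
        float(raw.replace(",", ""))
    except ValueError:
        return False
    return True


def is_well_grouped(raw: str) -> bool:
    """Accept only values whose commas sit at proper thousands boundaries."""
    return GROUPED_NUMBER.fullmatch(raw) is not None


for entry in ["9,99,9999,9.00", "1,234,567.00", "999", "1000"]:
    print(entry, naive_is_number(entry), is_well_grouped(entry))
```

The naive check accepts '9,99,9999,9.00', while the pattern rejects it. Note that the pattern as written also rejects an ungrouped value such as 1000, so it would likely need an alternative branch for plain digits if both styles were meant to be accepted.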

sthomas93 commented 6 months ago

Per team discussion: the validator is parsing as intended by not accepting commas or floats in data entries. Next steps are for the BAs to update the corresponding test files and prepare language for the Tips & Tricks email.