ut-effectiveness / utValidateR

Validate Data Elements in Warehouse and Reporting Pipelines

Rule S16a test data discrepancy #9

Open mhagemann-eab opened 2 years ago

mhagemann-eab commented 2 years ago
```r
library(utValidateR)
testdf <- get_test_data(file = "student")
#> Warning in check_expected_values(., colname = expected_value_column): The following rows of test data have bad expected values and were removed: 62, 63, 64, 65, 66
#> Warning in check_rule_names(., checklist = checklist, colname = rule_name_column): The following rows of test data have bad rule names and were removed: 25, 42, 43
knitr::kable(compare_rule_output("S16a", testdf = testdf))
```
| csv_row | rule | description | expr | primary_major_cip_code | expected | actual |
|---|---|---|---|---|---|---|
| 2 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 93947 | pass | fail |
| 3 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 42064 | pass | fail |
| 4 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 43531 | pass | fail |
| 5 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 53795 | pass | fail |
| 6 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 33969 | pass | fail |
| 7 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 48213 | pass | fail |
| 8 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 12993 | pass | fail |
| 9 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 37268 | pass | fail |
| 10 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 58956 | pass | fail |
| 11 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 62747 | pass | fail |
| 12 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 42030 | pass | fail |
| 13 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 15670 | pass | fail |
| 14 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 14980 | pass | fail |
| 15 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 32514 | pass | fail |
| 16 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 14628 | pass | fail |
| 17 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 80708 | pass | fail |
| 18 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 6414 | pass | fail |
| 19 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 6087 | pass | fail |
| 20 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 51719 | pass | fail |
| 21 | S16a | missing cip code | is_valid_values(primary_major_cip_code, valid_cip_codes) | 56951 | pass | fail |

Created on 2022-09-20 by the reprex package (v2.0.1)

mhagemann-eab commented 2 years ago

This one could be hard to resolve, since the rule checks against a list of valid CIP codes and the test data generates its codes randomly. I suggest treating this as low-priority; I'll see if I can just skip the tests for this rule.
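
To illustrate the mismatch, here is a minimal sketch in base R. It assumes the rule boils down to a set-membership check; `is_member()` is a hypothetical stand-in rather than the package's `is_valid_values()`, and the codes in `valid_codes` are made up for illustration, not taken from the real list.

```r
# Hypothetical stand-in for the package's membership check; the real
# is_valid_values() may behave differently (e.g. for NA or formatting).
is_member <- function(x, valid) x %in% valid

# Made-up valid list, for illustration only (not the real CIP list).
valid_codes <- c("110701", "260101", "520201")

set.seed(1)
# Random 5-digit codes, similar in shape to those in the test data.
random_codes <- as.character(sample(10000:99999, 5))
# Codes drawn from the valid list are valid by construction.
sampled_codes <- sample(valid_codes, 5, replace = TRUE)

# Random codes fail the check; sampled codes pass it.
random_ok <- is_member(random_codes, valid_codes)
sampled_ok <- is_member(sampled_codes, valid_codes)
print(random_ok)
print(sampled_ok)
```

If skipping the tests turns out to be undesirable, drawing the "pass" rows from the valid list in this way might be an alternative.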

mhagemann-eab commented 2 years ago

Rule S37a is the same issue.

andreabringhurst commented 2 years ago

I made changes in the CSV file; these should pass now.

mhagemann-eab commented 2 years ago

Much better now, but two are still failing; both have 999999 as the CIP code. Should this be a valid code? If so, I can add it to the list of valid CIP codes.
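
For reference, a rough base-R sketch of how the leftover codes could be found and, if 999999 is deemed valid, added. The vectors here are made-up placeholders, not the package's actual `valid_cip_codes` or the real test data.

```r
# Made-up placeholders for illustration; not the real list or test data.
valid_codes <- c("110701", "260101")
test_codes <- c("110701", "999999", "260101", "999999")

# Codes that appear in the test data but not in the valid list.
missing_codes <- setdiff(unique(test_codes), valid_codes)
print(missing_codes)

# Only if 999999 is confirmed as a valid code: extend the list.
updated_codes <- union(valid_codes, missing_codes)
print(updated_codes)
```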

andreabringhurst commented 2 weeks ago

I don't think this has been resolved. I was creating a file of bad data so we could test whether the rule was working. We will need to determine whether I need to go in and work on this file.