digital-preservation / csv-validator

CSV Validation Tool and API (CSV Schema RI)
http://digital-preservation.github.io/csv-validator
Mozilla Public License 2.0

Custom Validation for CSV Fields #502

Open megin1989 opened 3 months ago

megin1989 commented 3 months ago

How can I perform custom validation on a CSV file?

Example: validate that every combination of the ENCOUNTER_CLASS_CODE and ENCOUNTER_CLASS_CODE_DESCRIPTION fields in the primary CSV file exactly matches a pair defined in the validation (.csvs) schema file. The schema I have so far uses regular expressions:

version 1.0
@totalColumns 2
@separator ','
ENCOUNTER_CLASS_CODE: regex("(?i)^(AMB|EMER|FLD|HH|IMP|ACUTE|NONAC|OBSENC|PRENC|SS|VR)$")
ENCOUNTER_CLASS_CODE_DESCRIPTION: regex("(?i)^(ambulatory|emergency|field|home health|inpatient encounter|inpatient acute|inpatient non-acute|observation encounter|pre-admission|short stay|virtual)$")

DavidUnderdown commented 3 months ago

There's no need to use regex. I would use the any rule on ENCOUNTER_CLASS_CODE, and a switch rule combined with is on ENCOUNTER_CLASS_CODE_DESCRIPTION. Use version 1.1 or higher rather than 1.0:

version 1.1
@totalColumns 2
@separator ','
ENCOUNTER_CLASS_CODE: any("AMB","EMER","FLD","HH","IMP","ACUTE","NONAC","OBSENC","PRENC","SS","VR")
ENCOUNTER_CLASS_CODE_DESCRIPTION: switch(($ENCOUNTER_CLASS_CODE/is("AMB"),is("ambulatory")),($ENCOUNTER_CLASS_CODE/is("EMER"),is("emergency")),($ENCOUNTER_CLASS_CODE/is("FLD"),is("field")),($ENCOUNTER_CLASS_CODE/is("HH"),is("home health")),($ENCOUNTER_CLASS_CODE/is("IMP"),is("inpatient encounter")),($ENCOUNTER_CLASS_CODE/is("ACUTE"),is("inpatient acute")),($ENCOUNTER_CLASS_CODE/is("NONAC"),is("inpatient non-acute")),($ENCOUNTER_CLASS_CODE/is("OBSENC"),is("observation encounter")),($ENCOUNTER_CLASS_CODE/is("PRENC"),is("pre-admission")),($ENCOUNTER_CLASS_CODE/is("SS"),is("short stay")),($ENCOUNTER_CLASS_CODE/is("VR"),is("virtual")))

That should do what you're after, if I've understood you correctly.

megin1989 commented 3 months ago

Thank you for the information. I also need the validation to be case-insensitive, because the values in the CSV files I receive may not use consistent capitalisation.
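
To make the requirement concrete, here is a minimal sketch of a case-insensitive pair check done outside the validator, in Python. It is not part of csv-validator: the pair table is copied from the schema rules discussed in this thread, while the function name find_invalid_rows and the sample data are hypothetical. CSV Schema also defines an @ignoreCase column directive, but whether it covers the conditions inside a switch rule is not confirmed in this thread.

```python
import csv
import io

# Allowed code -> description pairs, copied from the schema rules in this thread.
ALLOWED_PAIRS = {
    "AMB": "ambulatory",
    "EMER": "emergency",
    "FLD": "field",
    "HH": "home health",
    "IMP": "inpatient encounter",
    "ACUTE": "inpatient acute",
    "NONAC": "inpatient non-acute",
    "OBSENC": "observation encounter",
    "PRENC": "pre-admission",
    "SS": "short stay",
    "VR": "virtual",
}

# Case-folded lookup set, so "amb"/"Ambulatory" matches "AMB"/"ambulatory".
ALLOWED = {(code.casefold(), desc.casefold()) for code, desc in ALLOWED_PAIRS.items()}


def find_invalid_rows(csv_text):
    """Return (line_number, code, description) for each row whose pair is not allowed.

    Hypothetical helper: assumes the CSV has a header row naming both columns.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    invalid = []
    for row in reader:
        # Treat a missing field as an empty string rather than failing.
        code = row.get("ENCOUNTER_CLASS_CODE") or ""
        desc = row.get("ENCOUNTER_CLASS_CODE_DESCRIPTION") or ""
        if (code.casefold(), desc.casefold()) not in ALLOWED:
            invalid.append((reader.line_num, code, desc))
    return invalid


if __name__ == "__main__":
    # Made-up sample data: the last row pairs a valid code with the wrong description.
    sample = (
        "ENCOUNTER_CLASS_CODE,ENCOUNTER_CLASS_CODE_DESCRIPTION\n"
        "amb,Ambulatory\n"
        "HH,home health\n"
        "EMER,virtual\n"
    )
    print(find_invalid_rows(sample))
```

Checking the pair as a whole, rather than each column separately, is what distinguishes this from the two independent regex rules: a valid code combined with a valid but mismatched description is still reported.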