cldf / csvw

CSV on the web

Validation Support #78

Closed megin1989 closed 4 months ago

megin1989 commented 4 months ago

Hi Team,

I want to add a conditional requirement for the ANSWER_CODE field using JSON Schema: ANSWER_CODE should not be required when QUESTION_CODE_DESCRIPTION is one of certain values.

Here's how I am trying to achieve this:

  1. Use the if, then, and else keywords to conditionally require the ANSWER_CODE field based on the value of QUESTION_CODE_DESCRIPTION.
  2. Ensure that ANSWER_CODE is only required when QUESTION_CODE_DESCRIPTION is not one of the specified values.

Is this correct, and is it applicable here? Please check the "validation" block in the schema below.

{
  "@context": "http://www.w3.org/ns/csvw",
  "tables": [
    {
      "url": "data/SCREENING_qcs-test-20240603-testcase4.csv",
      "tableSchema": {
        "columns": [
          {
            "name": "PAT_MRN_ID",
            "titles": "PAT_MRN_ID",
            "datatype": "string",
            "required": true,
            "constraints": {
              "unique": true
            }
          },
          {
            "name": "FACILITY_ID",
            "titles": "FACILITY_ID",
            "datatype": "string",
            "required": true,
            "constraints": {
              "unique": true
            }
          },   
          {
            "name": "ENCOUNTER_TYPE_CODE_SYSTEM",
            "titles": "ENCOUNTER_TYPE_CODE_SYSTEM",
            "datatype": "string",
            "enum": ["SNOMED-CT", "SNOMED", "http://snomed.info/sct"]
          },
          {
            "name": "SCREENING_STATUS_CODE",
            "titles": "SCREENING_STATUS_CODE",
            "datatype": "string",
            "required": true
          },
          {
            "name": "SCREENING_STATUS_CODE_DESCRIPTION",
            "titles": "SCREENING_STATUS_CODE_DESCRIPTION",
            "datatype": "string"
          },
          {
            "name": "SCREENING_STATUS_CODE_SYSTEM",
            "titles": "SCREENING_STATUS_CODE_SYSTEM",
            "datatype": "string",
            "required": true,
            "enum": ["http://hl7.org/fhir/observation-status"]
          }, 
          {
            "name": "QUESTION_CODE",
            "titles": "QUESTION_CODE",
            "datatype": "string",
            "required": true
          },
          {
            "name": "QUESTION_CODE_DESCRIPTION",
            "titles": "QUESTION_CODE_DESCRIPTION",
            "datatype": "string",
            "required": true
          },
          {
            "name": "QUESTION_CODE_SYSTEM_NAME",
            "titles": "QUESTION_CODE_SYSTEM_NAME",
            "datatype": "string",
            "required": true,
            "enum": ["LN", "LOINC", "http://loinc.org"]
          }, 
          {
            "name": "ANSWER_CODE",
            "titles": "ANSWER_CODE",
            "datatype": "string"
          },
          {
            "name": "ANSWER_CODE_DESCRIPTION",
            "titles": "ANSWER_CODE_DESCRIPTION",
            "datatype": "string",
            "required": true
          },
          {
            "name": "ANSWER_CODE_SYSTEM_NAME",
            "titles": "ANSWER_CODE_SYSTEM_NAME",
            "datatype": "string",
            "required": true,
            "enum": ["LN", "LOINC", "http://loinc.org"]
          }  
        ],
        "validation": {
          "$schema": "http://json-schema.org/draft-07/schema#",
          "type": "object",
          "properties": {
            "ANSWER_CODE": {
              "if": {
                "properties": {
                  "QUESTION_CODE_DESCRIPTION": {
                    "enum": [
                      "TOTAL SAFETY SCORE",
                      "CALCULATED PHYSICAL ACTIVITY SCORE",
                      "CALCULATED MENTAL HEALTH SCORE"
                    ]
                  }
                }
              },
              "then": {
                "not": {
                  "required": ["ANSWER_CODE"]
                }
              },
              "else": {
                "required": ["ANSWER_CODE"]
              }
            }
          }
        }
      },
      "dialect": {
        "delimiter": "|"
      }
    }
  ]
}

xrotwang commented 4 months ago

CSVW metadata documents are JSON-LD documents and as such can contain additional properties, not specified by CSVW. Thus, as long as the result is JSON-LD, you can add all sorts of stuff. The csvw package will only support validation rules specified in CSVW, though.
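
Since the csvw package will not enforce the extra "validation" block, the conditional rule has to be checked in your own code. A minimal Python sketch of that check follows. It assumes each CSV row is already available as a dict mapping column names to strings, and it treats ANSWER_CODE as optional (not forbidden) for the listed score questions, which is an assumption about the intent. The function name `answer_code_errors` is hypothetical and not part of csvw.

```python
# Values of QUESTION_CODE_DESCRIPTION for which ANSWER_CODE may be empty
# (copied from the "validation" block posted above).
SCORE_QUESTIONS = {
    "TOTAL SAFETY SCORE",
    "CALCULATED PHYSICAL ACTIVITY SCORE",
    "CALCULATED MENTAL HEALTH SCORE",
}


def answer_code_errors(row):
    """Return a list of error messages for one row (dict of column -> str).

    Hypothetical helper: this is custom processing code, not a csvw feature.
    """
    description = (row.get("QUESTION_CODE_DESCRIPTION") or "").strip()
    answer_code = (row.get("ANSWER_CODE") or "").strip()
    if description in SCORE_QUESTIONS:
        # Assumption: ANSWER_CODE is optional here, not forbidden.
        return []
    if not answer_code:
        return ["ANSWER_CODE is required for question %r" % description]
    return []
```

Note that in JSON Schema draft-07, keywords placed under "properties" > "ANSWER_CODE" apply to the ANSWER_CODE value itself, not to the whole row, so a row-level rule like the one sketched here would normally sit at the top level of the schema.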

megin1989 commented 4 months ago

Can I use the transform method in CSVW to apply validation rules defined in a separate JSON file to validate my CSV data? If so, how can I correctly structure this?

      "url": "data/SCREENING_qcs-test-20240603-testcase4.csv",
      "tableSchema": {
        "columns": [

            {"name": "SCREENING_CODE", "titles": "SCREENING_CODE", "datatype": {"base": "string", "format": "^[A-Z]+$"}, "required": true, "enum": ["96777-8", "97023-6"]},
            {"name": "SCREENING_CODE_DESCRIPTION", "titles": "SCREENING_CODE_DESCRIPTION", "datatype": "string", "required": true, "enum": ["Accountable health communities (AHC) health-related social needs (HRSN) supplemental questions", "accountable health communities (AHC) health-related social needs (HRSN) supplemental questions", "Accountable health communities (AHC) health-related social needs screening (HRSN) tool", "accountable health communities (AHC) health-related social needs screening (HRSN) tool", "NYS AHC HRSN screening"]},
            {"name": "SCREENING_CODE_SYSTEM_NAME", "titles": "SCREENING_CODE_SYSTEM_NAME", "datatype": "string", "required": true, "enum": ["LN", "ln", "LOINC", "loinc", "http://loinc.org", "NYS standard", "NYS Standard"]},
            {"name": "RECORDED_TIME", "titles": "RECORDED_TIME", "datatype": "datetime", "required": true, "pattern": "([0-9]{4})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])T([01][0-9]|2[0-3]):([0-5][0-9]):([0-5][0-9]|60)(\\.[0-9]+)?(Z|(\\+|-)([01][0-9]|2[0-3]):([0-5][0-9]))"},
            {"name": "QUESTION_CODE", "titles": "QUESTION_CODE", "datatype": {"base": "string", "format": "^[A-Z]+$"} , "required": true},
            {"name": "QUESTION_CODE_DESCRIPTION", "titles": "QUESTION_CODE_DESCRIPTION", "datatype": "string", "required": true},
            {"name": "QUESTION_CODE_SYSTEM_NAME", "titles": "QUESTION_CODE_SYSTEM_NAME", "datatype": "string",  "enum": ["LN", "LOINC", "http://loinc.org"]},
            {"name": "UCUM_UNITS", "titles": "UCUM_UNITS", "datatype": "string"},
            {"name": "SDOH_DOMAIN", "titles": "SDOH_DOMAIN", "datatype": "string", "required": true},
            {"name": "PARENT_QUESTION_CODE", "titles": "PARENT_QUESTION_CODE", "datatype": "string"},
            {"name": "ANSWER_CODE", "titles": "ANSWER_CODE", "datatype": {"base": "string", "format": "^[A-Z]+$"}, "required": true},
            {"name": "ANSWER_CODE_DESCRIPTION", "titles": "ANSWER_CODE_DESCRIPTION", "datatype": "string", "required": true},
            {"name": "ANSWER_CODE_SYSTEM_NAME", "titles": "ANSWER_CODE_SYSTEM_NAME", "datatype": "string", "required": true,  "enum": ["LN", "LOINC", "http://loinc.org"]},
            {"name": "POTENTIAL_NEED_INDICATED", "titles": "POTENTIAL_NEED_INDICATED", "datatype": "string", "required": true, "enum": ["Yes", "No", "NA", "yes", "no", "na"]}
          ],

"transform": [ { "targetFormat": "json-schema", "source": "conditional_schema.json" } ]

The contents of conditional_schema.json are below:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "QUESTION_CODE_DESCRIPTION": { "type": "string" },
    "ANSWER_CODE": { "type": "string" },
    "ANSWER_CODE_SYSTEM_NAME": { "type": "string" },
    "QUESTION_CODE": { "type": "string" },
    "QUESTION_CODE_SYSTEM_NAME": { "type": "string" },
    "SCREENING_CODE_DESCRIPTION": { "type": "string" },
    "SCREENING_CODE_SYSTEM_NAME": { "type": "string" },
    "SCREENING_CODE": { "type": "string" }
  },
  "required": ["QUESTION_CODE_DESCRIPTION"],
  "if": {
    "properties": {
      "QUESTION_CODE_DESCRIPTION": {
        "enum": [
          "TOTAL SAFETY SCORE",
          "CALCULATED PHYSICAL ACTIVITY SCORE",
          "CALCULATED MENTAL HEALTH SCORE",
          "total safety score",
          "calculated physical activity score",
          "calculated mental health score"
        ]
      }
    }
  },
  "then": {
    "properties": {
      "ANSWER_CODE": { "type": "string" },
      "ANSWER_CODE_SYSTEM_NAME": { "type": "string" },
      "QUESTION_CODE": { "type": "string" },
      "QUESTION_CODE_SYSTEM_NAME": { "type": "string" },
      "SCREENING_CODE_DESCRIPTION": { "type": "string" },
      "SCREENING_CODE_SYSTEM_NAME": { "type": "string" },
      "SCREENING_CODE": { "type": "string" }
    }
  },
  "else": {
    "required": [
      "ANSWER_CODE",
      "ANSWER_CODE_SYSTEM_NAME",
      "QUESTION_CODE",
      "QUESTION_CODE_SYSTEM_NAME",
      "SCREENING_CODE_DESCRIPTION",
      "SCREENING_CODE_SYSTEM_NAME",
      "SCREENING_CODE"
    ]
  }
}

xrotwang commented 4 months ago

That may be an option. But the csvw package does not support any kind of transformation definition, so it would just be a way to inform your own processing software.
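
As a rough illustration of what "your own processing software" could look like, here is a standard-library Python sketch that reads a pipe-delimited CSV and applies a JSON-Schema-like rule set to each row. It is not a full JSON Schema validator: the hypothetical helper `schema_errors` only understands the keywords used in the schema posted above ("required", "properties" with "type": "string" or "enum", and "if"/"then"/"else"). The sample schema is a trimmed version of conditional_schema.json and the CSV content is made up. One assumption matters: empty CSV cells are dropped before validation, because an empty string would otherwise still satisfy "required".

```python
import csv
import io
import json


def schema_errors(schema, row, path="row"):
    """Check `row` against the few draft-07 keywords the posted schema uses."""
    errors = []
    for key in schema.get("required", []):
        if key not in row:
            errors.append(f"{path}: missing required field {key}")
    for key, sub in schema.get("properties", {}).items():
        if key not in row:
            continue  # `properties` only constrains fields that are present
        if sub.get("type") == "string" and not isinstance(row[key], str):
            errors.append(f"{path}: {key} is not a string")
        if "enum" in sub and row[key] not in sub["enum"]:
            errors.append(f"{path}: {key}={row[key]!r} not in enum")
    if "if" in schema:
        # `then` applies when the `if` subschema matches, otherwise `else`.
        branch = "then" if not schema_errors(schema["if"], row) else "else"
        if branch in schema:
            errors.extend(schema_errors(schema[branch], row, path))
    return errors


def validate_rows(csv_text, schema, delimiter="|"):
    """Yield (line_number, errors) for every row that fails the schema."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    for line_number, raw in enumerate(reader, start=2):
        # Empty cells become absent keys so that `required` can catch them.
        row = {k: v for k, v in raw.items() if v not in ("", None)}
        errors = schema_errors(schema, row, f"line {line_number}")
        if errors:
            yield line_number, errors


# Trimmed-down version of conditional_schema.json, for illustration only.
SCHEMA = json.loads("""
{
  "required": ["QUESTION_CODE_DESCRIPTION"],
  "if": {"properties": {"QUESTION_CODE_DESCRIPTION": {"enum": ["TOTAL SAFETY SCORE"]}}},
  "else": {"required": ["ANSWER_CODE"]}
}
""")

# Made-up sample data using the "|" delimiter from the dialect above.
CSV_TEXT = (
    "QUESTION_CODE_DESCRIPTION|ANSWER_CODE\n"
    "TOTAL SAFETY SCORE|\n"
    "Some other question|\n"
    "Some other question|X1\n"
)

failures = list(validate_rows(CSV_TEXT, SCHEMA))
for line_number, errors in failures:
    print(line_number, errors)
```

In this sample only the second data row (line 3 of the file) fails, because it has no ANSWER_CODE and its question is not one of the listed scores. For anything beyond this keyword subset, a complete validator such as the third-party jsonschema package would be the safer choice.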