Closed megin1989 closed 4 months ago
Unfortunately, there is no unique
property defined for columns in CSVW. (Also no enum
or constraint
properties.)
It looks like you are trying to use DataPackage's TableSchema syntax here. But CSVW is a different specification. The csvw
package only deals with the latter (apart from some basic support for bootstrapping CSVW from DP).
Sorry, I get a note from my friend like this. CSVW (CSV on the Web) metadata provides a way to describe the structure and constraints of CSV files. It has some support for constraints such as unique, enum, and other validation mechanisms.
is this correct? Please explain. We need to apply our various business rules here.
Here's the spec: https://www.w3.org/TR/tabular-metadata/
and here's some description of our csvw
package's (lack of) support for CSVW features: https://github.com/cldf/csvw#known-limitations
I've added a JSON schema below and need to integrate two rules:
To achieve this, I've added "primaryKey": ["PAT_MRN_ID", "FACILITY_ID"] and set FACILITY_ADDRESS1 to be unique. Is there anything else we can do in the JSON schema? Please help.
{ "@context": "http://www.w3.org/ns/csvw", "tables": [ { "url": "data/QE_ADMIN_DATA_qcs-test-20240603-testcase4.csv", "tableSchema": { "columns": [ { "name": "PAT_MRN_ID", "titles": "PAT_MRN_ID", "datatype": "string", "required": true, "constraints": { "unique": true } }, { "name": "FACILITY_ID", "titles": "FACILITY_ID", "datatype": "string", "required": true, "constraints": { "unique": true } }, { "name": "FACILITY_LONG_NAME", "titles": "FACILITY_LONG_NAME", "datatype": "string", "required": true, "pattern": "^[a-zA-Z\s]+$" }, { "name": "ORGANIZATION_TYPE", "titles": "ORGANIZATION_TYPE", "datatype": "string", "required": true, "enum": [ "prov", "dept", "team", "govt", "ins", "pay", "edu", "reli", "crs", "cg", "bus", "other", "laboratory", "imaging", "pharmacy", "health-information-network", "health-data-aggregator", "Hospital", "DTC", "SNF", "SCN", "CBO", "OMH", "OASAS", "Practice", "Article 36", "Article 40", "MCO" ] }, { "name": "FACILITY_ADDRESS1", "titles": "FACILITY_ADDRESS1", "datatype": "string", "required": true, "constraints": { "pattern": "^(?!\sP\.?O\.?\sBox\s*\d+)[a-zA-Z0-9\s]+$", "unique": true } }, { "name": "FACILITY_ADDRESS2", "titles": "FACILITY_ADDRESS2", "datatype": "string" }, { "name": "FACILITY_CITY", "titles": "FACILITY_CITY", "datatype": "string", "required": false, "constraints": { "enum": ["New York City", "Albany", "Buffalo", "Rochester", "Yonkers"] // Update with actual cities in New York State } }, { "name": "FACILITY_STATE", "titles": "FACILITY_STATE", "datatype": "string", "enum": ["New York", "NY"] }, { "name": "FACILITY_ZIP", "titles": "FACILITY_ZIP", "datatype": "string", "required": true, "pattern": "^\d{5}(-\d{4})?$" }, { "name": "VISIT_PART_2_FLAG", "titles": "VISIT_PART_2_FLAG", "datatype": "string", "required": true, "enum": ["Yes", "No"] }, { "name": "VISIT_OMH_FLAG", "titles": "VISIT_OMH_FLAG", "datatype": "string", "required": true, "enum": ["Yes", "No"] }, { "name": "VISIT_OPWDD_FLAG", "titles": "VISIT_OPWDD_FLAG", "datatype": "string", "required": true, "enum": ["Yes", "No"] } ], "primaryKey": ["PAT_MRN_ID", "FACILITY_ID"] }, "dialect": { "delimiter": "|" } } ] }