chop-dbhi / data-models-service

Service for consuming files in the data models format.

Not nullable vs. required #19

Open bruth opened 8 years ago

bruth commented 8 years ago

See https://github.com/chop-dbhi/data-models-service/issues/18 for context.

The required field in the data model definition was derived from the ETL conventions document, which declares whether a field is required for submission by sites. This is independent of the not null constraint, since you could require a field that is nullable in the data model. If these two properties are now interchangeable, then we could alleviate the divergence problem by generating the not_nulls.csv file from the definition files rather than maintaining it by hand.
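As a rough illustration of what generating not_nulls.csv from the definitions could look like, here is a minimal sketch. It only applies if required and not null really are treated as the same property. The column names (`table`, `field`, `required`), the "Yes"/"No" encoding, and the output layout are assumptions for the example and may not match the actual definition files or the existing not_nulls.csv format.

```python
import csv
import io

# Hypothetical definition rows; the real definition files may use different
# column names and value encodings.
DEFINITIONS = [
    {"table": "person", "field": "person_id", "required": "Yes"},
    {"table": "person", "field": "gender_source_value", "required": "No"},
    {"table": "drug_exposure", "field": "drug_exposure_start_time", "required": "Yes"},
]


def generate_not_nulls(definitions):
    """Return CSV text listing every (table, field) pair marked as required."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["table", "field"])
    for row in definitions:
        # Treat a few common truthy spellings as "required" (an assumption).
        if row["required"].strip().lower() in ("yes", "true", "1"):
            writer.writerow([row["table"], row["field"]])
    return buf.getvalue()


NOT_NULLS_CSV = generate_not_nulls(DEFINITIONS)
```

With something like this, the constraint file becomes a build artifact of the definitions, so the two cannot drift apart.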

murphyke commented 8 years ago

Agreed. To me it seems potentially useful to keep the separate not null constraints. I was looking at the not null constraints and interpreted the required attribute to mean "this column must be provided and populated, although values may be NULL if allowed by the constraints (governed by the ETL conventions)". It seems that the data-models-validator is interpreting required=True to mean not null, which doesn't seem correct to me.

In the particular case at hand, the ETL conventions are not crystal clear, e.g. drug_exposure_start_time is 'required' but sites should "provide if available", and yet, "if there is no time associated with the date assert midnight for the start time", implying that it should not be NULL. To me this says that this field should also have a not null constraint.
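To make the distinction concrete, here is a minimal sketch of a check that keeps the two properties separate: required means the column must be provided, and not null means no value in it may be empty. The function `check_field` and its arguments are hypothetical and are not taken from data-models-validator.

```python
def check_field(header, rows, field, required=False, not_null=False):
    """Return a list of error strings for one field of a submitted table.

    header: list of column names present in the submitted file
    rows:   list of dicts mapping column name to value
    """
    errors = []
    if field not in header:
        # "required" is only about the column being provided at all.
        if required:
            errors.append(f"{field}: required column is missing")
        return errors
    if not_null:
        # "not null" is about the values; None and "" are treated as NULL here.
        for i, row in enumerate(rows, start=1):
            if row.get(field) in (None, ""):
                errors.append(f"{field}: NULL value in row {i}")
    return errors


header = ["drug_exposure_start_time"]
rows = [
    {"drug_exposure_start_time": "00:00:00"},
    {"drug_exposure_start_time": ""},
]

# Required but nullable: the empty value in row 2 is accepted.
required_only = check_field(header, rows, "drug_exposure_start_time", required=True)

# Required and not null: the empty value in row 2 is reported.
required_and_not_null = check_field(
    header, rows, "drug_exposure_start_time", required=True, not_null=True
)
```

Under this reading, adding a not null constraint to drug_exposure_start_time would be a separate, explicit decision rather than a side effect of required=True.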