microbiomedata / sheets_and_friends

Enhance a LinkML model with imported and optionally modified slots

add `pattern` property to generated NMDC linkml schema #115

Closed sujaypatil96 closed 2 years ago

sujaypatil96 commented 2 years ago

This PR seeks to address issue #104.

Approach: The new modifications_and_validation entry point consumes the validation_converter sheet and implements the validation-rule logic that populates the schema with the pattern attribute. It also keeps the logic that works on the modifications_long sheet.
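For readers unfamiliar with the idea, here is a minimal sketch of how a lookup from string_serialization to pattern could populate slots. It is not the code in this PR: the names SERIALIZATION_TO_PATTERN and add_patterns, the plain-dict slot layout, and the "{float} {unit}" key are all hypothetical, and the real columns of the validation_converter sheet are not shown in this thread.

```python
import re

# Reusable float fragment, taken from the patterns quoted later in this thread.
FLOAT = r"[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?"

# Hypothetical stand-in for the validation_converter sheet:
# string_serialization template -> regex pattern.
SERIALIZATION_TO_PATTERN = {
    "{float} {unit}": rf"^{FLOAT} \S+$",  # assumed template name
    "{float}|{float}-{float}": rf"^{FLOAT}$|^{FLOAT}-{FLOAT}$",
}


def add_patterns(slots):
    """Return copies of the slot dicts, adding `pattern` where a rule exists."""
    updated = {}
    for name, slot in slots.items():
        slot = dict(slot)  # do not mutate the caller's data
        rule = SERIALIZATION_TO_PATTERN.get(slot.get("string_serialization"))
        if rule is not None:
            re.compile(rule)  # fail early if a rule is not a valid regex
            slot["pattern"] = rule
        updated[name] = slot
    return updated


slots = {
    "depth": {"string_serialization": "{float}|{float}-{float}"},
    "ph": {},  # no string_serialization, so no pattern is added
}
updated = add_patterns(slots)
```

Keying the lookup on the exact string_serialization value means a slot only receives a pattern when a rule was written for that precise template, so unmatched templates are left without a pattern rather than getting a wrong one.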


To test:

make all

netlify[bot] commented 2 years ago

Deploy Preview for voluble-pika-79eed4 canceled.

Latest commit: b8ef2bdea989c8da6bc89a7da30ef498cbbf8984
Latest deploy log: https://app.netlify.com/sites/voluble-pika-79eed4/deploys/6258af3a517ab5000842f97b

sujaypatil96 commented 2 years ago

@turbomam: this PR is ready for review.

turbomam commented 2 years ago

Was this based on the main branch of sheets_and_friends? I can see why you would have done that, but my intention is to use this new contribution in issue-100-netlify-linkml-datastructure.

I have checked out spatil/add-regex-validators but can't run make docs/template/nmdc_dh/schema.js:

make: *** No rule to make target 'docs/template/nmdc_dh/schema.js'. Stop.

I locally merged issue-100-netlify-linkml-datastructure into spatil/add-regex-validators and ran make docs/template/nmdc_dh/schema.js.

That completed and built docs/linkml.html. I opened the nmdc_dh/soil_emsl_jgi_mg template, loaded the sample data file, and ran the validation.

sample_shipped

aka "sample shipped amount"

looks good: requires a float followed by some text (for the unit)

depth

aka "depth, meters" should take a float or a range of floats

"string_serialization": "{float}|{float}-{float}",

but it got the pattern for floats with units

"pattern": "^[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)? \\S+$",

it's supposed to get this instead:

^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$|^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?-[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$
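The difference between the two patterns can be checked quickly with Python's re module as a stand-in for the template's validator. This is only a sketch: the helper name matches and the sample values are made up for illustration, and the actual validator may apply patterns differently.

```python
import re

# Pattern that depth currently gets (float followed by a unit).
UNIT_PATTERN = r"^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)? \S+$"

# Pattern that depth is supposed to get (float, or float-float range).
RANGE_PATTERN = (
    r"^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$"
    r"|^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?-[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$"
)


def matches(pattern, value):
    """True if the pattern is found in the value."""
    return re.search(pattern, value) is not None


# Hypothetical sample values: a bare float, a range, and a float with a unit.
for value in ["1.5", "0.1-0.5", "1.5 m"]:
    print(value, matches(UNIT_PATTERN, value), matches(RANGE_PATTERN, value))
```

With the unit pattern, a plain depth such as 1.5 or a range such as 0.1-0.5 is rejected, while the range pattern accepts both and rejects 1.5 m.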

source_mat_id

aka "globally unique ID" should require a CURIE

"string_serialization": "{text}:{text}",

but it didn't get a pattern, so it's taking any unique values

it's supposed to get: [^\:\n\r]+\:[^\:\n\r]+

maybe that could be improved
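One possible improvement, sketched in Python: the pattern above has no ^ and $ anchors, so a validator that only searches for it inside the value would also accept strings with more than one colon. Whether the template's validator anchors patterns itself is not stated in this thread, so treat this as an assumption. The names and sample IDs below are hypothetical.

```python
import re

# The CURIE-like pattern quoted above, unanchored.
CURIE_PATTERN = r"[^\:\n\r]+\:[^\:\n\r]+"


def search_ok(value):
    """Substring-style check: the pattern may match anywhere in the value."""
    return re.search(CURIE_PATTERN, value) is not None


def full_ok(value):
    """Whole-value check, equivalent to wrapping the pattern in ^...$."""
    return re.fullmatch(CURIE_PATTERN, value) is not None


# Hypothetical sample IDs.
for value in ["UUID:1234", "a:b:c", "no-colon"]:
    print(value, search_ok(value), full_ok(value))
```

Under a substring search, a:b:c passes because a:b matches inside it. Requiring a match on the whole value rejects it while still accepting UUID:1234.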

turbomam commented 2 years ago

It's not complaining about pH values of 999 either. But that's not a regex validation.
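A numeric bound check is a separate kind of rule. LinkML slots can carry minimum_value and maximum_value, and a validator could apply them roughly as sketched below. The function in_range and the 0 to 14 bounds for pH are assumptions for illustration, not something taken from the NMDC schema.

```python
def in_range(value, minimum_value=None, maximum_value=None):
    """True if value parses as a number and lies within the given bounds."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return False  # not numeric at all
    if minimum_value is not None and number < minimum_value:
        return False
    if maximum_value is not None and number > maximum_value:
        return False
    return True


# Assumed bounds for pH, for illustration only.
PH_MIN, PH_MAX = 0, 14

print(in_range("7.2", PH_MIN, PH_MAX))
print(in_range("999", PH_MIN, PH_MAX))
```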