microbiomedata / nmdc-schema

National Microbiome Data Collaborative (NMDC) unified data model
https://microbiomedata.github.io/nmdc-schema/
Creative Commons Zero v1.0 Universal

Limit regex-anchoring carets (`^`) to LinkML `settings`, not `structured_pattern`s #2098

Open turbomam opened 1 week ago

turbomam commented 1 week ago

Regular expression patterns can be left-anchored with `^` and/or right-anchored with `$`. Once a pattern is anchored, any extra characters before the left anchor's match or after the right anchor's match cause an otherwise acceptable string to be rejected.
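To make the effect of anchoring concrete, here is a minimal Python sketch. It assumes search-style matching (`re.search`), and the identifier shape and sample strings are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical identifier shape, used only to illustrate anchoring.
UNANCHORED = r"nmdc:[a-z]+-[0-9]+"
LEFT_ANCHORED = r"^nmdc:[a-z]+-[0-9]+"
FULLY_ANCHORED = r"^nmdc:[a-z]+-[0-9]+$"


def accepts(pattern: str, value: str) -> bool:
    """Return True if the pattern matches somewhere in value (search semantics)."""
    return re.search(pattern, value) is not None


good = "nmdc:bsm-123"
junk_before = "xx nmdc:bsm-123"
junk_after = "nmdc:bsm-123 xx"

# Without anchors, junk on either side is tolerated.
print(accepts(UNANCHORED, junk_before), accepts(UNANCHORED, junk_after))
# A left anchor rejects leading junk but still tolerates trailing junk.
print(accepts(LEFT_ANCHORED, junk_before), accepts(LEFT_ANCHORED, junk_after))
# Both anchors accept only the clean string.
print(accepts(FULLY_ANCHORED, good), accepts(FULLY_ANCHORED, junk_after))
```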

LinkML validates strings with the `pattern` metaslot on SlotDefinitions. Because some patterns are a pain to write, LinkML also offers `structured_pattern`s, which can be composed from LinkML `settings`. Currently, `structured_pattern`s can be converted into `pattern`s with the `gen-linkml` command in `--materialize-patterns` mode.
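As a rough illustration of what materializing a structured pattern means, the sketch below substitutes `{name}` placeholders in a syntax string with values from a settings dictionary. This is not LinkML's implementation; the `materialize` function, the setting names, and their values are all hypothetical.

```python
import re


def materialize(syntax: str, settings: dict[str, str]) -> str:
    """Replace each {name} placeholder in syntax with settings[name].

    Placeholders must start with a letter or underscore, so regex
    quantifiers such as {3} or {1,} are left untouched.
    """
    def substitute(match: re.Match) -> str:
        # A function replacement is inserted literally, so backslashes
        # inside the setting's value are not reinterpreted by re.sub.
        return settings[match.group(1)]

    return re.sub(r"\{([A-Za-z_]\w*)\}", substitute, syntax)


# Hypothetical settings and syntax, for illustration only.
settings = {
    "id_prefix": "(nmdc)",
    "id_local": "([a-z]+-[0-9]{3})",
}
syntax = "{id_prefix}:{id_local}"

pattern = materialize(syntax, settings)
print(pattern)
```

The composed string can then be used like any hand-written `pattern`.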

The LinkML language doesn't take any stance on whether anchors should go inside `settings` or in the `structured_pattern.syntax`.

This PR does take a stand: anchors should go in `settings`, not in `structured_pattern.syntax`, and it fixes the existing `structured_pattern.syntax` values to follow that rule.
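A rule like this could be checked mechanically. The following is a minimal sketch, assuming anchors only ever appear at the very start or end of a syntax string; the function name and the example syntax strings are hypothetical and are not taken from nmdc-schema or LinkML.

```python
def anchors_in_syntax(syntax: str) -> list[str]:
    """Return the anchors found at the edges of a structured_pattern syntax string."""
    found = []
    if syntax.startswith("^"):
        found.append("^")
    if syntax.endswith("$"):
        # A trailing "$" preceded by an odd number of backslashes is an
        # escaped literal dollar sign, not an anchor.
        body = syntax[:-1]
        trailing_backslashes = len(body) - len(body.rstrip("\\"))
        if trailing_backslashes % 2 == 0:
            found.append("$")
    return found


# Hypothetical syntax strings: one that breaks the rule, two that follow it.
violating = "^{id_prefix}:{id_local}$"
compliant = "{id_prefix}:{id_local}"
literal_dollar = "{amount}\\$"

for candidate in (violating, compliant, literal_dollar):
    print(candidate, anchors_in_syntax(candidate))
```

Under the rule, the `^` and `$` would instead live in the first and last settings that the syntax string interpolates.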

github-actions[bot] commented 1 week ago

PR Preview Action v1.4.7
:rocket: Deployed preview to https://microbiomedata.github.io/nmdc-schema/pr-preview/pr-2098/ on branch gh-pages at 2024-07-01 21:53 UTC

eecavanna commented 1 week ago

Hi @turbomam, will you add a description to this PR?

Here's a skeleton:

I think that'll make it easier for all the reviewers to review the changes.

eecavanna commented 1 week ago

I got some additional context by reading the attached GitHub Issue: https://github.com/microbiomedata/nmdc-schema/issues/2092

I'm approving this PR based on the assumption that the changes in this PR fix that issue.

turbomam commented 1 week ago

@eecavanna

> will you add a description to this PR

Done.

ssarrafan commented 18 hours ago

@turbomam can this be closed by tomorrow?