linkml / prefixmaps

Semantic prefix map registry
https://linkml.io/prefixmaps/
Apache License 2.0

Add data integrity tests #15

Closed cthoyt closed 1 year ago

cthoyt commented 1 year ago

As a step towards addressing #11, this PR adds four data integrity tests:

  1. Test that prefixes are unique among canonical records in each context
  2. Test that namespaces are unique among canonical records in each context
  3. Test that the prefix appearing in each namespace_alias record has a corresponding prefix in a canonical record
  4. Test that the namespace appearing in each prefix_alias record has a corresponding namespace in a canonical record
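
For readers who want to see the shape of these checks, here is a minimal sketch of the four tests over a single context. It is not the code in this PR: the `Record` class, the `check_context` helper, and the example URIs are hypothetical. It assumes each record carries a prefix, a namespace, and a status of `canonical`, `prefix_alias`, or `namespace_alias`, and it reads the fourth test as applying to `prefix_alias` records.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for one row of a prefix map context."""

    prefix: str
    namespace: str
    status: str  # assumed values: "canonical", "prefix_alias", "namespace_alias"


def duplicates(values):
    """Return the sorted values that occur more than once."""
    return sorted(v for v, n in Counter(values).items() if n > 1)


def check_context(records):
    """Run the four checks on one context and return the offending values."""
    canonical = [r for r in records if r.status == "canonical"]
    canonical_prefixes = {r.prefix for r in canonical}
    canonical_namespaces = {r.namespace for r in canonical}
    return {
        # 1. prefixes must be unique among canonical records
        "duplicate_prefixes": duplicates(r.prefix for r in canonical),
        # 2. namespaces must be unique among canonical records
        "duplicate_namespaces": duplicates(r.namespace for r in canonical),
        # 3. a namespace_alias must reuse the prefix of some canonical record
        "orphan_namespace_aliases": sorted(
            r.prefix
            for r in records
            if r.status == "namespace_alias" and r.prefix not in canonical_prefixes
        ),
        # 4. a prefix_alias must reuse the namespace of some canonical record
        "orphan_prefix_aliases": sorted(
            r.namespace
            for r in records
            if r.status == "prefix_alias" and r.namespace not in canonical_namespaces
        ),
    }


# Illustrative data only; the last record has no canonical counterpart.
example = [
    Record("GO", "http://purl.obolibrary.org/obo/GO_", "canonical"),
    Record("go", "http://purl.obolibrary.org/obo/GO_", "prefix_alias"),
    Record("GO", "http://identifiers.org/go/", "namespace_alias"),
    Record("CHEBI", "https://example.org/chebi/", "namespace_alias"),
]
report = check_context(example)
```

An empty list under every key means the context passes. Returning the offending values instead of asserting immediately makes it easier to see which records need curation.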

Most of these tests are failing, so having a second set of eyes on them (@caufieldjh ;)) will be great. We can keep updating the content until these tests pass, either in this PR or in a different one. However, the failures might also point to more systematic issues in the ETL pipelines, so I would also suggest that @hrshdhgd take a careful look too.

Blockers

cthoyt commented 1 year ago

I curated away all of the remaining BioPortal issues in e29c159

caufieldjh commented 1 year ago

Great tests! I want to see if I can get test_namespace_aliases working. Some of those prefixes are already covered by other maps in this repo, and the remainder lack canonical prefixes only because they aren't in BioPortal themselves (they are imports).

caufieldjh commented 1 year ago

I think this is probably sufficient for now, though I'll likely have further updates to the BP maps quite soon.

cthoyt commented 1 year ago

@caufieldjh thanks for making the BioPortal updates and getting this up to date! It's now merged with master and done from my side.