microbiomedata / nmdc-schema

National Microbiome Data Collaborative (NMDC) unified data model
https://microbiomedata.github.io/nmdc-schema/
Creative Commons Zero v1.0 Universal

do linkml-lint as part of the build process #1778

Open turbomam opened 6 months ago

turbomam commented 6 months ago

How can we run linkml-lint in a mode that ignores the standard_naming rule for permissible values?

linkml-lint src/schema/nmdc.yaml  | grep -v 'Permissible value of Enum.*(standard_naming)'

I think we can ignore most of the canonical_prefixes, but we should review that

I think there are still some enums with lower snake case names, but this didn't catch any

@mslarae13 I think there's already an issue for many of these "does not have recommended slot 'description'" warnings

warning Class 'FailureCategorization' does not have recommended slot 'description' (recommended)
warning Class 'FunctionalAnnotationAggMember' does not have recommended slot 'description' (recommended)
warning Class 'Protocol' does not have recommended slot 'description' (recommended)
warning Class 'LibraryPreparation' does not have recommended slot 'description' (recommended)
warning Class 'CollectingBiosamplesFromSite' does not have recommended slot 'description' (recommended)
warning Slot 'has_failure_categorization' does not have recommended slot 'description' (recommended)
warning Slot 'model' does not have recommended slot 'description' (recommended)
warning Slot 'vendor' does not have recommended slot 'description' (recommended)
warning Slot 'count' does not have recommended slot 'description' (recommended)
warning Slot 'functional_annotation_agg' does not have recommended slot 'description' (recommended)
warning Slot 'habitat' does not have recommended slot 'description' (recommended)
warning Slot 'location' does not have recommended slot 'description' (recommended)
warning Slot 'community' does not have recommended slot 'description' (recommended)
warning Slot 'ncbi_taxonomy_name' does not have recommended slot 'description' (recommended)
warning Slot 'ncbi_project_name' does not have recommended slot 'description' (recommended)
warning Slot 'sample_collection_site' does not have recommended slot 'description' (recommended)
warning Slot 'sample_collection_year' does not have recommended slot 'description' (recommended)
warning Slot 'sample_collection_month' does not have recommended slot 'description' (recommended)
warning Slot 'library_preparation_kit' does not have recommended slot 'description' (recommended)
warning Slot 'extraction_method' does not have recommended slot 'description' (recommended)
warning Slot 'pcr_cycles' does not have recommended slot 'description' (recommended)
warning Slot 'library_type' does not have recommended slot 'description' (recommended)
warning Slot 'biosample_categories' does not have recommended slot 'description' (recommended)
warning Slot 'relevant_protocols' does not have recommended slot 'description' (recommended)
warning Slot 'applied_roles' does not have recommended slot 'description' (recommended)
warning Slot 'applies_to_person' does not have recommended slot 'description' (recommended)
warning Slot 'field_research_site_set' does not have recommended slot 'description' (recommended)
warning Slot 'collecting_biosamples_from_site_set' does not have recommended slot 'description' (recommended)
warning Slot 'pooling_set' does not have recommended slot 'description' (recommended)
warning Slot 'sample_collection_day' does not have recommended slot 'description' (recommended)
warning Slot 'sample_collection_hour' does not have recommended slot 'description' (recommended)
warning Slot 'sample_collection_minute' does not have recommended slot 'description' (recommended)
warning Slot 'soluble_iron_micromol' does not have recommended slot 'description' (recommended)
warning Slot 'host_name' does not have recommended slot 'description' (recommended)
warning Slot 'subsurface_depth' does not have recommended slot 'description' (recommended)
warning Slot 'proport_woa_temperature' does not have recommended slot 'description' (recommended)
warning Slot 'biogas_temperature' does not have recommended slot 'description' (recommended)
warning Slot 'soil_annual_season_temp' does not have recommended slot 'description' (recommended)
warning Slot 'biogas_retention_time' does not have recommended slot 'description' (recommended)
warning Slot 'completion_date' does not have recommended slot 'description' (recommended)
warning Slot 'value' does not have recommended slot 'description' (recommended)
warning Enum 'InstrumentModelEnum' does not have recommended slot 'description' (recommended)
warning Enum 'InstrumentVendorEnum' does not have recommended slot 'description' (recommended)
warning Enum 'StatusEnum' does not have recommended slot 'description' (recommended)
warning Enum 'ExtractionTargetEnum' does not have recommended slot 'description' (recommended)
warning Enum 'LibraryTypeEnum' does not have recommended slot 'description' (recommended)
warning Enum 'JgiContTypeEnum' does not have recommended slot 'description' (recommended)
warning Enum 'FileTypeEnum' does not have recommended slot 'description' (recommended)
warning Enum 'CreditEnum' does not have recommended slot 'description' (recommended)
warning Enum 'StudyCategoryEnum' does not have recommended slot 'description' (recommended)
warning Enum 'DoiProviderEnum' does not have recommended slot 'description' (recommended)
warning Enum 'DoiCategoryEnum' does not have recommended slot 'description' (recommended)
warning Enum 'CompoundEnum' does not have recommended slot 'description' (recommended)
warning Schema maps prefix 'CATH' to namespace 'https://bioregistry.io/cath:' instead of namespace 'http://identifiers.org/cath/' (canonical_prefixes)
warning Schema maps prefix 'CHEMBL.COMPOUND' to namespace 'https://bioregistry.io/chembl.compound:' instead of namespace 'http://identifiers.org/chembl.compound/' (canonical_prefixes)
warning Schema maps prefix 'DRUGBANK' to namespace 'https://bioregistry.io/drugbank:' instead of namespace 'http://identifiers.org/drugbank/' (canonical_prefixes)
warning Schema maps prefix 'EFO' to namespace 'http://www.ebi.ac.uk/efo/' instead of namespace 'http://identifiers.org/efo/' (canonical_prefixes)
warning Schema maps prefix 'EGGNOG' to namespace 'https://bioregistry.io/eggnog:' instead of namespace 'http://identifiers.org/eggnog/' (canonical_prefixes)
warning Schema maps prefix 'HMDB' to namespace 'https://bioregistry.io/hmdb:' instead of namespace 'http://identifiers.org/hmdb/' (canonical_prefixes)
warning Schema maps prefix 'MASSIVE' to namespace 'https://bioregistry.io/reference/massive:' instead of namespace 'http://identifiers.org/massive/' (canonical_prefixes)
warning Schema maps prefix 'MESH' to namespace 'https://bioregistry.io/mesh:' instead of namespace 'http://identifiers.org/mesh/' (canonical_prefixes)
warning Schema maps prefix 'ORCID' to namespace 'https://orcid.org/' instead of namespace 'http://identifiers.org/orcid/' (canonical_prefixes)
warning Schema maps prefix 'PANTHER.FAMILY' to namespace 'https://bioregistry.io/panther.family:' instead of namespace 'http://identifiers.org/panther.family/' (canonical_prefixes)
warning Schema maps prefix 'PFAM' to namespace 'https://bioregistry.io/pfam:' instead of namespace 'http://identifiers.org/pfam/' (canonical_prefixes)
warning Schema maps prefix 'PUBCHEM.COMPOUND' to namespace 'https://bioregistry.io/pubchem.compound:' instead of namespace 'http://identifiers.org/pubchem.compound/' (canonical_prefixes)
warning Schema maps prefix 'SUPFAM' to namespace 'https://bioregistry.io/supfam:' instead of namespace 'http://identifiers.org/supfam/' (canonical_prefixes)
warning Schema maps prefix 'TIGRFAM' to namespace 'https://bioregistry.io/tigrfam:' instead of namespace 'http://identifiers.org/tigrfam/' (canonical_prefixes)
warning Schema maps prefix 'edam.data' to namespace 'http://edamontology.org/data' instead of using prefix 'EDAM.DATA' (canonical_prefixes)
warning Schema maps prefix 'rdf' to namespace 'http://www.w3.org/1999/02/22-rdf-syntax-ns#' instead of using prefix 'RDF' (canonical_prefixes)
warning Schema maps prefix 'rdfs' to namespace 'http://www.w3.org/2000/01/rdf-schema#' instead of using prefix 'RDFS' (canonical_prefixes)
warning Schema maps prefix 'wikidata' to namespace 'http://www.wikidata.org/entity/' instead of using prefix 'wd' (canonical_prefixes)

pkalita-lbl commented 6 months ago

Step away from the grep command.

Define your own configuration that leaves the standard_naming rule off if you don't want it.
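
For illustration, a configuration along those lines might look like the sketch below. This is a sketch under assumptions, not a tested config for this repo: the file name `.linkmllint.yaml`, the `extends: recommended` base, and the `level: disabled` setting reflect my understanding of the LinkML linter's config format and should be checked against the linter documentation for the installed version.

```yaml
# .linkmllint.yaml (assumed file name; verify against the LinkML linter docs)
extends: recommended      # start from the linter's recommended rule set
rules:
  standard_naming:
    level: disabled       # switch the whole standard_naming rule off
```

It would then be passed explicitly, e.g. `linkml-lint --config .linkmllint.yaml src/schema/nmdc.yaml`. Note that this switches standard_naming off for every element type, not only for permissible values.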

turbomam commented 6 months ago

Thanks. I want to check standard naming for everything except permissible value (PV) names. And I think I want even stricter checking of enum names. I will experiment some more and update today or Monday.