EnvironmentOntology / envo

A community-driven ontology for the representation of environments
http://www.environmentontology.org
Creative Commons Zero v1.0 Universal

schema YAML files with `slot_usages` #1469

Closed turbomam closed 7 months ago

turbomam commented 10 months ago
grep -r -c slot_usage src/schema | grep -v ':0'
src/schema/prov.yaml:1
src/schema/annotation.yaml:3
src/schema/core.yaml:11
src/schema/nmdc.yaml:13
src/schema/workflow_execution_activity.yaml:13
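
For anyone without grep at hand, the same count can be sketched in Python. This is a rough equivalent under stated assumptions, not part of the repo: the function name `count_matching_lines` is made up for illustration, and it counts lines containing the substring, which is what `grep -c` reports.

```python
from pathlib import Path


def count_matching_lines(root, needle="slot_usage"):
    """Return {file path: number of lines containing needle}, skipping zero counts."""
    counts = {}
    root_path = Path(root)
    if not root_path.is_dir():
        return counts
    for path in sorted(root_path.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            # Skip binary files instead of failing on them.
            continue
        n = sum(1 for line in text.splitlines() if needle in line)
        if n:
            counts[str(path)] = n
    return counts


if __name__ == "__main__":
    # Prints path:count pairs, similar to the grep output above.
    for name, n in count_matching_lines("src/schema").items():
        print(f"{name}:{n}")
```

Because it matches the substring `slot_usage`, a commented-out `# slot_usage:` line is counted too, just as with the grep command.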

slot attributes modified:

turbomam commented 10 months ago

src/schema/prov.yaml:1

Activity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:act-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
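
To make the `structured_pattern` above easier to read, here is a minimal sketch of how an interpolated syntax string could expand into a regular expression. It assumes each `{name}` placeholder is replaced by a regex fragment from a settings table. The fragments in `SETTINGS` and the helper name `interpolate` are hypothetical placeholders for illustration, not the values the schema actually defines.

```python
import re

# Hypothetical fragments, for illustration only. The real values belong to
# the schema and may differ.
SETTINGS = {
    "id_nmdc_prefix": r"^(nmdc)",
    "id_shoulder": r"([0-9][a-z]{0,6}[0-9])",
    "id_blade": r"([A-Za-z0-9]{1,})",
    "id_version": r"(\.[0-9]{1,})?",
    "id_locus": r"(_[A-Za-z0-9_\.-]+)?$",
}


def interpolate(syntax, settings):
    """Replace every {name} placeholder in syntax with settings[name]."""

    def replace(match):
        # Raises KeyError if the syntax uses a placeholder with no setting.
        return settings[match.group(1)]

    # A function replacement keeps backslashes in the fragments untouched.
    return re.sub(r"\{(\w+)\}", replace, syntax)


ACTIVITY_SYNTAX = "{id_nmdc_prefix}:act-{id_shoulder}-{id_blade}{id_version}{id_locus}"
ACTIVITY_ID = re.compile(interpolate(ACTIVITY_SYNTAX, SETTINGS))

if __name__ == "__main__":
    for candidate in ["nmdc:act-11-abc123", "nmdc:procsm-11-abc123"]:
        print(candidate, bool(ACTIVITY_ID.match(candidate)))
```

With these placeholder fragments, an id with the `act` code matches and one with a different code does not, which is the point of giving each class its own syntax string.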
turbomam commented 10 months ago

src/schema/annotation.yaml

GenomeFeature:
  slot_usage:
    seqid:
      required: true
    type:
      range: OntologyClass
      description: A type from the sequence ontology
    start:
      required: true
    end:
      required: true
Pathway:
  slot_usage:
    has_part:
      range: Reaction
      required: true
      description: >-
        A pathway can be broken down to a series of reaction steps
FunctionalAnnotation:
  slot_usage:
    has_function:
      notes:
        - this slot had been called id
        - "Still missing patterns for COG and RetroRules."
        - "These patterns aren't tied to the listed prefixes. A discussion about that possibility had been started, including the question of whether these lists are intended to be open examples or closed"
    type:
      range: OntologyClass
      description: TODO
    was_generated_by:
      description: provenance for the annotation.
      notes: To be consistent with the rest of the NMDC schema we use the PROV annotation model, rather than GPAD
      range: MetagenomeAnnotationActivity
turbomam commented 10 months ago

src/schema/core.yaml

ProcessedSample:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:procsm-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
AnalyticalSample:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:ansm-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
Site:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:site-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
PlannedProcess:
  slot_usage:
    designated_class:
      comments:
        - required on all instances in a polymorphic Database slot like planned_process_set
OntologyClass:
  slot_usage:
    id:
      pattern: '^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$'
AttributeValue:
  slot_usage:
    type:
      description: An optional string that specifies the type of object.
QuantityValue:
  slot_usage:
    has_raw_value:
      description: Unnormalized atomic string representation, should be in the syntax {number} {unit}
    has_unit:
      description: The unit of the quantity
    has_numeric_value:
      description: The number part of the quantity
      range: double
PersonValue:
  slot_usage:
    orcid:
      annotations:
        display_hint: Open Researcher and Contributor ID for this person. See https://orcid.org
    email:
      annotations:
        display_hint: Email address for this person.
    has_raw_value:
      description: The full name of the Investigator in format FIRST LAST.
      notes:
        - May eventually be deprecated in favor of "name".
    name:
      description: >-
        The full name of the Investigator.
        It should follow the format FIRST [MIDDLE NAME| MIDDLE INITIAL] LAST, where MIDDLE NAME| MIDDLE INITIAL is optional.
      annotations:
        display_hint: First name, middle initial, and last name of this person.
ProteinQuantification:
  slot_usage:
    best_protein:
      description: the specific protein identifier most correctly grouped to its associated peptide sequences
    all_proteins:
      description: the grouped list of protein identifiers associated with the peptide sequences that were grouped to a best protein
ControlledIdentifiedTermValue:
  slot_usage:
    term:
      required: true
GeolocationValue:
  slot_usage:
    has_raw_value:
      description: The raw value for a geolocation should follow {latitude} {longitude}
    latitude:
      required: true
    longitude:
      required: true
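
The `pattern` on the OntologyClass `id` slot above can be tried out directly. This is a small sketch that copies the regex as listed and checks a few sample strings against it. The helper name `is_valid_ontology_class_id` and the sample values are made up for illustration.

```python
import re

# The pattern from the OntologyClass slot_usage above, copied verbatim.
ONTOLOGY_CLASS_ID = re.compile(
    r"^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$"
)


def is_valid_ontology_class_id(value):
    """Return True if value has the prefix:local shape the pattern allows."""
    return ONTOLOGY_CLASS_ID.match(value) is not None


if __name__ == "__main__":
    # Illustrative sample strings, not taken from the schema.
    for sample in ["ENVO:00002003", "GO:0008150", "X:1", "ENVO:", "ENVO 00002003"]:
        print(repr(sample), is_valid_ontology_class_id(sample))
```

Note that the prefix part needs at least two characters and the local part at least one, so `X:1` and `ENVO:` would both be rejected.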
turbomam commented 10 months ago

src/schema/nmdc.yaml

The following slot_usages are currently commented out. Everything else in this issue is active.

Pooling:
  slot_usage:
    has_input:
      minimum_cardinality: 2
    has_output:
      minimum_cardinality: 1
      maximum_cardinality: 1
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:poolp-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
Extraction:
  slot_usage:
    has_input:
      required: true
    has_output:
      required: true
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:extrp-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
LibraryPreparation:
  slot_usage:
    has_input:
      required: true
    has_output:
      required: true
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:libprp-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
FieldResearchSite:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:frsite-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
CollectingBiosamplesFromSite:
  slot_usage:
    has_input:
      range: Site
      required: true
    has_output:
      range: Biosample
      required: true
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:clsite-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
DataObject:
  slot_usage:
    name:
      required: true
    description:
      required: true
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:dobj-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
BiosampleProcessing:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:bsmprc-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
    has_input:
      range: Biosample
SubSamplingProcess:
  slot_usage:
    volume:
      description: The output volume of the SubSampling Process.
    mass:
      description: The output mass of the SubSampling Process.
    has_input:
      any_of:
        - range: Biosample
        - range: ProcessedSample
    has_output:
      range: ProcessedSample
      description: The subsample.
MixingProcess:
  slot_usage:
    volume:
      description: The volume of sample filtered.
turbomam commented 10 months ago

src/schema/workflow_execution_activity.yaml

WorkflowExecutionActivity:
  slot_usage:
    started_at_time:
      required: true
    ended_at_time:
      required: true
    git_url:
      required: true
    has_input:
      required: true
    has_output:
      required: true
    execution_resource:
      required: true
    type:
      required: true
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wf-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetagenomeAssembly:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmgas-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetatranscriptomeAssembly:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmtas-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetagenomeAnnotationActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmgan-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetatranscriptomeAnnotationActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmtan-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetatranscriptomeActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmt-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MagsAnalysisActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmag-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetagenomeSequencingActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmsa-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
ReadQcAnalysisActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfrqc-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
ReadBasedTaxonomyAnalysisActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfrbt-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetabolomicsAnalysisActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmb-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetaproteomicsAnalysisActivity:
  slot_usage:
    used:
      description: The instrument used to collect the data used in the analysis
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmp-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
NomAnalysisActivity:
  slot_usage:
    used:
      range: string
      description: The instrument used to collect the data used in the analysis
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfnom-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
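
Since each class above differs only in the short code between the prefix and the shoulder (called a typecode here for convenience), a quick check for accidental reuse might look like the sketch below. The syntax strings are copied from the listing above, while the helper names `typecode` and `shared_typecodes` are hypothetical.

```python
import re
from collections import defaultdict

# A few class -> syntax pairs copied from the listing above.
SYNTAX_BY_CLASS = {
    "WorkflowExecutionActivity": "{id_nmdc_prefix}:wf-{id_shoulder}-{id_blade}{id_version}{id_locus}",
    "MetagenomeAssembly": "{id_nmdc_prefix}:wfmgas-{id_shoulder}-{id_blade}{id_version}{id_locus}",
    "MetatranscriptomeAssembly": "{id_nmdc_prefix}:wfmtas-{id_shoulder}-{id_blade}{id_version}{id_locus}",
    "MetatranscriptomeActivity": "{id_nmdc_prefix}:wfmt-{id_shoulder}-{id_blade}{id_version}{id_locus}",
    "NomAnalysisActivity": "{id_nmdc_prefix}:wfnom-{id_shoulder}-{id_blade}{id_version}{id_locus}",
}

# The code sits between "}:" and "-{" in every syntax string shown above.
TYPECODE = re.compile(r"\}:([a-z]+)-\{")


def typecode(syntax):
    """Return the code between the prefix and the shoulder, or None."""
    match = TYPECODE.search(syntax)
    return match.group(1) if match else None


def shared_typecodes(syntax_by_class):
    """Return {typecode: [class names]} for codes used by more than one class."""
    classes = defaultdict(list)
    for cls, syntax in syntax_by_class.items():
        classes[typecode(syntax)].append(cls)
    return {code: names for code, names in classes.items() if len(names) > 1}


if __name__ == "__main__":
    print(shared_typecodes(SYNTAX_BY_CLASS))
```

An empty result means no two of the listed classes share a code. Extending `SYNTAX_BY_CLASS` to every class in the schema would be needed for a complete check.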
turbomam commented 9 months ago

oops, this is for some other repo that I work in. will move soon.

pbuttigieg commented 8 months ago

Thanks - was confused

turbomam commented 8 months ago

shoot, I don't think I can move this issue out of this org. I will just copy and paste and then delete here.