microbiomedata / mixs-6-2-release-candidate

Proposed, Harmonized MIxS 6.2
https://github.com/GenomicsStandardsConsortium/mixs6.2_release_candidate
MIT License

Required terms must have examples and preferably validation constraints #45

Closed: turbomam closed this issue 1 year ago

turbomam commented 1 year ago

generated/mixs_v6.xlsx.examples.yaml::$.exhaustive_test_set[0]: 'api' is a required property
generated/mixs_v6.xlsx.examples.yaml::$.exhaustive_test_set[0]: 'basin' is a required property
generated/mixs_v6.xlsx.examples.yaml::$.exhaustive_test_set[0]: 'coll_site_geo_feat' is a required property
generated/mixs_v6.xlsx.examples.yaml::$.exhaustive_test_set[0]: 'collection_date' is a required property
generated/mixs_v6.xlsx.examples.yaml::$.exhaustive_test_set[0]: 'env_broad_scale' is a required property
generated/mixs_v6.xlsx.examples.yaml::$.exhaustive_test_set[0]: 'env_local_scale' is a required property
generated/mixs_v6.xlsx.examples.yaml::$.exhaustive_test_set[0]: 'env_medium' is a required property
generated/mixs_v6.xlsx.examples.yaml::$.exhaustive_test_set[0]: 'iwf' is a required property
generated/mixs_v6.xlsx.examples.yaml::$.exhaustive_test_set[0]: 'microbial_biomass_meth' is a required property
generated/mixs_v6.xlsx.examples.yaml::$.exhaustive_test_set[0]: 'occup_density_samp' is a required property
generated/mixs_v6.xlsx.examples.yaml::$.exhaustive_test_set[0]: 'water_cut' is a required property
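
The messages above all share one shape, so the list of missing required slots can be pulled out mechanically. The following is a minimal sketch, assuming the validator output is available as plain text containing messages of the form `'<slot>' is a required property`. The function name `missing_required_slots` is hypothetical and not part of the repository.

```python
import re

# Matches the quoted slot name in messages like "'api' is a required property".
REQUIRED_MSG = re.compile(r"'([^']+)' is a required property")


def missing_required_slots(validator_output: str) -> list[str]:
    """Return the distinct slot names reported as missing, in first-seen order."""
    seen: dict[str, None] = {}
    for name in REQUIRED_MSG.findall(validator_output):
        seen.setdefault(name, None)
    return list(seen)


# Two messages copied from the output above, joined by a space.
sample = (
    "generated/mixs_v6.xlsx.examples.yaml::$.exhaustive_test_set[0]: "
    "'api' is a required property "
    "generated/mixs_v6.xlsx.examples.yaml::$.exhaustive_test_set[0]: "
    "'basin' is a required property"
)
print(missing_required_slots(sample))
```

The sketch works whether the messages arrive on one line or one per line, because it only looks for the quoted name in front of the fixed phrase.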

turbomam commented 1 year ago

Optionally, import the examples from data/ExhaustiveTestClassCollection-example-data.yaml, or from a more expert-curated source.

Be on the lookout for "example" and "e.g." in descriptions, since those may already contain usable example values.
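
One way to look for embedded examples is a regular expression over the description text. This is only a sketch, assuming examples appear as a parenthetical starting with "e.g."; the helper name `examples_from_description` is hypothetical, and the description strings are copied from slot definitions quoted later in this issue.

```python
import re

# Captures the text after "e.g." inside a parenthetical, up to the closing ")".
EG_PATTERN = re.compile(r"\(e\.g\.\s*([^)]*?)\s*\)")


def examples_from_description(description: str) -> list[str]:
    """Return candidate example strings embedded in a slot description."""
    return [m.strip() for m in EG_PATTERN.findall(description) if m.strip()]


descriptions = {
    "basin": "Name of the basin (e.g. Campos)",
    "iwf": "Proportion of the produced fluids derived from injected water at "
           "the time of sampling. (e.g. 87%)",
    "occup_density_samp": "Average number of occupants at time of sampling "
                          "per square footage",
}

for slot, text in descriptions.items():
    print(slot, examples_from_description(text))
```

Slots whose description yields an empty list, such as `occup_density_samp` here, would still need an example from another source.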

turbomam commented 1 year ago
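-- Count how often each value occurs for one harmonized attribute,
-- keeping only values that appear more than once.
select
    value,
    count(1)
from
    all_attribs aa
where
    harmonized_name = 'coll_site_geo_feat'
group by
    value
having
    count(1) > 1
order by
    count(1) desc;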

no values in the relational version of NCBI's biosample_set:

microbial_biomass_meth

| value | count |
| --- | --- |
| chloroform fumigation extraction | 1368 |
| SIR | 1146 |
| not applicable | 545 |
| not collected | 436 |
| PLFA | 366 |
| NA | 346 |
| gDNA extraction yield (picogreen) per g dry soil | 313 |
| Chloroform fumigation-extraction (C in extract/0.45 = MBC) | 264 |
| diffusion absorption method | 174 |
| - | 158 |
| NLFA 16:1ω5c | 126 |
| chloroform_fumigation_extraction | 120 |
| missing | 103 |
| microbial biomass carbon (ug/g soil) | 66 |
| microbial_biomass_meth not involve | 40 |
| chloroform fumigation-extraction | 20 |
| na | 19 |
| Chloroform fumigation-incubation | 18 |
| 0.5M K2SO4 extractions | 16 |
| Chloroform fumigation-extraction | 15 |
| microbial biomass carbon (g/g soil) | 14 |
| mg ITS gene per g frsh soil | 12 |
| mg 16S rRNA gene per g frsh soil | 12 |
| not collection | 6 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.021 | 3 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.016 | 3 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.024 | 3 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.017 | 3 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.045 | 2 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.128 | 2 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.053 | 2 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.048 | 2 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.099 | 2 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.120 | 2 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.028 | 2 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.103 | 2 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.040 | 2 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.124 | 2 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.106 | 2 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.032 | 2 |
| http://dx.doi.org/10.1016/j.soilbio.2005.01.101 | 2 |
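
Several of the values above differ only in case or in the use of "_" and "-" as separators. The sketch below shows how much such spelling variants account for once collapsed. The `normalize` rule is a hypothetical illustration, not MIxS policy, and only a handful of rows from the table are included.

```python
from collections import Counter

# A few (value, count) rows copied from the table above.
rows = [
    ("chloroform fumigation extraction", 1368),
    ("chloroform_fumigation_extraction", 120),
    ("chloroform fumigation-extraction", 20),
    ("Chloroform fumigation-extraction", 15),
    ("not applicable", 545),
    ("NA", 346),
    ("na", 19),
]


def normalize(value: str) -> str:
    """Lowercase and treat '_' and '-' as spaces (hypothetical rule)."""
    return " ".join(value.lower().replace("_", " ").replace("-", " ").split())


# Sum the counts of all spellings that normalize to the same string.
totals = Counter()
for value, count in rows:
    totals[normalize(value)] += count

for value, count in totals.most_common():
    print(value, count)
```

A rule this simple does not merge "NA" with "not applicable" or handle the DOI-only values; those would need an explicit mapping or an enumeration on the slot.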
turbomam commented 1 year ago
api:
  name: api
  annotations:
    Preferred_unit:
      tag: Preferred_unit
      value: degrees API
  description: 'API gravity is a measure of how heavy or light a petroleum liquid
    is compared to water (source: https://en.wikipedia.org/wiki/API_gravity) (e.g.
    31.1° API)'
  title: API gravity
  slot_uri: MIXS:0000157
  multivalued: false
  range: string
  required: true
  pattern: ^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)? \S.*\S$
turbomam commented 1 year ago
basin:
  name: basin
  description: Name of the basin (e.g. Campos)
  title: basin name
  slot_uri: MIXS:0000290
  multivalued: false
  range: string
  required: true
turbomam commented 1 year ago
iwf:
  name: iwf
  annotations:
    Preferred_unit:
      tag: Preferred_unit
      value: percent
  description: Proportion of the produced fluids derived from injected water at
    the time of sampling. (e.g. 87%)
  title: injection water fraction
  slot_uri: MIXS:0000455
  multivalued: false
  range: string
  required: true
  pattern: ^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)? \S.*\S$
turbomam commented 1 year ago
occup_density_samp:
  name: occup_density_samp
  description: Average number of occupants at time of sampling per square footage
  title: occupant density at sampling
  slot_uri: MIXS:0000217
  multivalued: false
  range: float
  required: true
turbomam commented 1 year ago
water_cut:
  name: water_cut
  annotations:
    Preferred_unit:
      tag: Preferred_unit
      value: percent
  description: Current amount of water (%) in a produced fluid stream; or the average
    of the combined streams
  title: water cut
  slot_uri: MIXS:0000454
  multivalued: false
  range: string
  required: true
  pattern: ^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)? \S.*\S$
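
Since `api`, `iwf` and `water_cut` share the same `pattern`, it is worth checking which candidate example values would actually pass it. This is a minimal sketch using Python's `re` module with the pattern copied verbatim from the slots above. Of the candidates, "87%" comes from the `iwf` description and the others are hypothetical variants chosen for comparison.

```python
import re

# Pattern copied verbatim from the api, iwf and water_cut slots.
QUANTITY_PATTERN = re.compile(r"^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)? \S.*\S$")

# "87%" is taken from the iwf description; the rest are hypothetical variants.
candidates = ["31.1 API", "87%", "87 %", "87 percent", "31.1 degrees API"]

for value in candidates:
    verdict = "matches" if QUANTITY_PATTERN.fullmatch(value) else "does not match"
    print(f"{value!r}: {verdict}")
```

The pattern requires a number, exactly one space, and then a unit of at least two non-space-delimited characters (`\S.*\S`). As a result, "87%" and "87 %" are rejected while "87 percent" is accepted, so an example lifted directly from a description may not satisfy the slot's own constraint.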