GenomicsStandardsConsortium / mixs

“Minimum Information about any (X) Sequence” (MIxS) specification
https://w3id.org/mixs
Creative Commons Zero v1.0 Universal

add intersection slots to `extension_slot_diffrences` report #829

Closed: turbomam closed this issue 3 weeks ago

turbomam commented 3 months ago

Currently,

soil-vs-water-slot-usage.yaml: src/mixs/schema/mixs.yaml
    $(RUN) extension-differences \
        --schema $< \
        --ext1 Soil \
        --ext2 Water > $@

only returns

Soil_only:
- agrochem_addition
- al_sat
- al_sat_meth
- annual_precpt
- annual_temp
- crop_rotation
- cur_land_use
- cur_vegetation
- cur_vegetation_meth
- drainage_class
- extreme_event
- fao_class
- fire
- flooding
- heavy_metals
- heavy_metals_meth
- horizon_meth
- link_addit_analys
- link_class_info
- link_climate_info
- local_class
- local_class_meth
- micro_biomass_meth
- microbial_biomass
- ph_meth
- pool_dna_extracts
- prev_land_use_meth
- previous_land_use
- profile_position
- season_precpt
- season_temp
- sieving
- slope_aspect
- slope_gradient
- soil_horizon
- soil_texture
- soil_texture_meth
- soil_type
- soil_type_meth
- store_cond
- tillage
- tot_nitro_cont_meth
- tot_nitro_content
- tot_org_c_meth
- tot_org_carb
- water_cont_soil_meth
- water_content
Water_only:
- alkalinity
- alkalinity_method
- alkyl_diethers
- aminopept_act
- ammonium
- atmospheric_data
- bac_prod
- bac_resp
- bacteria_carb_prod
- biomass
- bishomohopanol
- bromide
- calcium
- carb_nitro_ratio
- chem_administration
- chloride
- chlorophyll
- conduc
- density
- diether_lipids
- diss_carb_dioxide
- diss_hydrogen
- diss_inorg_carb
- diss_inorg_nitro
- diss_inorg_phosp
- diss_org_carb
- diss_org_nitro
- diss_oxygen
- down_par
- fluor
- glucosidase_act
- light_intensity
- magnesium
- mean_frict_vel
- mean_peak_frict_vel
- n_alkanes
- nitrate
- nitrite
- nitro
- org_carb
- organism_count
- oxy_stat_samp
- part_org_carb
- part_org_nitro
- perturbation
- petroleum_hydrocarb
- phaeopigments
- phosphate
- phosplipid_fatt_acid
- photon_flux
- potassium
- pressure
- primary_prod
- redox_potential
- salinity
- samp_store_dur
- samp_store_loc
- samp_store_temp
- silicate
- size_frac_low
- size_frac_up
- sodium
- soluble_react_phosp
- sulfate
- sulfide
- suspend_part_matter
- tidal_stage
- tot_depth_water_col
- tot_diss_nitro
- tot_inorg_nitro
- tot_nitro
- tot_part_carb
- tot_phosp
- turbidity
- water_current

turbomam commented 3 months ago

This would be especially useful in support of the NMDC EcoFAB collaboration. @mslarae13

turbomam commented 3 months ago

Here's the proposed output, illustrated with MiscellaneousNaturalOrArtificialEnvironment and PlantAssociated:

MiscellaneousNaturalOrArtificialEnvironment_only:
- alkalinity
- alt
- ammonium
- biomass
- bromide
- calcium
- chloride
- chlorophyll
- density
- diether_lipids
- diss_carb_dioxide
- diss_hydrogen
- diss_inorg_carb
- diss_org_nitro
- diss_oxygen
- nitrate
- nitrite
- nitro
- org_carb
- org_matter
- org_nitro
- ph
- phosphate
- phosplipid_fatt_acid
- potassium
- pressure
- silicate
- sodium
- sulfate
- sulfide
- water_current
PlantAssociated_only:
- air_temp_regm
- ances_data
- antibiotic_regm
- biol_stat
- biotic_regm
- chem_mutagen
- climate_environment
- cult_root_med
- fertilizer_regm
- fungicide_regm
- gaseous_environment
- genetic_mod
- gravity
- growth_facil
- growth_habit
- growth_hormone_regm
- herbicide_regm
- host_age
- host_common_name
- host_disease_stat
- host_dry_mass
- host_genotype
- host_height
- host_length
- host_life_stage
- host_phenotype
- host_subspecf_genlin
- host_symbiont
- host_taxid
- host_tot_mass
- host_wet_mass
- humidity_regm
- light_regm
- mechanical_damage
- mineral_nutr_regm
- non_min_nutr_regm
- pesticide_regm
- ph_regm
- plant_growth_med
- plant_product
- plant_sex
- plant_struc
- radiation_regm
- rainfall_regm
- root_cond
- root_med_carbon
- root_med_macronutr
- root_med_micronutr
- root_med_ph
- root_med_regl
- root_med_solid
- root_med_suppl
- salt_regm
- samp_capt_status
- samp_dis_stage
- season_environment
- standing_water_regm
- tiss_cult_growth_med
- water_temp_regm
- watering_regm
intersection:
- chem_administration
- depth
- elev
- misc_param
- organism_count
- oxy_stat_samp
- perturbation
- project_name
- salinity
- samp_name
- samp_store_dur
- samp_store_loc
- samp_store_temp
- samp_vol_we_dna_ext
- temp

turbomam commented 3 months ago

poetry run extension-differences \
        --schema src/mixs/schema/mixs.yaml \
        --ext1 MiscellaneousNaturalOrArtificialEnvironment \
        --ext2 PlantAssociated > soil-vs-water-slot-usage.yaml

lschriml commented 3 months ago

This sounds similar to our previous work to define profiles for specific use cases. I fully support mixing and matching terms to be included in a new profile. Is that what this ticket is about? Or is this a suggestion of what terms should be included in a particular extension?

It is good to keep in mind that the MIxS content does not get changed for specific use cases. That kind of work is not part of the CIG/TWG. Rather, the MIxS terms can be used "mix and match" for projects. If this is for a non-GSC project, perhaps the conversation needs to be in that project's GitHub.

Cheers,
Lynn

turbomam commented 3 months ago

extension-differences is an existing discovery tool in the MIxS repo. It isn't specific to any external project and it was not intended to play any role in profiling.

When @mslarae13 and I were meeting with some EcoFAB scientists recently, I was trying to show which slots are available for MiscellaneousNaturalOrArtificialEnvironment vs PlantAssociated. I forgot that extension-differences doesn't currently report the terms that are applicable to both queried Extensions, so this issue and its corresponding PR add that functionality.
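
For readers who want to see the set logic behind such a report, here is a minimal sketch. It is not the actual extension-differences implementation: the function name `compare_slot_sets` is hypothetical, and the two slot lists are small hand-picked subsets of the lists shown earlier in this thread rather than values read from the schema.

```python
def compare_slot_sets(name1, slots1, name2, slots2):
    """Build a report with each extension's exclusive slots plus the shared ones."""
    s1, s2 = set(slots1), set(slots2)
    return {
        f"{name1}_only": sorted(s1 - s2),   # slots used only by the first extension
        f"{name2}_only": sorted(s2 - s1),   # slots used only by the second extension
        "intersection": sorted(s1 & s2),    # slots applicable to both extensions
    }


# Illustrative subsets only (assumption): taken from the lists in this thread.
misc_slots = ["alkalinity", "ph", "depth", "elev", "temp"]
plant_slots = ["host_age", "plant_sex", "depth", "elev", "temp"]

report = compare_slot_sets(
    "MiscellaneousNaturalOrArtificialEnvironment", misc_slots,
    "PlantAssociated", plant_slots,
)

# Print in the same list layout as the report shown above.
for key, slots in report.items():
    print(f"{key}:")
    for slot in slots:
        print(f"- {slot}")
```

The three keys partition the union of both slot lists, so every slot appears under exactly one of them.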

The intention here is to help a new MIxS user make a data-driven choice about which Extension they want to use, or whether they will need to take a "mix and match" approach.
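
One way such a choice could be made concrete is sketched below, under stated assumptions: the `coverage` helper and the `wanted` list are hypothetical, and the per-extension slot lists are again small subsets of the lists in this thread. The sketch checks which of a user's desired slots each Extension provides and which it lacks.

```python
def coverage(wanted, extension_slots):
    """Split the wanted slots into (covered, missing) for one extension."""
    wanted_set = set(wanted)
    available = set(extension_slots)
    return sorted(wanted_set & available), sorted(wanted_set - available)


# Hypothetical wishlist of slots a new user wants to record (assumption).
wanted = ["depth", "temp", "ph", "host_age"]

# Illustrative subsets only (assumption): taken from the lists in this thread.
extensions = {
    "MiscellaneousNaturalOrArtificialEnvironment": ["alkalinity", "ph", "depth", "elev", "temp"],
    "PlantAssociated": ["host_age", "plant_sex", "depth", "elev", "temp"],
}

results = {name: coverage(wanted, slots) for name, slots in extensions.items()}

for name, (covered, missing) in results.items():
    print(f"{name}: covers {covered}, missing {missing}")

# If every extension is missing something, a "mix and match" approach is needed.
needs_mix_and_match = all(missing for _, missing in results.values())
print("mix and match needed:", needs_mix_and_match)
```

With this wishlist, neither Extension covers every desired slot, which is the situation where combining terms from more than one Extension comes into play.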