Open aclum opened 5 months ago
Brodie samples (gold:Gs0135149)
name,depth.has_numeric_value,id
"Soil microbial communities from the East River watershed near Crested Butte, Colorado, United States - ER_145",5,igsn:IEWFS000I
"Soil microbial communities from the East River watershed near Crested Butte, Colorado, United States - ER_147",15,igsn:IEWFS000K
"Soil microbial communities from the East River watershed near Crested Butte, Colorado, United States - ER_135",15,igsn:IEWFS000B
"Soil microbial communities from the East River watershed near Crested Butte, Colorado, United States - ER_134",5,igsn:IEWFS000A
"Soil microbial communities from the East River watershed near Crested Butte, Colorado, United States - ER_146",5,igsn:IEWFS000J
Was the decision to REQUIRE depth in meters??? So, we should just show Depth, meters in the UI? Or do we need to put a unit on the value in the database?
@aclum @turbomam
To add, we decided on UCUM, which we haven't implemented... but that's `m`, right?
See below.. is it m or meter? cs_code or name? Did we make that decision?
Odd....
https://data.microbiomedata.org/details/sample/nmdc:bsm-11-yqhjes36 has " , meters" in depth
But
https://data.microbiomedata.org/details/sample/nmdc:bsm-11-bsf8yq62 doesn't have a unit...
How did that happen!?
GOLD vs not?
I assume change sheet is the best way to fix this?
Decision: the metadata should be complete. How the UI displays it does not limit what we store. Need to add 'meters' to the depth `has_unit` slot for these samples.
@bmeluch could you help make a change sheet? We can chat about it Tuesday
Any new changes should use 'm' to be more consistent with UCUM.
I agree on consistency and don't have any objection to UCUM's `m`.
We can get a report of the current `Biosample.depth`s with something like this:
wget -O biosample_depths.json \
"https://api.microbiomedata.org/nmdcschema/biosample_set?max_page_size=9999&projection=depth"
jq \
-r '.resources[] | [.id, .depth.has_raw_value, .depth.has_numeric_value, .depth.has_unit] | @tsv' \
biosample_depths.json > biosample_depths.tsv
cut -d $'\t' -f4 biosample_depths.tsv | sort | uniq -c
2347
4737 m
357 meter
681 meters
60 metre
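To see which records a change sheet would need to touch, the TSV from the commands above could be filtered for unit spellings other than `m`. This is only a rough sketch: it assumes the four-column layout written by the jq command (id, raw value, numeric value, unit), and the function name `list_nonstandard_units` is made up for illustration.

```shell
# Hypothetical helper: print id, numeric value, and current unit for every row
# whose unit (column 4) is a spelled-out variant of meter rather than UCUM "m".
# Assumes the 4-column TSV produced by the jq command above; rows with an
# empty unit are not listed here and would need a separate pass.
list_nonstandard_units() {
  awk -F '\t' 'BEGIN { OFS = "\t" }
    $4 == "meter" || $4 == "meters" || $4 == "metre" { print $1, $3, $4 }' "$1"
}

# Example usage:
#   list_nonstandard_units biosample_depths.tsv > biosample_depths_to_fix.tsv
```

Only the three spellings seen in the counts are matched, so anything unexpected would still need a manual look.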
I don't have any trick for doing something like that for the submissions in the submission portal
These are the counts from a mongo query.
@JamesTessmer @mslarae13 can you fix 1000 soils (`nmdc:sty-11-28tm5d36`). I believe these should all be in meters.

For `gold:Gs0135149`, these are EMSL only samples. @mslarae13 can you check these out.

`nmdc:sty-11-hdd4bf83`, this is TRiP. We can update the unit here to meters b/c has_numeric_value for all of these is 0.

We should think about if we want to store and display values of 0. It doesn't make much sense to store depth for animal host-associated samples.