Closed ssarrafan closed 2 years ago
Brandon, please add T-Shirt size and any questions. Team is trying to get an idea of the time/effort to do this.
Please provide information/documentation about how to know if a term comes from MIxS. Is this consumable from code, or will we need to maintain our own mappings? Where should we be looking for this info?
Thanks.
@ssarrafan this should be reassigned to Mark M. and Bill. Mark is currently working with Montana to identify sources of each term (where they came from) for display on the metadata submission interface. Once that information is available, hopefully Bill can include that information in the schema so that Brandon can also display it via the portal.
@pvangay ok I've assigned this to Bill and Mark. Is the expectation that this will be done this month or can I move this to the January sprint?
good question for @turbomam
there are a variety of ways to programmatically extract this from the schema
I can advise but need more information about the overall dataflow. I am assuming for UI purposes you will want a ready-made json blob containing all metadata about a field including source, description, hyperlinks for more info etc. Our libs for doing this are python but we can easily precompile json for you.
I'm already consuming the nmdc-schema repo as a git submodule, so any JSON file that exists in that repo is something I can grab and use. Other kinds of data (xml, yaml) would probably also be OK, but JSON is preferable.
@subdavis The MIxS terms are in mixs.yaml, located in the directory here:
https://github.com/microbiomedata/nmdc-schema/tree/main/src/schema
You can load the yaml directly yourself and convert to json or I can add a util to do this. What do you prefer?
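A minimal sketch of the "convert it yourself" option, assuming PyYAML is available and that the slot definitions sit under the top-level slots key as in other LinkML schemas. The function names load_schema and slots_to_json are hypothetical and are not part of nmdc-schema.

```python
import json

try:
    import yaml  # PyYAML; assumed to be installed where the schema is read
except ImportError:
    yaml = None


def load_schema(path):
    """Parse a schema YAML file such as src/schema/mixs.yaml into a dict."""
    if yaml is None:
        raise RuntimeError("PyYAML is required to read the schema file")
    with open(path, encoding="utf-8") as handle:
        return yaml.safe_load(handle)


def slots_to_json(schema):
    """Serialize the slot definitions as a JSON object keyed by slot name.

    Assumption: slots live under the top-level "slots" key.
    """
    slots = schema.get("slots") or {}
    return json.dumps(slots, indent=2, sort_keys=True)
```

Something like slots_to_json(load_schema("src/schema/mixs.yaml")) could then be written to a file next to the YAML so that it is picked up through the git submodule.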
@wdduncan can this issue be closed?
@ssarrafan I do not know. What do you think @subdavis ?
Sorry, I'm late to the game.
Where should it be indicated that a term comes from MIxS?
If a term is to be used in the NMDC DataHarmonizer, it will be marked with a disposition of borrowed or use as-is on the mixs_packages_x_slots tab of Soil-NMDC-Template_Compiled.
Slots/columns that are modifications of a MIxS slot appear in mixs_modified_slots.
I will be proposing a new structure for this Google Sheet soon, so some of that may become moot.
In terms of how the terms appear in DataHarmonizer, that will be determined by the section column in those two sheets. I believe @mslarae13 is assigning the MIxS as-is, borrowed and modified terms to DH sections whose names will indicate which terms "come from" MIxS. @sujaypatil96 and I are working on the section assignment now.
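As a rough sketch of how the disposition and section columns could be consumed, assuming the tab is exported as CSV. The column headers slot, disposition and section are guesses at the sheet layout, and the function name is hypothetical; only the disposition values borrowed and use as-is come from the description above.

```python
import csv
import io
from collections import defaultdict

# Disposition values that mark a term as coming from MIxS.
MIXS_DISPOSITIONS = {"borrowed", "use as-is"}


def mixs_terms_by_section(csv_text):
    """Group MIxS-derived slot names by their DataHarmonizer section.

    Assumption: the export has "slot", "disposition" and "section" columns.
    """
    sections = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        disposition = (row.get("disposition") or "").strip().lower()
        if disposition in MIXS_DISPOSITIONS:
            section = (row.get("section") or "").strip()
            sections[section].append(row["slot"].strip())
    return dict(sections)
```

Since the sheet structure is expected to change, the column names would need to be adjusted to whatever the new layout uses.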
Should be a small amount of effort. Also, I won't be able to directly map lat_lon to latitude and longitude, so we should talk about what sort of interventions are needed for edge cases like that.
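One possible intervention for edge cases like this is a small hand-maintained override table on the portal side. This is only a sketch: FIELD_TO_SLOT_OVERRIDES and slot_for_field are hypothetical names, and it assumes every other portal field shares its name with a schema slot.

```python
# Hypothetical overrides: portal field name -> schema slot name,
# for fields that do not share a name with the slot they come from.
FIELD_TO_SLOT_OVERRIDES = {
    "latitude": "lat_lon",
    "longitude": "lat_lon",
}


def slot_for_field(field_name, slots):
    """Return the schema slot name backing a portal field, or None if unknown."""
    slot_name = FIELD_TO_SLOT_OVERRIDES.get(field_name, field_name)
    return slot_name if slot_name in slots else None
```

With this, both latitude and longitude would pick up the source and description of lat_lon, and everything else falls through to a direct name match.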
Based on the recent comments I will move this one to February. @turbomam and @wdduncan let me know if it should be in the backlog or assigned to someone else.
@subdavis I can create a json file on the github repo, or you can convert it yourself. Just let me know which you prefer. As for the lat_lon issue, I don't know what the best solution is. @dehays perhaps we can discuss this at the metadata meeting. @subdavis It would be helpful if you could attend the meeting too.
I am really late to this game! But saw the message in slack & checked this out. Is this for the data harmonizer or read the docs / schema definitions?
See work discussed in this ticket https://github.com/microbiomedata/nmdc-schema/issues/252
There are roughly 100 elements in src/schema/mixs.yaml that already have an in_subset, like environment for elev.
What are the consequences of assigning more than one subset?
Syntactically, in_subset is multivalued.
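Because in_subset is multivalued, a consumer probably has to treat it as a list and accept that one slot can show up under several subsets. A minimal sketch, assuming the slots have already been loaded into a dict keyed by slot name; subsets_of and slots_by_subset are hypothetical helpers, and any subset name other than environment is made up for illustration.

```python
from collections import defaultdict


def subsets_of(slot):
    """Return a slot's in_subset value as a list, whether it was a scalar, a list, or absent."""
    value = (slot or {}).get("in_subset")
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def slots_by_subset(slots):
    """Index slot names by subset; a slot in two subsets is listed under both."""
    index = defaultdict(list)
    for name, slot in slots.items():
        for subset in subsets_of(slot):
            index[subset].append(name)
    return dict(index)
```

The practical consequence for a UI is that anything driven by subset membership would need a rule for slots that appear in more than one group.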
@wdduncan and @turbomam can we close this issue? Seems like the work is being tracked under nmdc-schema#252.
Removed @wdduncan per his note.
Hi Set. Here is an update:
https://github.com/microbiomedata/nmdc-schema/issues/134 This is an ongoing issue that will need to be passed on to Mark or Sujay (not sure which).
https://github.com/microbiomedata/nmdc-runtime/issues/89 I updated the comment on this. I should be able to get the change sheet edit done before I leave.
https://github.com/microbiomedata/nmdc-schema/issues/195 I am working with Sujay on this. I should be able to close before the week’s end.
https://github.com/microbiomedata/nmdc-runtime/issues/46 This is an ongoing issue that will need to be passed on to Mark or Sujay (not sure which).
https://github.com/microbiomedata/nmdc-server/issues/555 There is a lot of conversation in this thread. But, it looks like Mark has taken this one over.
I think (but haven't proven yet) that all MIxS slots in https://github.com/microbiomedata/nmdc-schema/blob/issue-291-mixs-submod/src/schema/mixs_6_for_nmdc.yaml are annotated as follows:
from_schema: http://w3id.org/mixs/terms
Having said that, mixs_6_for_nmdc.yaml hasn't been merged into the main branch yet.
I'm pretty sure that those annotations appear in @sujaypatil96's new-ish gen-linkml JSON, but I haven't confirmed yet.
But the from_schema may change after further imports/merges.
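If the from_schema annotation does hold, a check on the consuming side could look roughly like the sketch below. It assumes slot definitions have been loaded into a dict keyed by slot name; is_from_mixs and mixs_slot_names are hypothetical helpers, and the prefix would need updating if the annotation changes after further imports/merges.

```python
# Prefix of the from_schema value quoted above (http://w3id.org/mixs/terms).
MIXS_SCHEMA_PREFIX = "http://w3id.org/mixs"


def is_from_mixs(slot):
    """True if the slot's from_schema annotation points at the MIxS schema."""
    from_schema = (slot or {}).get("from_schema") or ""
    return from_schema.startswith(MIXS_SCHEMA_PREFIX)


def mixs_slot_names(slots):
    """Return the sorted names of all slots annotated as coming from MIxS."""
    return sorted(name for name, slot in slots.items() if is_from_mixs(slot))
```

Matching on a prefix rather than the full URI is a design choice to tolerate small changes in the path; an exact match would be stricter.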
Yes, switch to source
Solution in https://github.com/microbiomedata/nmdc-schema/pull/292
closing this issue in anticipation of a merge in May 2022
Indicate when a term is from MIxS
Related to #448