Closed (turbomam closed this 2 years ago)
If we had a feature for merging several templates into one, would that satisfy what you are describing? If the different templates shared some columns, those could be merged into a new "shared" section. DataHarmonizer's JSON data structure for tabular columns should be amenable to this. The conversion to LinkML touches on this too: LinkML seems to offer the potential to merge n specifications into one.
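To make the merge idea concrete, here is a minimal sketch in Python. It assumes a deliberately simplified representation in which each template is just a name mapped to an ordered list of column names; this is not DataHarmonizer's actual JSON structure, and `merge_templates` is a hypothetical helper, not an existing DataHarmonizer or LinkML function. Columns present in every template are pulled into a "shared" section, and each template keeps only its remaining columns.

```python
from typing import Dict, List


def merge_templates(templates: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Merge templates into one, moving columns common to all into 'shared'.

    Hypothetical helper: `templates` maps a template name to its ordered
    column names. A template must not itself be named "shared".
    """
    if not templates:
        return {"shared": []}

    column_lists = list(templates.values())
    # A column counts as shared only if every template contains it.
    common = set(column_lists[0]).intersection(*column_lists[1:])

    # Keep shared columns in the order they appear in the first template,
    # dropping any repeats.
    shared = list(dict.fromkeys(c for c in column_lists[0] if c in common))

    merged = {"shared": shared}
    for name, columns in templates.items():
        # Each template section keeps only its package-specific columns.
        merged[name] = [c for c in columns if c not in common]
    return merged


# Illustrative input only; the column lists are not a real MIxS package spec.
example = {
    "soil": ["geo_loc_name", "env_broad_scale", "depth", "ph"],
    "water": ["geo_loc_name", "env_broad_scale", "depth", "salinity"],
}
merged = merge_templates(example)
```

With the illustrative input above, `merged` has a "shared" section holding `geo_loc_name`, `env_broad_scale` and `depth`, while "soil" keeps only `ph` and "water" keeps only `salinity`. Treating a column as shared only when every template has it is one possible design choice; a threshold such as "present in at least two templates" would be another.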
Yes, that might be sufficient.
Is a typical use-case that a given sample pertains to more than one MIxS package? Therefore a user wants to generate tabular data pertaining to just a few packages? And the remaining unselected packages would just have their fields dropped from managed/outputted data?
@turbomam Mark - I'd suggest beginning with a DH template per MIxS environment package. There are a few differences in required fields and in field semantics between packages. @ddooley's idea of merging templates (or mixins) might be useful: each MIxS environment package shares those ~10 required fields (geo location + EnvO triad).
I think this is resolved now, insofar as both Mark's branch (https://turbomam.github.io/DataHarmonizer/main.html) and the latest DH linkml-datastructure branch show multiple MIxS templates generated from a single LinkML set of MIxS files, with lots of field sections visible in them. OK to close this?
I think the section jumping plus the multiple-template menu from linkml-datastructure meet our needs in NMDC. Thanks.
Could DataHarmonizer support additional column groups?
One application I'm thinking of is support for all of the various MIxS packages in one template. I think there are 10 or 15 now. Maybe that's too many. Maybe I should just create a separate DataHarmonizer template for each package, because they could require different validation rules.