Open turbomam opened 8 months ago
This sounds like something that should be discussed in person, as it seems we may be talking at cross purposes. And you are correct that it has not been discussed in depth yet.
On Thu, 19 Oct 2023, 20:18 Mark A. Miller, @.***> wrote:
In the MIxS 6.1 Excel sheet, terms can have a 'Section' value. That has been implemented with subset definitions https://github.com/GenomicsStandardsConsortium/mixs/blob/a66c92b9d7d68b0bfefd9dacb54081c261af4a9d/src/mixs/schema/mixs.yaml#L24-L29C17 and in_subset https://github.com/GenomicsStandardsConsortium/mixs/blob/a66c92b9d7d68b0bfefd9dacb54081c261af4a9d/src/mixs/schema/mixs.yaml#L3959-L3971 assertions in the v6.2.0 LinkML YAML file.
The MIxS sections were:
- sequencing
- environment
- nucleic acid sequence source
- investigation
In my eye, those provide a useful and actionable grouping of slots. For example, from an NMDC perspective, adapters is not an attribute of a sample, but rather an attribute of a sequencing process. In some data serializations I do understand that it might be "practical" to bind adapters to a sample in a shortcut relationship.
I don't feel like sections or subsets have been discussed as much as Checklist classes, Extension classes and terms/slots. I don't think we have any records that capture our thoughts as well as these notes about slot attributes https://docs.google.com/document/d/1LzNvt3b09JSNxlf2e2IBwzc33AXe9-mc1GHUR0iBIhg/edit#bookmark=id.gipej8ej40qs .
We have a few ways to group slots thematically in LinkML, like in_subset, is_a (hierarchy) or slot_group. However, if we want to retain grouping like that but don't like the language "subset", then we can use a different heading in the documentation pages.
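The three grouping options named above could look roughly like the following in a LinkML schema. This is a minimal sketch, not taken from the MIxS schema: only the `adapters` slot and the `sequencing` section name come from this thread, while `sequencing_slot`, `sequencing_group` and the description text are hypothetical names chosen for illustration. Each option is shown as a separate YAML document so the alternatives do not collide.

```yaml
# Option 1: in_subset (the approach the ticket says v6.2.0 uses for sections)
subsets:
  sequencing:
    description: Slots that describe a sequencing process  # wording assumed
slots:
  adapters:
    in_subset:
      - sequencing
---
# Option 2: is_a hierarchy, using a hypothetical abstract parent slot
slots:
  sequencing_slot:        # hypothetical name
    abstract: true
  adapters:
    is_a: sequencing_slot
---
# Option 3: slot_group, pointing at a hypothetical grouping slot
slots:
  sequencing_group:       # hypothetical name
    is_grouping_slot: true
  adapters:
    slot_group: sequencing_group
```

With `in_subset` the grouping lives in a separate `subsets` block. With `is_a` and `slot_group` the group is itself a slot, so it takes part in the slot namespace.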
I am asking to discuss this in a TWG or CIG meeting and make some link-able notes, but won't object to postponing it for a while.
Note that I unilaterally added a combination_classes subset to v6.2.0. That's more infrastructural than thematic. I wanted to simplify special handling for combination classes, but for now they can also be distinguished by the fact that they both have is_a relations and use mixins. I'm not convinced that those will always be good differentia. So that might argue for implementing the old sections as slot_groups, which are not used in v6.2.0 and don't have any baggage.
Sure, discussing in person is a good idea.
Executive summary:
only1chunts update to this ticket:
There are now #771 and #772 for discussion on the definition of what section means and what the values of section could be. I would like to alter the focus of this ticket to be more technical, i.e. while the CIG refers to this as `section`, the LinkML implementation will probably use something else, and as the original ticket content (below) gives various options for that I want this ticket to focus on that aspect.
E.g. `in_subset`, `is_a` (hierarchy) or `slot_group`.
turbomam's original ticket content:
In the MIxS 6.1 Excel sheet, terms can have a 'Section' value. That has been implemented with subset definitions and in_subset assertions in the v6.2.0 LinkML YAML file.
The MIxS sections were:
- sequencing
- environment
- nucleic acid sequence source
- investigation
In my eye, those provide a useful and actionable grouping of slots. For example, from an NMDC perspective, `adapters` is not an attribute of a sample, but rather an attribute of a sequencing process. In some data serializations I do understand that it might be "practical" to bind `adapters` to a sample in a shortcut relationship.
I don't feel like sections or subsets have been discussed as much as `Checklist` classes, `Extension` classes and `terms/slots`. I don't think we have any records that capture our thoughts as well as these notes about slot attributes.
We have a few ways to group slots thematically in LinkML, like `in_subset`, `is_a` (hierarchy) or `slot_group`. However, if we want to retain grouping like that but don't like the language "subset", then we can use a different heading in the documentation pages.
I am asking to discuss this in a TWG or CIG meeting and make some link-able notes, but won't object to postponing it for a while.
_Note that I unilaterally added a `combination_classes` subset to v6.2.0_. That's more infrastructural than thematic. I wanted to simplify special handling for combination classes, but for now they can also be distinguished by the fact that they both have `is_a` relations and use `mixins`. I'm not convinced that those will always be good differentia. So that might argue for implementing the old sections as `slot_groups`, which are not used in v6.2.0 and don't have any baggage.
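To make the combination-class point concrete, here is a hedged sketch of how such a class might be marked both structurally (`is_a` plus `mixins`) and explicitly (`in_subset`). The class names are hypothetical placeholders, and which parent is the `is_a` target versus the mixin is an assumption, not a statement about the actual v6.2.0 file.

```yaml
subsets:
  combination_classes:
    description: Classes that combine a checklist with an extension  # wording assumed

classes:
  ExampleChecklist:            # hypothetical checklist class
    mixin: true
  ExampleExtension: {}         # hypothetical extension class
  ExampleChecklistExampleExtension:
    is_a: ExampleExtension     # assumed direction of inheritance
    mixins:
      - ExampleChecklist
    in_subset:
      - combination_classes    # explicit marker, independent of is_a/mixins
```

The explicit `in_subset` marker keeps working even if other classes later use both `is_a` and `mixins`, which is the concern raised above about relying on those as differentia.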