nih-cfde / ontologyWG

OBI term needs for Kids First #1

Open mgiglio99 opened 3 years ago

mgiglio99 commented 3 years ago

From Allison Heath and Eric Wenger:

Good morning, quick question on what the Ontology WG recommends for C2M2 OBI IDs in several cases where we are not seeing matches. Kids First data includes the following assays, which have no associated OBI match:
- Linked-Read WGS (10x Chromium)
- Targeted Sequencing

The other assays, for which we do have matches, are below:
- WGS (Whole Genome Sequencing): http://purl.obolibrary.org/obo/OBI_0002117 (OBI:0002117)
- WXS (Whole Exome Sequencing): http://purl.obolibrary.org/obo/OBI_0002118 (OBI:0002118)
- miRNA ("small RNA sequencing assay"): http://purl.obolibrary.org/obo/OBI_0002112 (OBI:0002112)
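For anyone preparing a submission, the two identifier forms quoted above (the full PURL and the compact `OBI:NNNNNNN` form) can be converted mechanically. The following is a minimal sketch, not part of any CFDE tooling: the function names and the `KIDS_FIRST_ASSAYS` dictionary are hypothetical, and the only data used are the three matches listed in this comment.

```python
import re

# Base of the OBO Foundry PURLs quoted in the thread.
OBO_PURL_PREFIX = "http://purl.obolibrary.org/obo/"

# Compact form such as "OBI:0002117": letters, a colon, then digits.
_CURIE_RE = re.compile(r"^([A-Za-z]+):(\d+)$")


def curie_to_purl(curie: str) -> str:
    """Turn 'OBI:0002117' into its full PURL."""
    match = _CURIE_RE.match(curie)
    if match is None:
        raise ValueError(f"not a compact ontology ID: {curie!r}")
    prefix, local_id = match.groups()
    return f"{OBO_PURL_PREFIX}{prefix}_{local_id}"


def purl_to_curie(purl: str) -> str:
    """Turn a full OBO PURL back into the compact form."""
    if not purl.startswith(OBO_PURL_PREFIX):
        raise ValueError(f"not an OBO PURL: {purl!r}")
    prefix, _, local_id = purl[len(OBO_PURL_PREFIX):].partition("_")
    if not prefix or not local_id.isdigit():
        raise ValueError(f"unexpected PURL shape: {purl!r}")
    return f"{prefix}:{local_id}"


# Hypothetical lookup holding only the matches listed in this comment.
KIDS_FIRST_ASSAYS = {
    "WGS": "OBI:0002117",
    "WXS": "OBI:0002118",
    "miRNA": "OBI:0002112",
}

for assay, curie in KIDS_FIRST_ASSAYS.items():
    print(assay, curie, curie_to_purl(curie))
```

This only checks the shape of an identifier; it does not confirm that the term exists in OBI.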

ericwenger-pm commented 3 years ago

Just an additional question related to the "other" category (i.e. non-matches for OBI ID): currently on the CFDE Portal there is a category of "other" that is associated with a submission. Was the approach to create a synthetic OBI ID or some other identifier and add it as a reference in the assay_type table? Per the specs, only a valid OBI ID is acceptable for assay_type in the file table.

Example below: [image: screenshot, 6-17-21 at 4:56 PM]

mgiglio99 commented 3 years ago

Hi Eric, My understanding is that the "Other" category exists only for visualization purposes. Because the various DCCs can use a large number of individual terms, there are too many to show in different colors in a stacked bar chart. So, I think what happens is that the portal takes the ~12 terms with the most usage and assigns colors to those for display, then groups all of the remaining ones together under "Other". This arrangement is not ideal and is one reason we are developing the ontology slims, so that related terms can be grouped together under more general parents, such that the sections in the stacked bars will be more comprehensive and there won't be an "other" category anymore.
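The grouping described above can be sketched roughly as follows. This is an illustration of the idea only, not the portal's actual code: the function name, the `top_n` default of 12 (taken from the "~12" above), and the usage counts in the example are all assumptions.

```python
from collections import Counter


def group_for_display(term_counts, top_n=12, other_label="Other"):
    """Keep the top_n most-used terms and fold the rest into one bucket.

    term_counts maps a term ID to how many files use it.
    """
    counts = Counter(term_counts)
    top = counts.most_common(top_n)
    grouped = dict(top)
    # Everything that did not make the cut is summed into a single entry.
    remainder = sum(counts.values()) - sum(count for _, count in top)
    if remainder > 0:
        grouped[other_label] = remainder
    return grouped


# Made-up usage counts, for illustration only.
example = {
    "OBI:0002117": 50,
    "OBI:0002118": 30,
    "OBI:0002112": 5,
    "OBI:0003412": 2,
}
print(group_for_display(example, top_n=2))
```

With `top_n=2`, the two least-used terms in the example are merged into a single "Other" entry, which is why a term with a valid OBI ID can still appear under "Other" in the chart.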

mgiglio99 commented 3 years ago

Hi @ericwenger-pm and @allisonheath,

Apologies for the delay in addressing your term needs. As we get started working on these, I want to make sure that we know all of the terms you need. We have on our list:
- Linked-Read WGS (10x Chromium)
- Targeted Sequencing

Are there any others? Also, we think that we should request a term specific to miRNA sequencing, as opposed to you just using 'small RNA sequencing'.

Thanks, Michelle

mgiglio99 commented 2 years ago

Hi, Sorry to be getting back to this so late.

new term: linked-read sequencing assay
parent: DNA sequencing assay, OBI:0000626
definition: A DNA sequencing assay that uses microfluidics to partition and barcode high molecular weight DNA such that short reads derived from fragments of the large piece of DNA can be assembled within the context of the high molecular weight piece of DNA they are derived from, facilitating the use of short read data to sequence and assemble large genomes.

Regarding the targeted sequencing term: could you please provide more information about exactly what you think would be encompassed by this? How does it relate to amplicon sequencing or exome sequencing? I think these would be types of targeted sequencing; do you agree? Also, do you think every targeted sequencing assay would be of DNA, or would this include targeted RNA sequencing as well? Here's a first draft to start with:

new term: targeted sequencing assay
parent: DNA sequencing assay, OBI:0000626
definition: A DNA sequencing assay in which specific areas of a DNA sequence are sequenced to the exclusion of others.

Thanks, Michelle and Suvvi

mgiglio99 commented 2 years ago

Hello, Following up on this.

We have an OBI ID for the linked-read sequencing term:

new term label: 'linked-read sequencing assay'
OBI term: OBI:0003412

You can start using this term ID at any time.

Regarding microRNA sequencing, we identified an existing OBI term that you can use: http://purl.obolibrary.org/obo/OBI_0001922 (OBI:0001922), 'microRNA profiling by high throughput sequencing assay'

Regarding the targeted sequencing term, as we mentioned in our last post, we need a bit more information from you to know how to structure and place this term correctly.

Regards, Michelle and Suvvi