obi-ontology / obi

The Ontology for Biomedical Investigations
http://obi-ontology.org
Creative Commons Attribution 4.0 International

Representing specific flow cells for DNA sequencers #1820

Open cmungall opened 2 months ago

cmungall commented 2 months ago

cc @turbomam

If we need to represent Illumina NovaSeq 6000 S4, I assume we would create a subclass of

id: OBI:0400043
name: flow cell
def: "Aparatus in the fluidic subsystem where the sheath and sample meet. Can be one of several types; jet-in-air, quartz cuvette, or a hybrid of the two. The sample flows through the center of a fluid column of sheath fluid in the flow cell." []
synonym: "flow_cell" EXACT []
relationship: RO:0000085 OBI:0000370 ! contain function
is_a: OBI:0000967 ! container

and then make a part_of link to the sequencer, e.g.

id: OBI:0002630
name: Illumina NovaSeq 6000
def: "A DNA sequencer which is manufactured by the Illumina corporation, with two flow cells and an output of up to 6000 Gb (32-40 B reads per run). The sequencer utilizes synthesis technology and patterned flow cells to optimize throughput and even spacing of sequencing clusters." []
synonym: "NovaSeq 6000" EXACT []
is_a: OBI:0400103 ! DNA sequencer

Would this be a new robot template? E.g. something like:

| Sequencer | Flow cell | Description |
| --- | --- | --- |
| Illumina HiSeq 2500 | High Output flow cell | Higher throughput but longer run times than Rapid Run mode |
| Illumina HiSeq 2500 | Rapid Run flow cell | Faster run times but lower output than High Output mode |
| Illumina HiSeq 3000 | patterned flow cell | Similar to HiSeq 4000 but with lower throughput |
| Illumina HiSeq 4000 | patterned flow cell | High-output flow cell for various applications |
| Illumina HiSeq X | patterned flow cell | High-throughput flow cell optimized for human whole-genome sequencing |
| Illumina iSeq 100 | iSeq 100 i1 flow cell | Single flow cell type for iSeq 100 |
| Illumina MiSeq | v1 | Lowest output flow cell for MiSeq |
| Illumina MiSeq | v2 | Mid-range output flow cell for MiSeq |
| Illumina MiSeq | v3 | Highest output flow cell for MiSeq |
| Illumina NextSeq 550 | High Output flow cell | Higher capacity flow cell for NextSeq 550 |
| Illumina NextSeq 550 | Mid Output flow cell | Lower capacity flow cell for NextSeq 550 |
| Illumina NovaSeq 6000 | S1 | Lower output flow cell for NovaSeq 6000 |
| Illumina NovaSeq 6000 | S2 | Mid-range output flow cell for NovaSeq 6000 |
| Illumina NovaSeq 6000 | S4 | Highest output flow cell for NovaSeq 6000, up to 10B reads |
| Illumina NovaSeq 6000 | SP | Lowest output flow cell for NovaSeq 6000, fastest run times |
| Ion Torrent PGM | Ion 314 chip | Lowest throughput chip for PGM |
| Ion Torrent PGM | Ion 316 chip | Mid-range throughput chip for PGM |
| Ion Torrent PGM | Ion 318 chip | Highest throughput chip for PGM |
| Ion Torrent Proton | Ion PI chip | High-throughput chip for Proton |
| Ion Torrent S5 | Ion 520 chip | Low to mid-range throughput chip for S5 |
| Ion Torrent S5 | Ion 530 chip | Mid to high-range throughput chip for S5 |
| Ion Torrent S5 | Ion 540 chip | Highest throughput chip for S5 |
| Oxford Nanopore GridION | R9.4.1 flow cell | Standard flow cell for GridION (same as MinION) |
| Oxford Nanopore MinION | R10.3 flow cell | Newer flow cell with improved accuracy for MinION |
| Oxford Nanopore MinION | R9.4.1 flow cell | Standard flow cell for MinION |
| Oxford Nanopore PromethION | R9.4.1 flow cell | High-capacity flow cell for PromethION |
| Pacific Biosciences Sequel | SMRT Cell 1M | Standard flow cell for original Sequel system |
| Pacific Biosciences Sequel II | SMRT Cell 8M | High-capacity flow cell for Sequel II system |

turbomam commented 2 months ago

I think this has potential, but OBI:0400043 is a cytometry flow cell, not a sequencer flow cell. I'm pretty sure there's no such thing as a jet-in-air sequencer flow cell.

turbomam commented 2 months ago

I like this but it will result in a lot of pre-composed classes over time. Who's going to maintain them, or keep the set complete?
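
For illustration, the sequencer / flow cell table proposed earlier in this thread could be kept as plain data and turned into a ROBOT template with a small script. The following is only a sketch under stated assumptions: the parent label `DNA sequencer flow cell` is hypothetical (no such OBI class is confirmed in this thread), the column layout is one possible choice rather than OBI's actual template, and the ID column is left blank because real identifiers would have to be assigned by the OBI editors.

```python
import csv
import io

# A few rows copied from the table proposed earlier in this thread.
ROWS = [
    ("Illumina NovaSeq 6000", "S1", "Lower output flow cell for NovaSeq 6000"),
    ("Illumina NovaSeq 6000", "S4",
     "Highest output flow cell for NovaSeq 6000, up to 10B reads"),
    ("Illumina MiSeq", "v3", "Highest output flow cell for MiSeq"),
]

# Hypothetical parent class label (an assumption, not an existing OBI term).
PARENT_LABEL = "DNA sequencer flow cell"


def build_template(rows, parent_label=PARENT_LABEL):
    """Return a tab-separated ROBOT-style template as a string."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # Row 1: human-readable column headers.
    writer.writerow(["ID", "label", "comment", "parent", "part of"])
    # Row 2: ROBOT template strings (column choice is an assumption).
    writer.writerow(
        ["ID", "LABEL", "A rdfs:comment", "SC %", "SC 'part of' some %"]
    )
    for sequencer, flow_cell, description in rows:
        # Pre-composed label: sequencer name followed by flow cell name.
        label = f"{sequencer} {flow_cell}"
        # ID is left empty on purpose; editors would assign real OBI IDs.
        writer.writerow(["", label, description, parent_label, sequencer])
    return out.getvalue()


if __name__ == "__main__":
    print(build_template(ROWS), end="")
```

Keeping the rows as data means the completeness question becomes one of maintaining a table rather than hand-editing classes, though it does not by itself answer who maintains that table.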

bpeters42 commented 2 months ago

I think we should share with Chris the issues we have had in trying to maintain definitions for instruments. We initially wanted to follow an approach like the one Chris has here, but the problems are that:



turbomam commented 2 months ago

OBI's root DNA sequencer class (OBI:0400103) can be found in OLS.

aclum commented 2 months ago

Mark and I talked about this a bit offline. I care less about the flow cell IDs, since the flow cell impacts throughput but not analysis. What I would like to see is 1) Illumina NovaSeq X added and 2) groupings by series (e.g. Illumina HiSeq, Illumina NovaSeq). Grouping makes the ontology easier to read, and we have cases where we don't have the model number (e.g. we know it is an Illumina NextSeq but not whether it is a 500 or a 550).

bpeters42 commented 2 months ago

Both of those requests are straightforward to implement, and make perfect sense. I am hoping @turbomam can compile all the manufacturer / series you will need, and get it into OBI.

sebastianduesing commented 2 months ago

Illumina NovaSeq X was added recently in PR 1804, which will go live in the next OBI release.

mgiglio99 commented 1 month ago

I second the motion for grouping terms for the sequencer series. I've also run into situations of not knowing which exact member of the series is in play.

mgiglio99 commented 1 month ago

Discussed on the Sept 30 call: general agreement that grouping terms for the sequencer series are a good idea. Mark is going to provide a table with info on related terms from different sources. He will propose grouping terms along the lines of this example: 'Illumina HiSeq Series sequencer'.
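
As a rough illustration of the grouping idea, the sketch below maps sequencer model labels to a series-level grouping label. Only 'Illumina HiSeq Series sequencer' comes from this thread; the other series names, the prefix list, and the function names are assumptions made for the example, not proposed OBI terms.

```python
from collections import defaultdict

# Series prefixes are assumptions, modeled on the single example
# 'Illumina HiSeq Series sequencer' given in the thread.
SERIES_PREFIXES = [
    "Illumina HiSeq",
    "Illumina MiSeq",
    "Illumina NextSeq",
    "Illumina NovaSeq",
]


def series_label(model_label):
    """Return the grouping label for a model, or None if no series matches."""
    for prefix in SERIES_PREFIXES:
        if model_label.startswith(prefix):
            return f"{prefix} Series sequencer"
    return None


def group_by_series(model_labels):
    """Group model labels under their series label; unmatched go under None."""
    groups = defaultdict(list)
    for label in model_labels:
        groups[series_label(label)].append(label)
    return dict(groups)


if __name__ == "__main__":
    models = ["Illumina HiSeq 2500", "Illumina HiSeq 4000", "Illumina NextSeq 550"]
    for series, members in group_by_series(models).items():
        print(series, members)
```

A record that only says "Illumina NextSeq", with no model number, would still resolve to a series-level label here, which is the case raised earlier in the thread.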

turbomam commented 1 month ago

I have started two new issues and propose closing this issue about flow cells:

  1. https://github.com/obi-ontology/obi/issues/1824
  2. https://github.com/obi-ontology/obi/issues/1825
    • following the summary @mgiglio99 gave above