google / deepsomatic

DeepSomatic is an analysis pipeline that uses a deep neural network to call somatic variants from tumor-normal sequencing data.
BSD 3-Clause "New" or "Revised" License

PacBio case study #5

Closed by BiotechPedro 8 months ago

BiotechPedro commented 8 months ago

Hi :D

I am super excited by this release. It seems like an actually cool tool to try!

Regarding the PacBio data example, which sequencer was it produced on? I ask because PacBio is well known for long-read sequencing, but they have now also released the Onso short-read sequencer, so it would be nice if the example included details about the sequencing library and the sequencer.

Thank you,

Pedro

pichuan commented 8 months ago

Thank you for the feedback.

These data are originally from https://downloads.pacbcloud.com/public/revio/2023Q2/HCC1395/

which has the following description:

OVERVIEW
    This directory includes human whole genome sequencing datasets generated
    on the Revio system for the HCC1395 (tumor) and HCC1395-BL (matched normal) cell lines.

METHODS
    SAMPLES          ATCC genomic DNA CRL-2324D for HCC1395
                     ATCC genomic DNA CRL-2325D for HCC1395-BL
    SHEARING         Megaruptor 3 to target size of 15-20 kb
    LIBRARY PREP     SMRTbell prep kit 3.0
    SEQUENCING       Revio system, 24 hour movie
    ANALYSIS         Generate HiFi reads with methylation calls on the Revio system
                     Align to GRCh38_no_alt_analysis_set with pbmm2 v1.10.0

DATA
    Sample         Yield        Reads   Read length
    ----------  --------  -----------  ------------
    HCC1395-BL  128.0 Gb    7,591,952     16,863 bp
    HCC1395     180.9 Gb   12,662,343     14,285 bp

I'll make sure we add a pointer in the case study doc itself.
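
As a quick sanity check on the DATA table quoted above, the reported yield should be roughly the read count times the read length. The sketch below redoes that arithmetic and also estimates sequencing depth. Two things here are assumptions and not stated in the thread: that the "Read length" column is a mean read length, and that the genome size is about 3.1 Gb (`ASSUMED_GENOME_SIZE_BP` is a hypothetical constant chosen for illustration).

```python
# Figures copied from the DATA table in the quoted Revio dataset description.
SAMPLES = {
    "HCC1395-BL": {"reads": 7_591_952, "read_length_bp": 16_863},
    "HCC1395": {"reads": 12_662_343, "read_length_bp": 14_285},
}

# Assumption (not from the thread): approximate human genome size in bases.
ASSUMED_GENOME_SIZE_BP = 3.1e9


def estimated_yield_gb(reads, read_length_bp):
    """Total sequenced bases in Gb, treating read_length_bp as a mean."""
    return reads * read_length_bp / 1e9


def estimated_coverage(reads, read_length_bp,
                       genome_size_bp=ASSUMED_GENOME_SIZE_BP):
    """Rough average depth: total sequenced bases divided by genome size."""
    return reads * read_length_bp / genome_size_bp


if __name__ == "__main__":
    for name, sample in SAMPLES.items():
        yield_gb = estimated_yield_gb(**sample)
        depth = estimated_coverage(**sample)
        print(f"{name}: ~{yield_gb:.1f} Gb, ~{depth:.0f}x assumed depth")
```

The recomputed yields agree with the table (128.0 Gb and 180.9 Gb). Under the assumed genome size, the depth works out to roughly 41x for the normal and 58x for the tumor; these are back-of-the-envelope numbers, not figures from the dataset description.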

BiotechPedro commented 8 months ago

Oh, thank you! That's great :D