genome-in-a-bottle / giab_data_indexes

This repository contains data indexes from NIST's Genome in a Bottle project.

PromethION data de novo assembly? #23

Open Samvkes opened 1 year ago

Samvkes commented 1 year ago

Hi, in the README for the ONT-PromethION datasets (https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/data/AshkenazimTrio/HG002_NA24385_son/UCSC_Ultralong_OxfordNanopore_Promethion/), the 'Data Processing Methods' section mentions alignment of the called reads. But the linked paper (https://doi.org/10.1101/715722) explains that the dataset was assembled de novo, with alignment done only afterward for benchmarking (unless I'm misunderstanding).

Also in the README under the 'Data Processing Methods'-header, a newer version of Guppy is mentioned than the one used in the paper, which suggests to me that that part was added more recently, and it contains no information on assembly at all. Does that mean that newer versions of the data are no longer generated de novo?

jzook commented 1 year ago

Yes, these are only mapped reads and not assemblies. ONT assembly methods are quickly evolving, but one recent preprint with an assembly of HG002 ONT reads is at https://doi.org/10.1101/2023.01.12.523790

In reply to Samvkes's post of Mon, Jun 5, 2023, 7:20 AM.
