DRL / blobtools

Modular command-line solution for visualisation, quality control and taxonomic partitioning of genome datasets
GNU General Public License v3.0

Why use assembled genomes vs raw reads in example workflow 1 #107

Open Rob-murphys opened 3 years ago

Rob-murphys commented 3 years ago

I would leave this question on the Google Group, but the link is broken and I can't find the group myself.

Why does the example workflow on readme.io use an assembled genome rather than raw reads? With an assembly, how can we tell whether some of the sequence should not be there in the first place?

DRL commented 3 years ago

Hi Lamm-a,

sorry, i don't understand the question.

please elaborate.

cheers,

dom

Rob-murphys commented 3 years ago

Hi Dom,

I mean, why does Blobtools detect contamination in an already assembled genome rather than before assembly?

Wouldn't the downside of using an assembled genome be that we can't be sure whether some of the sequence in the assembly comes from a contaminant and should not be there in the first place?

DRL commented 3 years ago

blobtools just takes information about sequences (length, GC, sequence similarity, coverage, ...) and aggregates it for each sequence.
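As a rough illustration of that per-sequence aggregation (a minimal sketch, not blobtools' actual code), the snippet below computes length and GC for each assembled sequence and attaches coverage and a taxonomic label where available. The names `SeqSummary`, `parse_fasta`, `gc_fraction` and `summarise` are hypothetical, and the coverage and taxon dictionaries are assumed to have been derived elsewhere, for example from read mapping and similarity searches.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Optional, Tuple


@dataclass
class SeqSummary:
    """One row of per-sequence information (hypothetical structure)."""
    name: str
    length: int
    gc: float                        # fraction of G/C among A/C/G/T bases
    coverage: Optional[float] = None
    taxon: Optional[str] = None


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            # Use the first word of the header as the sequence name.
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gc_fraction(seq: str) -> float:
    """GC fraction, ignoring N and other ambiguous bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else 0.0


def summarise(
    fasta_lines: Iterable[str],
    coverage: Dict[str, float],
    taxa: Dict[str, str],
) -> Dict[str, SeqSummary]:
    """Aggregate length, GC, coverage and taxon for each sequence."""
    table = {}
    for name, seq in parse_fasta(fasta_lines):
        table[name] = SeqSummary(
            name=name,
            length=len(seq),
            gc=gc_fraction(seq),
            coverage=coverage.get(name),  # None if no coverage was supplied
            taxon=taxa.get(name),         # None if no similarity hit was supplied
        )
    return table


if __name__ == "__main__":
    fasta = [">contig_1 example", "ACGT", "GGCC", ">contig_2", "ATATNN"]
    table = summarise(fasta, {"contig_1": 30.5}, {"contig_1": "Nematoda"})
    for row in table.values():
        print(row)
```

Each row describes a whole contig, which is why the information is collected per assembled sequence: GC, coverage and similarity hits are much more stable over a contig than over a single short read.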

> I mean, why does Blobtools detect contamination in an already assembled genome rather than before assembly?

before assembly there are only reads ... are you asking why we are not doing something with the reads?

> Wouldn't the downside of using an assembled genome be that we can't be sure whether some of the sequence in the assembly comes from a contaminant and should not be there in the first place?

I don't understand what you think the alternative is.

Rob-murphys commented 3 years ago

> before assembly there are only reads ... are you asking why we are not doing something with the reads?

Yes. Would it be possible to use mapped raw reads in a similar fashion?

Sorry if this is an ignorant question.