artic-network / artic-ncov2019

ARTIC nanopore protocol for nCoV2019 novel coronavirus
Creative Commons Attribution 4.0 International

Adapters not removed from SP1 reads #26

Closed. MaestSi closed this issue 4 years ago

MaestSi commented 4 years ago

Hi, I tried out the artic pipeline v1.0.0 on the SP1 test data, running the following commands after basecalling:

artic guppyplex --min-length 400 --max-length 700 --directory basecalling/ --prefix SP1
mv SP1_.fastq SP1.fastq
artic minion --normalise 200 --threads 4 --scheme-directory primer_schemes --read-file SP1.fastq --fast5-directory SP1-fast5-raw --sequencing-summary basecalling/sequencing_summary.txt nCoV-2019/V1 SP1

I was not completely sure about the nCoV-2019/V1 version of the protocol; could you please confirm that the specified version is correct? After the pipeline finished successfully, I loaded the SP1.primertrimmed.rg.sorted.bam file into IGV. I was expecting not to see any adapters or PCR primers but, based on the soft-clipping, it looks like they have not been removed. Is this a bug, or is it due to the specified primer scheme not matching the actual one?

image (SP1 primertrimmed rg sorted)

Thanks, Simone

nickloman commented 4 years ago

Indeed, that doesn't look right to me.

You look to have used the right primer scheme.

The expected result is that the edges of the primers are removed through the use of soft clips.

Is IGV showing the soft clips? (E.g. Preferences: Show soft-clipped bases)

Can you try loading into Tablet and seeing what that looks like?
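For anyone reading along, here is a minimal sketch of what "removed through the use of soft clips" means at the level of a SAM/BAM record. It only parses a CIGAR string and reports how many bases are soft-clipped (the `S` operation) at each end. The function name `soft_clip_lengths` and the example CIGAR strings are made up for illustration and are not part of the artic pipeline.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_lengths(cigar):
    """Return (leading, trailing) soft-clipped base counts for a CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) can only appear at the very ends, outside any soft clips,
    # so dropping them leaves the soft clips (if any) as first/last operation.
    core = [(n, op) for n, op in ops if op != "H"]
    leading = core[0][0] if core and core[0][1] == "S" else 0
    trailing = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return leading, trailing


# Hypothetical CIGAR strings, not taken from the SP1 data.
print(soft_clip_lengths("25S350M2D30M22S"))  # (25, 22)
print(soft_clip_lengths("400M"))             # (0, 0)
```

A read whose ends are soft-clipped still carries those bases in the BAM file; they are just marked as not being part of the alignment, which is why a viewer can optionally display them.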

nickloman commented 4 years ago

I loaded an example dataset into IGV. It should look like this:

image

If you turn on "Show soft-clipped bases" it looks like this:

image

This is expected.

MaestSi commented 4 years ago

Yes, IGV is showing the soft-clipped bases: all the coloured portions of the reads are soft-clipped bases. After seeing your example, I think what I got is very similar to yours, don't you think? I did not know that the edges of the primers were removed through soft-clipping; I was simply expecting them to be trimmed. The Tablet visualization looks like this:

image (Tablet)

Thanks
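To make the difference between soft-clipping and trimming concrete, a small sketch under the same assumptions: soft-clipped bases stay in the read sequence stored in the record, so the "trimmed" view can be recovered by slicing them off according to the CIGAR string. The helper `drop_soft_clips` and the 16-base read are hypothetical and only illustrate the idea.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def drop_soft_clips(seq, cigar):
    """Return the part of seq that is left after removing soft-clipped ends."""
    # Hard-clipped bases (H) are absent from seq already, so ignore them.
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    start = ops[0][0] if ops and ops[0][1] == "S" else 0
    tail = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return seq[start:len(seq) - tail]


read = "ACGTACGTTTGGCCAA"  # made-up 16-base read
print(len(read))                        # 16: the stored sequence keeps every base
print(drop_soft_clips(read, "4S8M4S"))  # ACGTTTGG
```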

nickloman commented 4 years ago

OK, sounds like things are working as expected then.

MaestSi commented 4 years ago

Yes, thank you for your prompt reply. So, when multiple samples are sequenced in a single run and demultiplexing is performed with guppy_barcoder, adapters should not (or must not?) be removed with the --trim_barcodes parameter, is that correct? Based on the SOP, that looks to be the case.

One more question: based on the SOP, the official artic bioinformatics protocol is the one here, but if I go to the Nanopore website and download the pcr-tiling-ncov-PTC_9096_v109_revE_06Feb2020-minion file, the suggested GitHub repository is this one. Which one is the most up to date? Thanks in advance, Simone

nickloman commented 4 years ago

The pipelines are slightly different, but I cannot comment directly on the ONT version. We don't trim adapters but it would be fine if you do.

nickloman commented 4 years ago

Oh sorry, I misread your question. The artic-ncov2019 is the repository you want: it pulls in the fieldbioinformatics pipeline directly.

MaestSi commented 4 years ago

Thank you again. I think you thought I was asking about the nanopolish vs. medaka pipelines. Best wishes, Simone