Variant calling pipeline

exhgphan commented 2 months ago

Ask away!

Hi Julien,

I've attached a copy of the report. Again, we are using the unzipped files from the fasta_pass folder, have included all of our reference sequences in one file, and have tried lowering min_read_qual.

Workflow Amplicon Sequencing report.pdf

julibeg commented 2 months ago

Hi there!

from the fasta_pass folder

Sorry, do you mean the fastq_pass folder?

Also, according to the report it looks like there were only two reads in your data in total. Are you certain that you used the fastq_pass directory of a successful sequencing run? Could you share the MinKNOW reports of this run?
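A quick way to double-check the read count independently of the workflow report is to count the records in the FASTQ files directly. The following is only a minimal sketch: the function name is hypothetical, and it assumes standard four-line FASTQ records (no wrapped sequence lines) in files ending in `.fastq` or `.fastq.gz`.

```python
import gzip
from pathlib import Path


def count_fastq_reads(fastq_dir):
    """Return {file path: read count} for all FASTQ files below fastq_dir."""
    counts = {}
    for path in sorted(Path(fastq_dir).rglob("*.fastq*")):
        if path.name.endswith(".gz"):
            opener = gzip.open
        elif path.name.endswith(".fastq"):
            opener = open
        else:
            continue  # skip anything that is not a FASTQ file
        with opener(path, "rt") as handle:
            n_lines = sum(1 for _ in handle)
        # Assumption: every record occupies exactly four lines.
        counts[str(path)] = n_lines // 4
    return counts
```

Calling `count_fastq_reads("fastq_pass")` and summing the values gives the total number of reads the workflow could have seen.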

Thanks!

exhgphan commented 2 months ago

I believe we did, but I'm currently at home and only have a limited number of files. I can send them over in an hour if that is OK; please do not feel obligated to wait for my response. We believe we had only two reads because the nanopore flow cell was expired, but we are not sure whether that has anything to do with it, since some reads did come through, which I have attached here.

julibeg commented 2 months ago

but I can send it over in an hour if that is ok

No worries! Just send them over whenever is convenient for you.

It's hard to tell from the screenshot, but the empty white space at the bottom suggests that there is indeed only a single read in that file.

exhgphan commented 2 months ago

Hi, would you know what the MinKNOW report would be called? And yes, we took the data from the fastq_pass folder.

julibeg commented 2 months ago

In the same directory where you found the fastq_pass directory, there should be several other files (e.g. sequencing_summary_*.txt or report_*.{html,json,md}). If report_*.html is present, please upload this one. Otherwise, the sequencing summary will also do.
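If it helps, the search can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, the file patterns are the ones mentioned above, and it assumes the reports sit in the parent directory of fastq_pass.

```python
from pathlib import Path

# Patterns taken from the comment above, in order of preference.
PATTERNS = [
    "report_*.html",
    "report_*.json",
    "report_*.md",
    "sequencing_summary_*.txt",
]


def find_run_reports(fastq_pass_dir):
    """List report files located next to the given fastq_pass directory."""
    run_dir = Path(fastq_pass_dir).resolve().parent
    found = []
    for pattern in PATTERNS:
        found.extend(sorted(run_dir.glob(pattern)))
    return found
```

The first entry of the returned list is the preferred file to upload; an empty list means none of the expected reports are in that directory.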

Thanks!

exhgphan commented 2 months ago

sequencing_summary_FAX68352_e79adb5a_9ccdd512.txt
MinKNOW Run Report-15-04-2024-FAX68352.pdf

I believe these are the files you are looking for.

julibeg commented 2 months ago

Hi @exhgphan, it looks like none of the reads were assigned a barcode (they are all unclassified; see screenshot below). If there is a report_*.json file in this directory, could you provide this as well? Thanks!
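For reference, the same tally can be reproduced from the sequencing summary with a short script. This is a rough sketch under two assumptions that should be checked against the actual file: the summary is tab-separated, and the barcode assignment is stored in a column named `barcode_arrangement`. The function name is hypothetical.

```python
import csv
from collections import Counter


def count_barcodes(summary_path, column="barcode_arrangement"):
    """Count reads per barcode assignment in a sequencing summary file."""
    with open(summary_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if column not in (reader.fieldnames or []):
            # The column name is an assumption; fail loudly if it is absent.
            raise KeyError(f"column {column!r} not found in {summary_path}")
        return Counter(row[column] for row in reader)
```

If nearly all reads fall under `unclassified` in the result, demultiplexing did not assign them to any barcode.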

exhgphan commented 2 months ago

Ok, so we need to also include a barcode sheet defining them when we run this workflow?

exhgphan commented 2 months ago

report_FAX68352_20240415_1526_e79adb5a.json

julibeg commented 2 months ago

Hi @exhgphan, many apologies for the delay!

Ok, so we need to also include a barcode sheet defining them when we run this workflow?

If you used barcodes (e.g. for different samples) and want to analyse them separately, then you need to demultiplex the reads.

However, if you sequenced only one sample, you can run the workflow again while enabling the analysis of unclassified reads. On the command line, you would do this by adding the --analyse_unclassified flag. In the EPI2ME Desktop Application, you would tick the "Analyse unclassified reads" box in the "Input Options" panel (see screenshot).
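As an illustration, the command line could be assembled like this. Only `--analyse_unclassified` comes from this thread; the `--fastq` and `--reference` options and both file paths are assumptions, so please check them against the workflow's `--help` output. The function name is hypothetical.

```python
import shlex


def build_wf_amplicon_command(fastq_dir, reference, analyse_unclassified=True):
    """Assemble (but do not run) a wf-amplicon command line."""
    cmd = [
        "nextflow", "run", "epi2me-labs/wf-amplicon",
        "--fastq", fastq_dir,      # assumed option name
        "--reference", reference,  # assumed option name
    ]
    if analyse_unclassified:
        # Flag named in the comment above.
        cmd.append("--analyse_unclassified")
    return shlex.join(cmd)


# Hypothetical paths, for illustration only.
print(build_wf_amplicon_command("fastq_pass", "references.fasta"))
```

The script only prints the command so it can be reviewed before being pasted into a terminal.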

Please let me know if that helps.

exhgphan commented 2 months ago

Hi, that makes sense. Was not having a CSV file the cause of this issue?

julibeg commented 2 months ago

Demultiplexing should also work without a CSV. Which library preparation kit did you use?

exhgphan commented 2 months ago

SQK-LSK114 is the ligation kit we used, and the barcode expansion kit was EXP-PBC096.

julibeg commented 2 months ago

Hi @exhgphan, many thanks again for providing all the files. I had another closer look and generally it appears that something went wrong with this sequencing run.

First of all, the output (~3000 reads) is really low. Depending on the number of barcodes that were actually used, this might be too little data even with perfect demultiplexing. The low output seems to stem from low pore occupancy as well as the relatively short sequencing time (three hours). Also, demultiplexing failed for almost all reads due to low barcode scores. Are you certain that EXP-PBC096 was used in the lab and not another barcoding kit (e.g. SQK-NBD114.96)?

Generally, my best guess would be that something went wrong during the library prep. Therefore, I'd recommend reaching out to the ONT customer support (either via the webpage or by sending an email to support@nanoporetech.com) to see if they can help resolve the issue.

If you also want to try more options on the analysis side, I guess there are two things you could do:

- try demultiplexing with Dorado and see if you get fewer unclassified reads this way
- run the workflow with --analyse_unclassified (but this only makes sense if there was just a single sample)
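For the Dorado option, a command along the following lines might be a starting point. Treat it as a sketch under assumptions rather than a tested recipe: the `--kit-name`, `--output-dir` and `--emit-fastq` options and the acceptance of EXP-PBC096 as a kit name should be verified with `dorado demux --help` for the installed version, and the function name and both paths are hypothetical.

```python
import shlex


def build_dorado_demux_command(reads, out_dir, kit="EXP-PBC096"):
    """Assemble (but do not run) a Dorado demultiplexing command line."""
    cmd = [
        "dorado", "demux",
        "--kit-name", kit,        # barcoding kit reported in this thread
        "--output-dir", out_dir,  # hypothetical output location
        "--emit-fastq",           # assumed option: write FASTQ instead of BAM
        reads,
    ]
    return shlex.join(cmd)


# Hypothetical paths, for illustration only.
print(build_dorado_demux_command("fastq_pass", "demuxed"))
```

Comparing the number of unclassified reads in the output with the MinKNOW result would show whether Dorado recovers more barcodes.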

exhgphan commented 2 months ago

Hi, I think the flow cell was not of great quality overall, and the run time was so short because it was near the end of the day. Yes, we are certain that the barcoding kit was EXP-PBC096. We have new data that should be of better quality, so I will try again.

julibeg commented 1 month ago

Closing this for now; feel free to re-open if you experience the same problem again!

epi2me-labs / wf-amplicon

Variant calling pipeline #16