wdecoster / NanoPlot

Plotting scripts for long read sequencing data
http://nanoplot.bioinf.be
MIT License

Could Nanoplot show read quality of ONT reads in fasta format? #267

Closed Yutang-ETH closed 3 years ago

Yutang-ETH commented 3 years ago

Hi,

I am wondering: if the input file is in fasta rather than fastq format, how can NanoPlot know the quality of the reads, given that the fasta format carries no quality scores?
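For context, the quality scores in a fastq record live in its fourth line, usually as Phred+33 characters, and a per-read quality can be derived from that line alone. The sketch below is only an illustration of that idea: the function name `mean_read_quality` is hypothetical, and averaging over error probabilities is an assumption about how one might summarize a read, not a statement of NanoPlot's exact method.

```python
import math


def mean_read_quality(quality_line: str, offset: int = 33) -> float:
    """Summarize one fastq quality line as a single Phred score.

    Each character encodes a Phred score Q = ord(char) - offset.
    Scores are converted to error probabilities, averaged, and the
    average is converted back to the Phred scale.
    """
    if not quality_line:
        raise ValueError("empty quality line")
    error_probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_line]
    mean_error = sum(error_probs) / len(error_probs)
    return -10 * math.log10(mean_error)


if __name__ == "__main__":
    # 'I' encodes Q40 in Phred+33, so a read of only 'I' has mean quality 40.
    print(round(mean_read_quality("IIII"), 2))
```

A fasta record has only a header and a sequence line, so there is nothing to feed into a calculation like this.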

I am using fmlrc2 to correct ONT long reads with short Illumina reads, and I would like to compare the read quality before and after correction. I have run NanoPlot on the raw ONT long reads, which are stored in fastq format. However, fmlrc2 outputs fasta, so I am wondering whether NanoPlot can show read quality for a fasta file.

Could you please give me any insights? Thank you very much.

Best wishes, Yutang

wdecoster commented 3 years ago

Hi,

Fasta format indeed does not carry quality information, in contrast to fastq, where the fourth line of each record holds the quality scores. What you can do is align the fasta reads to a reference genome (if available) and give the resulting bam file to NanoPlot as input. NanoPlot will then use the percentage identity to the reference genome as the quality measure.
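To make the idea of identity-as-quality concrete, here is a minimal sketch of one common definition of percent identity, computed from the CIGAR string and the NM (edit distance) tag of an aligned read. It is an illustration under stated assumptions, not necessarily the exact calculation NanoPlot performs, and the function name `percent_identity` is hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def percent_identity(cigar: str, edit_distance: int) -> float:
    """Percent identity of one alignment.

    Alignment columns are counted from M, =, X, I and D operations;
    clipping (S, H), skips (N) and padding (P) are ignored.
    edit_distance is the value of the NM tag (mismatches plus
    inserted and deleted bases).
    """
    columns = sum(
        int(length) for length, op in CIGAR_OP.findall(cigar) if op in "MID=X"
    )
    if columns == 0:
        raise ValueError("CIGAR string has no aligned columns")
    return 100.0 * (columns - edit_distance) / columns


if __name__ == "__main__":
    # 100 alignment columns with 4 edits gives 96% identity.
    print(percent_identity("50M2D48M", 4))
```

A bam file for this purpose could be produced with an aligner such as minimap2 followed by `samtools sort`, and then passed to NanoPlot with its `--bam` option.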

Hope that helps, Wouter

Yutang-ETH commented 3 years ago

Hi Wouter,

Thank you very much for your input. That is very helpful. Unfortunately, we don't have a reference genome. I will figure out another way to check the quality of the corrected long reads.

Best wishes, Yutang

aspitaleri commented 3 years ago

I did it as @wdecoster suggested, using qualimap and/or AlignQC (https://github.com/jason-weirather/AlignQC).

Yutang-ETH commented 3 years ago

Hi Andrea,

Thank you very much for your suggestion. The thing is, we don't have a high-quality reference genome yet; we are currently trying to build one. Anyway, the tools you recommended are really nice, and I believe they will be helpful for my future work.

Best wishes, Yutang