WGLab / LIQA

Long-read Isoform Quantification and Analysis

fastq file to bam file #24

Closed: zhanglab2008 closed this issue 1 year ago

zhanglab2008 commented 1 year ago

My Nanopore data are FASTQ files, but the input files need to be in BAM format. Do you have a way to convert the FASTQ files to BAM files? Or can I just use Samtools to do that? Thanks!

kaichop commented 1 year ago

FASTQ files contain only raw sequence reads. You need to use mapping software (for example, minimap2) to map the reads to a reference genome to generate the BAM file.

zhanglab2008 commented 1 year ago

Got it, thanks!

zhanglab2008 commented 1 year ago

I still can't get a BAM file in the correct format after running minimap2 and then using samtools to sort and index the output. The problem is that the "chromosome" column is missing in the SAM/BAM file from minimap2. I tried using both the transcript and genome GTF files to prepare the refgene file, but it made no difference. Do you have a tutorial on how to produce the BAM file from ONT full-length cDNA reads? Thanks!

huyustats commented 1 year ago

Please download the genome FASTA file and use it, instead of the GTF file, as the reference when running minimap2. Here is the minimap2 tutorial: https://lh3.github.io/minimap2/minimap2.html. If you have more questions about minimap2, please create an issue at https://github.com/lh3/minimap2. Thanks!
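
For readers following along, here is a minimal sketch of the workflow described above: align the FASTQ reads to a genome FASTA with minimap2, then sort and index with samtools. It is not taken from the LIQA documentation. The file names are placeholders, the helper names `build_alignment_commands` and `run_alignment` are hypothetical, and the `-ax splice` preset is an assumption based on the minimap2 manual's recommendation for spliced long-read alignment; other presets or extra options may suit a given library better.

```python
import subprocess
from typing import List


def build_alignment_commands(
    genome_fasta: str, reads_fastq: str, out_bam: str, threads: int = 4
) -> List[List[str]]:
    """Return the minimap2, samtools sort, and samtools index argument lists."""
    minimap2_cmd = [
        "minimap2",
        "-ax", "splice",      # assumed preset: spliced long-read alignment, SAM output
        "-t", str(threads),
        genome_fasta,         # reference must be a genome FASTA, not a GTF
        reads_fastq,
    ]
    # "-" tells samtools sort to read the SAM stream from standard input.
    sort_cmd = ["samtools", "sort", "-@", str(threads), "-o", out_bam, "-"]
    index_cmd = ["samtools", "index", out_bam]
    return [minimap2_cmd, sort_cmd, index_cmd]


def run_alignment(
    genome_fasta: str, reads_fastq: str, out_bam: str, threads: int = 4
) -> None:
    """Pipe minimap2 into samtools sort, then index the sorted BAM."""
    minimap2_cmd, sort_cmd, index_cmd = build_alignment_commands(
        genome_fasta, reads_fastq, out_bam, threads
    )
    aligner = subprocess.Popen(minimap2_cmd, stdout=subprocess.PIPE)
    subprocess.run(sort_cmd, stdin=aligner.stdout, check=True)
    aligner.stdout.close()
    if aligner.wait() != 0:
        raise RuntimeError("minimap2 exited with a non-zero status")
    subprocess.run(index_cmd, check=True)
```

Calling `run_alignment("genome.fa", "reads.fastq", "reads.sorted.bam")` would require both minimap2 and samtools on the `PATH`; the paths shown are illustrative only.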

zhanglab2008 commented 1 year ago

Thanks for the reply. I was able to get the output after I used the full genome FASTA file as the reference.
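
One way to check that the alignments now carry chromosome names is to tally the third SAM column (RNAME), where unmapped reads show `*`. The sketch below assumes the earlier "missing chromosome column" symptom corresponded to `*` in that field, which the thread does not confirm; the function name `count_reference_names` is hypothetical.

```python
from collections import Counter
from typing import Iterable


def count_reference_names(sam_lines: Iterable[str]) -> Counter:
    """Count alignment records per RNAME (SAM column 3), skipping header lines."""
    counts: Counter = Counter()
    for line in sam_lines:
        # Header lines start with "@"; blank lines carry no record.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # SAM records have 11 mandatory fields
            continue
        counts[fields[2]] += 1
    return counts
```

The function accepts any iterable of SAM text lines, for example the decoded output of `samtools view -h` on the sorted BAM. A tally dominated by `*` would suggest the reads did not map to the reference that was supplied.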