philres / ngmlr

NGMLR is a long-read mapper designed to align PacBio or Oxford Nanopore reads (standard and ultra-long) to a reference genome, with a focus on reads that span structural variations.
MIT License

Questions on loading fastq.gz, and output.sam #111

Open rl4940 opened 4 months ago

rl4940 commented 4 months ago

Hi Fritz team!

Thank you so much for developing this aligner. I am currently using NGMLR alongside minimap2, and I have two main questions. Here is my current code:

#### running ngmlr ####
ngmlr -t 20 -r "$ref" -q test_pipe1_standard/hg002head1.fastq.gz -o hg002.sam -x ont

#### converting, sorting and indexing the BAM file ####
samtools view -b --threads 10 hg002.sam > hg002_out.bam
samtools sort hg002_out.bam -o hg002_out.sorted.bam
samtools index -M --threads 10 hg002_out.sorted.bam

Question 1: How can I load multiple fastq.gz files into NGMLR? For example, HG002 on GIAB has 3 fastq.gz files; how should I load them all?

Question 2: NGMLR outputs SAM. To avoid unnecessary intermediate files, is there a way to have it write BAM instead? If so, is there a way to auto-index the BAM file in NGMLR, just like --write-index in minimap2?

Thanks! R.L.
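
One possible way to approach both questions is to stream everything through pipes instead of writing intermediate files. The sketch below is not a confirmed NGMLR feature set. It assumes that ngmlr reads FASTQ from stdin when -q is omitted and writes SAM to stdout when -o is omitted, and that the installed samtools supports `sort --write-index`. The function names `merge_fastq_gz` and `align_to_sorted_bam` are hypothetical helpers introduced only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only. Assumptions (not confirmed by this thread):
#   - ngmlr reads FASTQ from stdin when -q is omitted
#   - ngmlr writes SAM to stdout when -o is omitted
#   - samtools is recent enough to support `sort --write-index`

# Decompress any number of fastq.gz files into a single FASTQ stream.
merge_fastq_gz() {
    gzip -dc "$@"
}

# Hypothetical helper: align the merged stream, then sort and index in one
# pass, so no SAM file or unsorted BAM is written to disk.
# Usage: align_to_sorted_bam REF OUT_BAM READS1.fastq.gz [READS2.fastq.gz ...]
align_to_sorted_bam() {
    local ref=$1 out_bam=$2
    shift 2
    merge_fastq_gz "$@" \
        | ngmlr -t 20 -r "$ref" -x ont \
        | samtools sort --threads 10 --write-index \
              -o "${out_bam}##idx##${out_bam}.bai" -
}
```

With the three GIAB files this might be called as `align_to_sorted_bam "$ref" hg002_out.sorted.bam reads_1.fastq.gz reads_2.fastq.gz reads_3.fastq.gz`, where the read file names are placeholders. The `##idx##` suffix asks samtools to write a .bai index next to the BAM; without it, `--write-index` produces a CSI index by default.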