lh3 / miniasm

Ultrafast de novo assembly for long noisy reads (though having no consensus step)
MIT License

Next Step after generating assembly.gfa file. #94

Open UmerFarooq17 opened 5 months ago

UmerFarooq17 commented 5 months ago

Hello.

Can anyone guide me on what to do next after generating the .gfa file? I have a PacBio dataset and used the following commands to generate the .gfa file.

minimap2 -x ava-pb -t 32 longread.fastq.gz longread.fastq.gz | gzip -1 > reads.paf.gz
miniasm -f longread.fastq.gz miniasm/reads.paf.gz > miniasm/assembly.gfa

  1. Can you share information about the structure of the .gfa file? What does each column represent?
    • There is a first row that starts with "S", then a label, followed by a long sequence.
    • Then there are some corresponding lines that start with "a", followed by the same label.

Example (the sequence is cropped for display):

S utg000002l GCCATATCCTTGAGGAGATCGTTCAGCGCGCAGAACCGAAAACTGTAT LN:i:87496
a utg000002l 0 SRR9694937.41145:1-8573 - 673

  2. I know the GFA can be visualized in Bandage, but how do I get a FASTA assembly file for further downstream analysis such as polishing?
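
For reference, here is a minimal Python sketch of one way to pull the unitig sequences out of the GFA as FASTA. It relies only on the GFA1 layout of segment lines: tab-separated fields with the record type "S", the segment name, the sequence, and optional tags such as LN:i:<length>. The lowercase "a" lines are not part of standard GFA1; in miniasm output they appear to list the reads that make up each unitig (unitig name, offset, read interval, strand, length), and the sketch simply skips them. The function names and the 60-column line width are illustrative choices, not part of miniasm.

```python
def gfa_to_fasta(gfa_lines, width=60):
    """Yield FASTA lines for every S (segment) record in an iterable of GFA lines."""
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        # GFA1 segment line: S <name> <sequence> [optional tags, e.g. LN:i:<length>]
        if len(fields) < 3 or fields[0] != "S":
            continue  # skip links, miniasm's lowercase "a" lines, headers, etc.
        name, seq = fields[1], fields[2]
        if seq == "*":
            continue  # "*" means the sequence is not stored in the file
        yield ">" + name
        # Wrap the sequence at a fixed width (assumed here: 60 columns)
        for i in range(0, len(seq), width):
            yield seq[i:i + width]


def write_fasta(gfa_path, fasta_path, width=60):
    """Convert a GFA file on disk to a FASTA file (hypothetical helper)."""
    with open(gfa_path) as gfa, open(fasta_path, "w") as fasta:
        for out_line in gfa_to_fasta(gfa, width):
            fasta.write(out_line + "\n")
```

Calling write_fasta("miniasm/assembly.gfa", "miniasm/assembly.fasta") would then produce a file that a polishing tool can take as input. Since miniasm has no consensus step, the unitig sequences are expected to carry roughly the error rate of the raw reads until they are polished.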