Closed IsaacDiaz026 closed 7 months ago
Good question. I think it's the second (percentage of the alignment), but I'm not very familiar with this tool. The "Ave block degree: 8.92" value is the average number of rows per block, so coverage is pretty high on average (assuming you have 11 genomes).
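As a quick sanity check on that reading, the numbers in the posted stats are consistent with "Ave block degree" being the number of `s` lines divided by the number of blocks, and with the sequence/gap percentages being shares of the total block area. This is only arithmetic on the reported values, not a description of how mafStats computes them internally; the variable names are my own.

```python
# Values copied from the mafStats output posted in this thread.
s_lines = 10_511_122
blocks = 1_177_821
seq_chars = 2_350_499_174
gap_chars = 426_254_196
total_block_area = 2_776_753_370

# Average rows per block: every "s" line is one row of one block.
ave_block_degree = s_lines / blocks

# Sequence and gap characters together fill the total block area.
seq_fraction = seq_chars / total_block_area
gap_fraction = gap_chars / total_block_area

print(f"Ave block degree: {ave_block_degree:.2f}")   # 8.92
print(f"Sequence chars:   {100 * seq_fraction:.2f}%")  # 84.65%
print(f"Gap chars:        {100 * gap_fraction:.2f}%")  # 15.35%
```

With 11 genomes, an average of about 8.9 rows per block means that a typical block includes most of the genomes.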
Hello,
I just finished running progressive cactus, and used cactus-hal2maf with --filterGapCausingDupes --dupeMode consensus.
When I check my MAF file with mafStats, it reports 845 unique sequences, ordered by # bases present:
PWN.Scaffold_3A: 50004273 ( 2.38%)
PWN.Scaffold_5A: 38824849 ( 1.85%)
PWN.Scaffold_2A: 33553799 ( 1.60%)
I notice that the scaffolds with the highest % bases present belong to my reference genome, but I don't fully understand the percentage. Does it mean that 2.38% of PWN.Scaffold_3A is represented in the alignment? Or does it mean that the total length of PWN.Scaffold_3A represents 2.38% of the total sequence? I am trying to get a sense of how successful the alignment was. Here is the rest of my stats file.
File size: 3.13 GB
Lines: 12867087
Header lines: 1
s lines: 10511122
e lines: 0
i lines: 0
q lines: 0
Blank lines: 2355956
Comment lines: 8

Sequence chars: 2350499174 ( 84.65%)
Gap chars: 426254196 ( 15.35%)
Columns: 309915871

Blocks: 1177821
Ave block area: 2357.53
Max block area: 116655
Total block area: 2776753370
Ave block degree: 8.92
Max block degree: 11
Ave seq field length: 199.76
Max seq field length: 10774
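One way to measure the first interpretation directly (how much of each scaffold is represented in the alignment) without relying on what the mafStats percentage means is to sum the `size` field of each `s` line and divide by that line's `srcSize` field. Below is a minimal sketch that assumes standard MAF `s` lines (`s src start size strand srcSize text`). The function name `maf_coverage` and the toy blocks are made up for illustration, and bases that appear in more than one block (for example, remaining duplications) would be counted more than once.

```python
from collections import defaultdict

def maf_coverage(lines):
    """Return {src: (aligned_bases, src_size, fraction)} from MAF 's' lines."""
    aligned = defaultdict(int)
    src_size = {}
    for line in lines:
        fields = line.split()
        # A MAF sequence line has exactly 7 whitespace-separated fields.
        if len(fields) != 7 or fields[0] != "s":
            continue
        _, src, _start, size, _strand, total, _text = fields
        aligned[src] += int(size)    # ungapped bases in this row
        src_size[src] = int(total)   # full length of the source sequence
    return {src: (n, src_size[src], n / src_size[src])
            for src, n in aligned.items()}

# Toy two-block alignment (sizes are hypothetical, not from the real data).
toy_maf = [
    "##maf version=1",
    "a",
    "s PWN.Scaffold_3A 0 4 + 10 ACGT",
    "s other.chr1      5 3 + 20 A-GT",
    "",
    "a",
    "s PWN.Scaffold_3A 4 2 + 10 AC",
]

coverage = maf_coverage(toy_maf)
for src, (n, total, frac) in coverage.items():
    print(f"{src}: {n}/{total} bases ({100 * frac:.1f}%)")
```

On a real file you could pass an open file handle instead of the list, since the function only iterates over lines.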