twolinin / longphase

LongPhase 1.7 Release Notes #61

Closed by twolinin 3 months ago

twolinin commented 3 months ago

Summary

This release merges the different alignments of a read to improve phasing integrity and adjusts parameter weights to improve phasing accuracy. It also allows a phased modification VCF to be used to increase the proportion of tagged reads, and addresses some known issues.

| phase (-t 24) | v1.6 SW | v1.6 #Block | v1.6 Block N50 | v1.7 SW | v1.7 #Block | v1.7 Block N50 |
|---|---|---|---|---|---|---|
| HG002 ONT R10.4.1 10x | 1,137 | 7,212 | 774,928 | 1,117 | 7,100 | 807,877 |
| HG002 ONT R10.4.1 20x | 1,225 | 4,257 | 1,560,226 | 1,218 | 4,132 | 1,654,808 |
| HG002 ONT R10.4.1 30x | 1,194 | 3,499 | 1,903,900 | 1,180 | 3,372 | 2,042,500 |
| HG002 ONT R10.4.1 40x | 1,211 | 3,045 | 2,177,620 | 1,200 | 2,915 | 2,332,195 |
| HG002 ONT R10.4.1 50x | 1,216 | 2,797 | 2,470,461 | 1,213 | 2,679 | 2,606,645 |
| HG002 ONT R10.4.1 60x | 1,197 | 2,627 | 2,587,166 | 1,195 | 2,513 | 2,830,210 |

SW: Switch Error
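
To make the comparison in the table above easier to read, here is a small Python sketch that computes the percent change from v1.6 to v1.7 for each coverage level. The numbers are copied from the table; the names `ROWS`, `percent_change`, and `summarize` are made up for this illustration and are not part of LongPhase. In every row, v1.7 has fewer switch errors, fewer blocks, and a larger Block N50.

```python
# Values copied from the release-notes table (HG002 ONT R10.4.1, phase -t 24).
# Tuple layout: (coverage, v1.6 SW, v1.6 #Block, v1.6 Block N50,
#                          v1.7 SW, v1.7 #Block, v1.7 Block N50)
ROWS = [
    ("10x", 1137, 7212, 774928, 1117, 7100, 807877),
    ("20x", 1225, 4257, 1560226, 1218, 4132, 1654808),
    ("30x", 1194, 3499, 1903900, 1180, 3372, 2042500),
    ("40x", 1211, 3045, 2177620, 1200, 2915, 2332195),
    ("50x", 1216, 2797, 2470461, 1213, 2679, 2606645),
    ("60x", 1197, 2627, 2587166, 1195, 2513, 2830210),
]


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


def summarize(rows):
    """Map each coverage label to the percent change of SW, #Block and Block N50."""
    summary = {}
    for cov, sw16, nb16, n50_16, sw17, nb17, n50_17 in rows:
        summary[cov] = {
            "sw": percent_change(sw16, sw17),
            "blocks": percent_change(nb16, nb17),
            "n50": percent_change(n50_16, n50_17),
        }
    return summary


if __name__ == "__main__":
    for cov, change in summarize(ROWS).items():
        print(f"{cov}: SW {change['sw']:+.2f}%  "
              f"#Block {change['blocks']:+.2f}%  "
              f"Block N50 {change['n50']:+.2f}%")
```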

Changes

  1. 60

  2. 59

  3. 58

  4. 57

    • Fix the order of SNPs, SVs, and methylations.
    • Fix missing SVs and methylations when co-phasing SNPs, SVs, and modifications.
    • Fix repeated positions of SVs and methylations.
  5. 55

    • Merge the different alignments of a read into a single alignment to enhance phasing integrity.
    • Correct the conditions used to check the VCF header.
  6. 52

    • Add guidance for two parameters, baseQuality and edgeWeight, to the --help list. Users can now choose the baseQuality threshold and edgeWeight they want to use.
  7. 49

    • Variants from multiple VCFs are accessed and stored in their respective chrVariant. Even if multiple variants occur at the same coordinate, only one chrVariant from one VCF is recorded.
  8. 48

    • When haplotag is used for read haplotype tagging, it automatically detects whether the phased SNP file includes phased indels.
    • When performing read haplotype tagging, haplotag can take a phased modification VCF as input through the --mod-file option.
    • Correct the condition in ParsingBam.cpp for determining when a deletion variant falls within a read's deletion CIGAR operation.
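
The last item concerns checking whether a deletion variant falls within a deletion in a read's CIGAR string. As a rough illustration of that kind of check, here is a minimal Python sketch. It is not the code from ParsingBam.cpp: the function names are hypothetical, and the "fully contained" criterion is an assumption chosen for the example. Coordinates are 0-based and describe the deleted reference bases themselves, not the VCF anchor base that precedes them.

```python
import re

# CIGAR operations that advance the reference coordinate (per the SAM format).
REF_CONSUMING = set("MDN=X")


def deletion_intervals(ref_start, cigar):
    """Yield 0-based half-open reference intervals (start, end) for each D op."""
    pos = ref_start
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        length = int(length)
        if op == "D":
            yield (pos, pos + length)
        if op in REF_CONSUMING:
            pos += length


def deletion_variant_in_read_deletion(ref_start, cigar, var_start, var_len):
    """Return True if the deleted bases [var_start, var_start + var_len)
    lie entirely inside a single D operation of the read.

    Assumption for this sketch: full containment is required; a real
    phasing tool may use a different (e.g. overlap-based) criterion.
    """
    var_end = var_start + var_len
    return any(start <= var_start and var_end <= end
               for start, end in deletion_intervals(ref_start, cigar))
```

For example, a read aligned at reference position 100 with CIGAR `10M5D10M` has one deletion covering reference positions 110 to 114, so a 5-base deletion variant starting at 110 is reported as inside the read's deletion, while one starting at 108 is not.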