Open baozg opened 1 week ago
Hi @baozg,
I checked the wfmash data. The problem is:
query_end != query_start + Match/Mismatch + INS_size
The target position looks correct.
This check is based on the following identities, which a consistent PAF record should satisfy:
query_start + Match/Mismatch + INS_size = query_end
ref_start + Match/Mismatch + DEL_size = ref_end
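As a rough illustration of the two identities above, here is a minimal Python sketch that recomputes the query and target spans from the cg:Z: CIGAR tag of a PAF line and compares them with the reported coordinates. It is not the wgatools implementation; the function names `cigar_spans` and `check_paf_line` are made up for this example, and it assumes a tab-separated PAF line whose CIGAR uses only the M, =, X, I, D and N operations.

```python
import re

# One CIGAR element: a length followed by an operation letter.
_CIGAR_OP = re.compile(r"(\d+)([MIDN=X])")


def cigar_spans(cigar):
    """Return (query_bases, target_bases) consumed by a CIGAR string."""
    q_span = t_span = 0
    for length, op in _CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=X":      # match/mismatch consumes both sequences
            q_span += n
            t_span += n
        elif op == "I":      # insertion consumes the query only
            q_span += n
        else:                # D or N consumes the target only
            t_span += n
    return q_span, t_span


def check_paf_line(line):
    """Return (query_ok, target_ok), or None if the line has no cg:Z: tag."""
    fields = line.rstrip("\n").split("\t")
    q_start, q_end = int(fields[2]), int(fields[3])
    t_start, t_end = int(fields[7]), int(fields[8])
    # Optional tags start at column 13; look for the CIGAR tag among them.
    cigar = next((f[5:] for f in fields[12:] if f.startswith("cg:Z:")), None)
    if cigar is None:
        return None
    q_span, t_span = cigar_spans(cigar)
    return (q_start + q_span == q_end, t_start + t_span == t_end)
```

A record like the wfmash one discussed here would be expected to give (False, True): the target identity holds, the query one does not.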
I also checked the output of an older wfmash version (v0.12.1-5-gd6532bc), and it looks fine.
This also gave me an idea: I will develop a command to validate PAF files, and perhaps it can also repair invalid ones.
As for FastGA, it seems to reverse the order of query and target in the PAF file 😵‍💫: if I swap the target and query FASTA files, everything works fine. Either the target/query order in FastGA's output is reversed, or your input does not match what FastGA expects :)
I hope this is helpful to you. Please keep in touch if you have any questions later!
Best regards, Wenjie
> sed -n '64p' Col-CC_Ler-0_MPIBT.wfmash_21.paf
Chr1 32485061 3030000 3079896 + Chr1 32637894 3030124 3080307 49819 50214 21 gi:f:0.996241 bi:f:0.992134 md:f:0.996967 cg:Z:[.....]
> math 3030000+49819+150+31 // q_start=3030000 match_size=49819 mismatch_size=150 ins_size=31
3080000 // The correct query end is 3080000
> sed -n '64p' Col-CC_Ler-0_MPIBT.wfmash_21.paf|sed 's/3079896/3080000/g'|wgatools p2m -g Col-CC.chr.fa.gz -q Ler-0.fa.gz -o test.maf -r // everything is OK
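The arithmetic in the log above can be reproduced with a few lines of Python. This is only a sketch using the numbers shown in the record and in the shell comment. The deletion size is not given in the thread; it is derived here under the assumption that the target coordinates are correct, and then cross-checked against the alignment block length in PAF column 11.

```python
# Values copied from the PAF record and the shell comment in the log.
q_start, q_end_reported = 3030000, 3079896
t_start, t_end = 3030124, 3080307
matches, mismatches, ins_size = 49819, 150, 31
block_len = 50214  # PAF column 11: alignment block length

# Query identity: query_start + Match/Mismatch + INS_size = query_end
expected_q_end = q_start + matches + mismatches + ins_size
q_end_error = expected_q_end - q_end_reported

# Assumption: the target coordinates are correct, so the target identity
# can be solved for the deletion size.
del_size = (t_end - t_start) - (matches + mismatches)

print(expected_q_end)  # 3080000
print(q_end_error)     # 104
print(del_size)        # 214
# The block length should equal matches + mismatches + insertions + deletions.
print(matches + mismatches + ins_size + del_size == block_len)  # True
```

Under that assumption the block length is consistent with the CIGAR counts, which points at the reported query end (off by 104 bases) as the only inconsistent field in this record.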
Reopening this as a reminder to myself to develop the validate sub-command 🤖
Whoo hoo
The validate sub-command is done in https://github.com/wjwei-handsome/wgatools/commit/53c57fa78818b305a78acade66a7fa07b542a7b3
Hi Wenjie,
I have run into a problem when using wgatools paf2maf. It works with minimap2 and AnchorWave PAF output, but it currently fails with FastGA and wfmash alignments. I checked the alignments, and they look fine to me. Could you help me find out what is wrong with them?
Data: https://keeper.mpdl.mpg.de/d/4b78e4b87c0449d3b821/
Col-CC is the target and Ler-0 is the query, and the wgatools folder contains the FastGA and wfmash alignments.