glennhickey / progressiveCactus

Distribution package for the Progressive Cactus multiple genome aligner. Dependencies are linked as submodules.

Recommended ways to filter Cactus results #131

Open zhangzhiyangcs opened 2 years ago

zhangzhiyangcs commented 2 years ago

Hi, is there a recommended way to filter Cactus results? I used hal2maf to export the MAF file below, but I think it is too cluttered to analyze. The species in my study have not undergone whole-genome duplication. I just want accurate single-copy orthologous regions for analyzing the evolution of these species. Thanks a lot.

result

a
s species.7 41722 25 + 108824248 AGGTTGTGAATTCAGTGTAAAAAAA
s species.scaffold150296_ 35684 24 + 227144 AGGTTGTGAATTCAGTG-AAAAAAA
s species.Chr_2 3333509 25 - 121189074 AGGTTGTGAATTCAGTGTAAAAAAA
s species.JAABOM010007334.1 150858493 25 + 258112721 AGGTTGTGaattgagttaaaaaaaa
s species.ptg000025l 1689578 25 - 104603436 AGGTTGTGAATTCAGTGTAAAAAAA
s species.ptg000062l 624533 25 + 1011275 AGGTTGTGAATTCAGTGTAAAAAAA
s species.Chr_4 2552098 25 + 111331358 AGGTTGTGAATTCGGTGAAAAAAAA
s species.Chr_4 2628773 20 - 111136906 ----TGTAAATTTAGTGAAAAAAA-

a
s species.7 41857 7 + 108824248 TTTATGT
s species.scaffold150296_ 35818 7 + 227144 TTTTTTT
s species.Chr_2 3333644 7 - 121189074 TTTATGT
s species.JAABOM010007334.1 150858632 7 + 258112721 ttttttt
s species.ptg000025l 1689713 7 - 104603436 TTTATGT
s species.ptg000062l 624668 7 + 1011275 TTTATGT
s Solanum_lycopersicum.7 67894261 4 - 68659810 -T--TTT
s species.Chr_4 2552233 7 + 111331358 TTTTTTT
s Solanum_tuberosumDM.chr07 56375821 4 - 57639317 -T--TTT
s Solanum_tuberosumE463.E463_chr07 54748467 4 - 55985717 -T--TTT
s Solanum_tuberosum_stenotomumA626.A626_chr07 57480235 4 - 58521000 -T--TTT
s species.Chr_4 2628905 7 - 111136906 TTTTTTT
s etu.7 67894261 4 - 68659810 -T--TTT