Closed subwaystation closed 2 years ago
It seems ALIBI only changes the node order in the GFA, without updating the graph itself:
Before:
H VN:Z:1.0
S 1 AT
S 2 TTAACTCCATC
S 3 TTTGAGAAACATTTAATAATGTAATGTGTTTGT
S 4 CATACAGGGTGAATACAGATGCACGGGAGGCCATAC
S 5 GGTTTAGGCAAAGGGGAGCACAAAAGTTGAAGATGAGGC
S 6 GCTGCC
S 7 AT
S 8 CAATGCTGGGACTTCAGGCCAA
S 9 GGG
S 10 CAGGAGCTGAGGAAGCCACAAGGGAGGACATTTTCTGCAGTTGC
...
P gi|568815592:32578768-32589835 1+,2+,3+,4+,5+,6+,7+,8+,9+,10+,11+,12+,13+,14+,15+,16+,17+,18+,19+,20+,21+,22+,23+,24+,25+,26+,27+,28+,29+,30+,31+,32+,33+,34+,35+,36+,37+,38+,39+,40+,41+,42+,43+,44+,45+,46+,47+,48+,49+,50+,51+,52+,53+,54+,55+,56+,57+,58+,59+,60+,61+,62+,63+,64+,65+,66+,67+,68+,69+,70+,71+,72+,73+,74+,75+,76+,77+,78+,79+,80+,81+,82+,83+,84+,85+,86+,87+,88+,89+,90+,91+,92+,93+,94+,95+,96+,97+,98+,99+,100+,101+,102+,103+,104+,105+,106+,107+,108+,109+,110+,111+,112+,113+,114+,115+,116+,117+,118+,119+,120+,121+,122+,123+,124+,125+,126+,127+,128+,129+,130+,131+,132+,133+,134+,135+,136+,137+,138+,139+,140+,141+,142+,143+,144+,145+,146+,147+,148+,149+,150+,151+,152+,153+,154+,155+,156+,157+,158+,159+,160+,161+,162+,163+,164+,165+,166+,167+,168+,169+,170+,171+,172+,173+,174+,175+,176+,177+,178+,179+,180+,181+,182+,183+,184+,185+,186+,187+,188+,189+,190+,191+,192+,193+,194+,195+,196+,197+,198+,199+,200+,201+,202+,203+,204+,205+,206+,207+,208+,209+,210+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,213+,214+,215+,216+,217+,218+,219+,220+,221+,222+,223+,224+,225+,226+,227+,228+,229+,230+,231+,232+,233+,234+,235+,236+,237+,238+,239+,240+,241+,242+,243+,244+,245+,246+,247+,248+,249+,250+,251+,252+,253+,254+,255+,256+,257+,258+,259+,260+,261+,262+,263+,264+,265+,266+,267+,268+,269+,270+,271+,272+,273+,274+,275+,276+,277+,278+,279+,280+,281+,282+,283+,284+,285+,286+,287+,288+,289+,290+,291+,292+,293+,294+,295+,296+,297+,298+,299+,300+,301+,302+,303+,304+,305+,306+,307+,308+,309+,310+,311+,312+,313+,314+,315+,316+,317+,318+,319+,320+,321+,322+,323+,324+,325+,326+,327+,328+,329+,330+,331+,332+,333+,334+,335+,336+,337+,338+,339+,340+,341+,342+,343+,344+,345+,346+,347+,348+,349+,350+,351+,352+,353+,354+,355+,356+,357+,358+,359+,360+,361+,362+,363+,364+,365+,366+,367+,368+,369+,370+,371+,372+,373+,374+,375+,376+,
377+,378+,379+,380+,381+,382+,383+,384+,385+,386+,387+,388+,389+,390+,391+,392+,393+,394+,395+,396+,397+,398+,399+,400+,401+,402+,403+,404+,405+,406+,407+,408+,409+,410+,411+,412+,413+,414+,415+,416+,417+,418+,419+,420+,421+,422+,423+,424+,425+,426+,427+,428+,429+,430+,431+,432+,433+,434+,435+,436+,437+,438+,439+,440+,441+,442+,443+,444+,445+,446+,447+,448+,449+,450+,451+,452+,453+,454+,455+,456+,457+,458+,459+,460+,461+,462+,463+,464+,465+,466+,467+,468+,469+,470+,471+,472+,473+,474+,475+,476+,477+,478+,479+,480+,481+,482+,483+,484+,485+,486+,487+,488+,489+,490+ *
P gi|568815529:3998044-4011446 1+,2+,3+,491+,5+,492+,10+,493+,12+,494+,14+,495+,496+,16+,497+,498+,499+,18+,19+,20+,21+,500+,23+,501+,26+,502+,28+,29+,503+,31+,32+,504+,35+,36+,37+,38+,505+,506+,41+,507+,44+,45+,508+,47+,509+,49+,50+,51+,52+,53+,54+,510+,56+,57+,58+,59+,60+,61+,511+,512+,64+,65+,66+,67+,513+,69+,70+,71+,72+,514+,515+,516+,517+,74+,75+,518+,78+,79+,519+,520+,521+,82+,83+,84+,522+,523+,524+,525+,86+,87+,88+,89+,90+,91+,526+,93+,527+,95+,528+,529+,530+,531+,532+,97+,98+,99+,533+,101+,102+,103+,534+,107+,535+,536+,537+,538+,539+,112+,113+,114+,540+,541+,542+,543+,116+,117+,118+,119+,544+,121+,545+,125+,546+,127+,547+,133+,134+,135+,136+,548+,549+,138+,139+,140+,141+,550+,551+,552+,143+,553+,554+,145+,146+,147+,148+,555+,150+,556+,152+,557+,155+,156+,157+,558+,159+,160+,559+,162+,560+,164+,561+,562+,563+,166+,167+,564+,565+,566+,171+,172+,567+,568+,174+,569+,176+,570+,178+,179+,180+,571+,572+,184+,185+,186+,187+,573+,574+,189+,190+,191+,192+,575+,576+,194+,195+,577+,197+,198+,199+,578+,579+,202+,203+,580+,581+,205+,582+,583+,584+,585+,586+,587+,588+,207+,589+,590+,210+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,213+,214+,215+,591+,592+,217+,218+,219+,220+,593+,223+,594+,595+,225+,226+,227+,228+,596+,597+,231+,232+,233+,598+,236+,237+,238+,239+,599+,600+,601+,602+,243+,244+,245+,603+,604+,605+,247+,606+,249+,250+,251+,607+,608+,609+,610+,611+,612+,613+,614+,615+,616+,617+,618+,619+,620+,621+,622+,623+,624+,625+,626+,627+,628+,629+,630+,631+,632+,633+,634+,635+,636+,637+,638+,639+,640+,641+,642+,643+,644+,645+,646+,647+,648+,649+,650+,651+,652+,653+,654+,655+,656+,657+,658+,659+,660+,661+,662+,663+,664+,253+,665+,255+,256+,666+,258+,259+,667+,668+,261+,669+,670+,263+,2S 1039 C
...
After:
H VN:Z:1.0
S 781 ATTTTTAACTCCATG
S 1 AT
S 2 TTAACTCCATC
S 3 TTTGAGAAACATTTAATAATGTAATGTGTTTGT
S 491 GGTACAGGGTGAGTACAGATGCACAGGAGGCCATAG
S 4 CATACAGGGTGAATACAGATGCACGGGAGGCCATAC
S 5 GGTTTAGGCAAAGGGGAGCACAAAAGTTGAAGATGAGGC
S 492 ACTGCCATCAAAGCTGTGGGGCTTCAGGCCAAGAA
S 782 GGCACAG
S 783 G
...
P gi|568815592:32578768-32589835 1+,2+,3+,4+,5+,6+,7+,8+,9+,10+,11+,12+,13+,14+,15+,16+,17+,18+,19+,20+,21+,22+,23+,24+,25+,26+,27+,28+,29+,30+,31+,32+,33+,34+,35+,36+,37+,38+,39+,40+,41+,42+,43+,44+,45+,46+,47+,48+,49+,50+,51+,52+,53+,54+,55+,56+,57+,58+,59+,60+,61+,62+,63+,64+,65+,66+,67+,68+,69+,70+,71+,72+,73+,74+,75+,76+,77+,78+,79+,80+,81+,82+,83+,84+,85+,86+,87+,88+,89+,90+,91+,92+,93+,94+,95+,96+,97+,98+,99+,100+,101+,102+,103+,104+,105+,106+,107+,108+,109+,110+,111+,112+,113+,114+,115+,116+,117+,118+,119+,120+,121+,122+,123+,124+,125+,126+,127+,128+,129+,130+,131+,132+,133+,134+,135+,136+,137+,138+,139+,140+,141+,142+,143+,144+,145+,146+,147+,148+,149+,150+,151+,152+,153+,154+,155+,156+,157+,158+,159+,160+,161+,162+,163+,164+,165+,166+,167+,168+,169+,170+,171+,172+,173+,174+,175+,176+,177+,178+,179+,180+,181+,182+,183+,184+,185+,186+,187+,188+,189+,190+,191+,192+,193+,194+,195+,196+,197+,198+,199+,200+,201+,202+,203+,204+,205+,206+,207+,208+,209+,210+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,213+,214+,215+,216+,217+,218+,219+,220+,221+,222+,223+,224+,225+,226+,227+,228+,229+,230+,231+,232+,233+,234+,235+,236+,237+,238+,239+,240+,241+,242+,243+,244+,245+,246+,247+,248+,249+,250+,251+,252+,253+,254+,255+,256+,257+,258+,259+,260+,261+,262+,263+,264+,265+,266+,267+,268+,269+,270+,271+,272+,273+,274+,275+,276+,277+,278+,279+,280+,281+,282+,283+,284+,285+,286+,287+,288+,289+,290+,291+,292+,293+,294+,295+,296+,297+,298+,299+,300+,301+,302+,303+,304+,305+,306+,307+,308+,309+,310+,311+,312+,313+,314+,315+,316+,317+,318+,319+,320+,321+,322+,323+,324+,325+,326+,327+,328+,329+,330+,331+,332+,333+,334+,335+,336+,337+,338+,339+,340+,341+,342+,343+,344+,345+,346+,347+,348+,349+,350+,351+,352+,353+,354+,355+,356+,357+,358+,359+,360+,361+,362+,363+,364+,365+,366+,367+,368+,369+,370+,371+,372+,373+,374+,375+,376+,
377+,378+,379+,380+,381+,382+,383+,384+,385+,386+,387+,388+,389+,390+,391+,392+,393+,394+,395+,396+,397+,398+,399+,400+,401+,402+,403+,404+,405+,406+,407+,408+,409+,410+,411+,412+,413+,414+,415+,416+,417+,418+,419+,420+,421+,422+,423+,424+,425+,426+,427+,428+,429+,430+,431+,432+,433+,434+,435+,436+,437+,438+,439+,440+,441+,442+,443+,444+,445+,446+,447+,448+,449+,450+,451+,452+,453+,454+,455+,456+,457+,458+,459+,460+,461+,462+,463+,464+,465+,466+,467+,468+,469+,470+,471+,472+,473+,474+,475+,476+,477+,478+,479+,480+,481+,482+,483+,484+,485+,486+,487+,488+,489+,490+ *
P gi|568815529:3998044-4011446 1+,2+,3+,491+,5+,492+,10+,493+,12+,494+,14+,495+,496+,16+,497+,498+,499+,18+,19+,20+,21+,500+,23+,501+,26+,502+,28+,29+,503+,31+,32+,504+,35+,36+,37+,38+,505+,506+,41+,507+,44+,45+,508+,47+,509+,49+,50+,51+,52+,53+,54+,510+,56+,57+,58+,59+,60+,61+,511+,512+,64+,65+,66+,67+,513+,69+,70+,71+,72+,514+,515+,516+,517+,74+,75+,518+,78+,79+,519+,520+,521+,82+,83+,84+,522+,523+,524+,525+,86+,87+,88+,89+,90+,91+,526+,93+,527+,95+,528+,529+,530+,531+,532+,97+,98+,99+,533+,101+,102+,103+,534+,107+,535+,536+,537+,538+,539+,112+,113+,114+,540+,541+,542+,543+,116+,117+,118+,119+,544+,121+,545+,125+,546+,127+,547+,133+,134+,135+,136+,548+,549+,138+,139+,140+,141+,550+,551+,552+,143+,553+,554+,145+,146+,147+,148+,555+,150+,556+,152+,557+,155+,156+,157+,558+,159+,160+,559+,162+,560+,164+,561+,562+,563+,166+,167+,564+,565+,566+,171+,172+,567+,568+,174+,569+,176+,570+,178+,179+,180+,571+,572+,184+,185+,186+,187+,573+,574+,189+,190+,191+,192+,575+,576+,194+,195+,577+,197+,198+,199+,578+,579+,202+,203+,580+,581+,205+,582+,583+,584+,585+,586+,587+,588+,207+,589+,590+,210+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,212+,211+,213+,214+,215+,591+,592+,217+,218+,219+,220+,593+,223+,594+,595+,225+,226+,227+,228+,596+,597+,231+,232+,233+,598+,236+,237+,238+,239+,599+,600+,601+,602+,243+,244+,245+,603+,604+,605+,247+,606+,249+,250+,251+,607+,608+,609+,610+,611+,612+,613+,614+,615+,616+,617+,618+,619+,620+,621+,622+,623+,624+,625+,626+,627+,628+,629+,630+,631+,632+,633+,634+,635+,636+,637+,638+,639+,640+,641+,642+,643+,644+,645+,646+,647+,648+,649+,650+,651+,652+,653+,654+,655+,656+,657+,658+,659+,660+,661+,662+,663+,664+,253+,665+,255+,256+,666+,258+,259+,667+,668+,261+,669+,670+,263+,264+,671+,266+,267+,268+,269+,270+,672+,673+,674+,272+,675+,276+,676+,677+,678+,280+,679+,282+,283+,284+,680+,286+,287+,288+,681+,682+,683+,684+,291+,292+,293+,68
5+,686+,295+,687+,297+,298+,299+,688+,301+,689+,690+,691+,303+,304+,692+,693+,694+,695+,696+,307+,308+,697+,698+,699+,310+,700+,312+,313+,314+,701+,702+,317+,318+,319+,703+,321+,704+,323+,324+,325+,705+,328+,706+,330+,331+,707+,333+,708+,709+,710+,711+,335+,336+,712+,338+,339+,340+,341+,713+,343+,344+,345+,346+,714+,348+,349+,715+,351+,716+,353+,354+,717+,718+,357+,358+,719+,360+,720+,363+,364+,721+,374+,722+,723+,724+,725+,726+,727+,728+,729+,730+,376+,377+,731+,732+,733+,379+,734+,382+,383+,384+,735+,736+,386+,387+,388+,737+,390+,391+,392+,393+,738+,396+,397+,739+,399+,740+,741+,402+,403+,742+,743+,744+,406+,407+,408+,409+,410+,411+,745+,414+,746+,416+,417+,747+,419+,420+,748+,422+,423+,749+,750+,751+,752+,753+,425+,426+,427+,428+,429+,754+,431+,432+,433+,434+,435+,755+,437+,756+,439+,757+,441+,442+,443+,758+,445+,446+,759+,760+,761+,448+,449+,450+,762+,452+,453+,763+,456+,764+,765+,458+,459+,460+,766+,463+,464+,465+,767+,768+,769+,469+,470+,471+,770+,771+,772+,773+,774+,474+,475+,476+,477+,478+,479+,775+,776+,483+,777+,485+,778+,779+,488+,780+,490+ *
This would explain why the plots are exactly the same.
Ah, so maybe ALIBI is not updating the node identifiers and the steps in the paths.
Hi,
ALIBI does not modify node identifiers. It changes the node order in the GFA file, but the node identifiers remain unchanged. The new order is given by the order of the 'S' lines in the sorted GFA file.
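To make that concrete, here is a minimal sketch of how the new order could be read back from the sorted GFA: collect the identifiers of the 'S' lines in the order they appear. The function name `segment_order` is hypothetical and not part of ALIBI or ODGI, and the sketch assumes a tab-separated GFA 1.0 file. The example identifiers are taken from the "After" listing in this thread, with the sequences shortened.

```python
def segment_order(gfa_lines):
    """Return node identifiers in the order their S lines appear.

    Hypothetical helper for illustration; assumes tab-separated GFA 1.0.
    """
    order = []
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        # S lines have the form: S <identifier> <sequence> [tags...]
        if fields[0] == "S" and len(fields) >= 2:
            order.append(fields[1])
    return order


# Shortened excerpt modelled on the sorted GFA shown in this thread.
sorted_gfa = [
    "H\tVN:Z:1.0\n",
    "S\t781\tATTTTTAACTCCATG\n",
    "S\t1\tAT\n",
    "S\t2\tTTAACTCCATC\n",
    "S\t491\tGGTACAGG\n",
    "P\texample_path\t1+,2+,491+\t*\n",
]

# One identifier per line, in file order.
print("\n".join(segment_order(sorted_gfa)))
```

Written to a file, one identifier per line, such a list is presumably what a tool would need in order to reuse the ALIBI order; whether a given tool accepts exactly this layout is an assumption to check against its documentation.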
Hi @anialisiecka,
thanks for the clarification.
When ODGI sorts its nodes, it updates their node identifiers, and then the edges and paths of the graph. That's why it didn't work here. However, odgi sort has the -s,--sort-order option, with which we were able to update our graph using the sort from ALIBI's GFA. @AndreaGuarracino
I think your way of handling the new node order can be confusing for programs, at least for the ones I have worked with so far. They do not care about how the nodes are ordered in the file, only about the node identifiers.
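For illustration, here is a rough sketch of what "updating the node identifiers, then the edges and paths" could look like when the desired order is encoded only by the position of the 'S' lines. This is not ODGI's or ALIBI's implementation. The function name `renumber_by_file_order` is hypothetical, and the sketch assumes tab-separated GFA 1.0 with only H, S, L and P lines.

```python
def renumber_by_file_order(gfa_lines):
    """Renumber nodes 1..N by S-line position and rewrite L and P lines.

    Hypothetical helper for illustration; assumes tab-separated GFA 1.0.
    """
    lines = [line.rstrip("\n") for line in gfa_lines]

    # First pass: map each old identifier to its 1-based rank in the file.
    new_id = {}
    for line in lines:
        fields = line.split("\t")
        if fields[0] == "S":
            new_id[fields[1]] = str(len(new_id) + 1)

    # Second pass: apply the mapping to segments, links and path steps.
    out = []
    for line in lines:
        fields = line.split("\t")
        if fields[0] == "S":
            fields[1] = new_id[fields[1]]
        elif fields[0] == "L":
            # L <from> <from_orient> <to> <to_orient> <overlap>
            fields[1] = new_id[fields[1]]
            fields[3] = new_id[fields[3]]
        elif fields[0] == "P":
            # P <name> <step,step,...> <overlaps>; each step ends in + or -
            steps = fields[2].split(",")
            fields[2] = ",".join(new_id[s[:-1]] + s[-1] for s in steps)
        out.append("\t".join(fields))
    return out


example = [
    "H\tVN:Z:1.0",
    "S\t781\tATTT",
    "S\t1\tAT",
    "S\t2\tTTAAC",
    "L\t1\t+\t2\t+\t0M",
    "P\texample_path\t1+,2+\t*",
]
print("\n".join(renumber_by_file_order(example)))
```

After this rewrite the identifiers themselves carry the order, so a program that looks only at node identifiers sees the sorted graph.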
Hi there @anialisiecka :)
I am applying ALIBI to a DRB1-3123 pangenome graph which was built with PGGB. I am taking the seqwish output of PGGB, as it presents the raw, unlinearized graph. It looks like this:
Then I apply ALIBI:
Which yields the exact same graph:
Am I doing something wrong? Here is the graph: DRB1-3123.fa.15a1009.2ff309f.seqwish.gfa.zip
How to read the visualization is explained in https://odgi.readthedocs.io/en/latest/rst/tutorials/exploratory_analysis.html#visualize-the-drb1-3123-graph.
Thanks for any feedback!