DyogenIBENS / Agora

Algorithm For Gene Order Reconstruction in Ancestors

contiguous ancestral regions (CARs) too small #7

Closed zhangzhiyangcs closed 1 year ago

zhangzhiyangcs commented 2 years ago

Hi, I tested three chromosome-level assemblies with a tree like this: (A:0.5,(B:1,C:1)N1:0.5)N0. AGORA ran without error, but I get many fragmented CARs, listed below. The ancestor of these species should have only 12 chromosomes. Do you have any advice?

number CARs
236 CAR_23
237 CAR_21
237 CAR_22
240 CAR_20
241 CAR_19
243 CAR_18
245 CAR_17
250 CAR_16
270 CAR_15
282 CAR_14
284 CAR_13
305 CAR_12
308 CAR_11
315 CAR_10
316 CAR_9
319 CAR_8
326 CAR_7
440 CAR_6
489 CAR_5
512 CAR_4
518 CAR_3
756 CAR_2
802 CAR_1

[Two screenshots attached]

It seems like CAR_39 should be added to CAR_2?

DyogenIBENS commented 2 years ago

Hello,

Depending on the local rearrangements in each descendant species, AGORA will do its best to detect ancestral continuity. To reconstruct the N0 ancestor of these 3 species (which I guess are plants, with a recent whole-genome duplication if they are Solanaceae), you may add more species, and probably outgroups. To linearize ancestral graphs, the algorithm needs more information on the edges to choose the right adjacency in case of bifurcations; otherwise it will prefer to break the linearization.

Moreover, can I ask which algorithm you used: agora-basic or agora-plants?

Best, Alexandra
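To illustrate the point about bifurcations, here is a minimal Python sketch of a greedy linearization of a weighted adjacency graph. It is not AGORA's actual implementation: the function name `linearize`, the `(gene_a, gene_b, weight)` input format, and the tie-breaking rule are all assumptions made for illustration. Edges are taken from the best supported to the least supported; when several equally supported edges compete for the same gene end, none of them is kept, so the path breaks there.

```python
from collections import defaultdict


def linearize(edges):
    """Greedy linearization sketch (hypothetical, not AGORA's code).

    edges: iterable of (gene_a, gene_b, weight) tuples.
    Returns a list of paths (lists of genes); each path is one CAR.
    """
    by_weight = defaultdict(list)
    nodes = set()
    for a, b, w in edges:
        by_weight[w].append((a, b))
        nodes.update((a, b))

    degree = defaultdict(int)
    parent = {}

    def find(x):
        # Union-find lookup, used to refuse edges that would close a cycle.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    kept = []
    for w in sorted(by_weight, reverse=True):
        group = by_weight[w]
        # Count how many equally weighted edges want each gene.
        demand = defaultdict(int)
        for a, b in group:
            demand[a] += 1
            demand[b] += 1
        # Keep an edge only if both genes have enough free ends for
        # every candidate of this weight; otherwise it is ambiguous.
        unambiguous = [(a, b) for a, b in group
                       if degree[a] + demand[a] <= 2
                       and degree[b] + demand[b] <= 2]
        for a, b in unambiguous:
            ra, rb = find(a), find(b)
            if ra == rb:
                continue  # would create a cycle
            parent[ra] = rb
            degree[a] += 1
            degree[b] += 1
            kept.append((a, b))

    neighbors = defaultdict(list)
    for a, b in kept:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Walk each path starting from one of its ends.
    paths, seen = [], set()
    for start in sorted(nodes):
        if start in seen or len(neighbors[start]) > 1:
            continue
        path, prev, cur = [], None, start
        while cur is not None:
            path.append(cur)
            seen.add(cur)
            nxt = [n for n in neighbors[cur] if n != prev]
            prev, cur = cur, (nxt[0] if nxt else None)
        paths.append(path)
    return paths


# Gene C has two equally supported continuations (D and E): the path breaks.
print(linearize([("A", "B", 3), ("B", "C", 3), ("C", "D", 1), ("C", "E", 1)]))
# Extra support for C-D (for example from an outgroup) resolves the bifurcation.
print(linearize([("A", "B", 3), ("B", "C", 3), ("C", "D", 2), ("C", "E", 1)]))
```

In this toy model, the only difference between the two calls is one extra unit of support on the C-D edge, which is the kind of information additional species or outgroups could contribute.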

zhangzhiyangcs commented 2 years ago

Hi, thanks for your advice. I tried agora-plants, which works better than agora-basic. In my study, I want to reconstruct the ancestral assembly of tomato and eggplant based on recently published assemblies. Some of these assemblies have not reached chromosome level and have an N50 of about 5 Mb. Should I scaffold them to chromosome level first? And if I add outgroups such as Convolvulaceae, which is related to the Solanaceae, and more tomato and eggplant assemblies (nearly 40 assemblies), will that give better ancestral assemblies?

Besides, I have a small question. My species tree is (((Capsicum_chinense.pep:0.0252409,Capsicum_annuum.pep:0.0240399)N3:0.0175153,Capsicum_baccatum.pep:0.0282492)N1:0.0466179,((Solanum_melongena.pep:0.0746262,Solanum_stramoniifolium.pep:0.0471138)N4:0.0307135,(Solanum_septemlobum.pep:0.0572046,((S_habrochaites.pep:0.0373641,((S_peruvianum.pep:0.00923837,S_corneliomulleri.pep:0.0115695)N12:0.00782262,((S_chmielewskii.pep:0.00604881,S_neorickii.pep:0.00387453)N16:0.00643749,((TS112.pep:0.000391208,(TS12.pep:0.000455136,(TS166.pep:0.000311636,(TS171.pep:0.000462916,(TS185.pep:0.00132264,((TS222.pep:0.00165556,(TS281.pep:0.000662984,(TS3.pep:0.000966943,(TS331.pep:0.00122259,(TS39.pep:0.00134176,(TS413.pep:0.00293421,(TS421.pep:0.00251827,(TS60.pep:0.000387697,(TS623.pep:0.00124921,(TS629.pep:0.000523416,(TS692.pep:0.000930574,(TS80.pep:0.00125964,(TS96.pep:0.00185362,TS9.pep:0.000536475)N88:0.000241379)N87:5.16774e-05)N86:1.4978e-10)N85:1.41343e-06)N84:0)N83:1.41343e-06)N81:1.41343e-06)N79:1.41343e-06)N77:0)N75:0)N71:0)N63:0)N57:0,TS204.pep:0.00104421)N51:0)N46:1.41343e-06)N40:0)N33:0)N26:1.41343e-06)N21:1.41343e-06,(((PI303721.pep:0.000793696,PI169588.pep:0.00105927)N34:0.00377143,(MM.pep:0.000631467,(LYC1410.pep:0.000602446,(LA2093.pep:0.00362734,((Fla8924.pep:0.000702611,(EA00990.pep:0.000730518,(EA00371.pep:0.000815361,((BGV007989.pep:0.000645467,(BGV007931.pep:0.00169789,(BGV006775.pep:0.00202178,BGV006865.pep:0.00315751)N82:0.000416142)N80:1.70643e-05)N78:2.28092e-08,Brandywine.pep:0.000949199)N76:7.4813e-11)N72:6.72043e-11)N64:1.41343e-06)N58:2.92826e-11,Floradade.pep:0.000640548)N52:3.0303e-11)N47:4.03226e-11)N41:4.20168e-11)N35:4.24628e-11)N27:0,(PP.pep:0.00287131,PAS014479.pep:0.00307313)N28:0.00463748)N22:0)N17:0.0100754)N13:0.0046876)N9:0.00865879)N7:0.0269158,((ETB_C347:0.0131824,ETB_C351:0.0185872)N10:0.0218274,((C514:0.0275098,((C447:0.015782,C356:0.0169236)N23:0.00968925,C656:0.0234373)N18:0.00588131)N14:0.00647229,((((C450:0.0191488,C390:0.0176664)N29:0.00763655,((C522:0.0176182,C369:0.018946)N36:0.0101215,C813:0.0187475)N30:0.00628604)N24:0.00542426,C361:0.0198491)N19:0.00645283,(C550:0.0209802,((((C031:0.0070568,C580:0.0136637)N42:0.00884751,(((C574:0.0130549,C399:0.0127784)N53:0.00867931,(C419:0.00876492,C454:0.00974861)N54:0.00575909)N48:0.00534487,(((C337:0.00969067,C115:0.00714451)N59:0.00796747,((C426:0.00437438,DM_best.pep:0.00426467)N65:0.00226081,((C118:0.00582822,C058:0.00455901)N73:0.00635447,(C190:0.00387392,C219:0.00716368)N74:0.00488972)N66:0.00443546)N60:0.00380564)N55:0.00451725,(((C005:0.00609156,C098:0.00656238)N67:0.00502117,(C151:0.00537252,C174:0.00584118)N68:0.00467373)N61:0.0113463,((A157:0.00431011,C001:0.00521146)N69:0.00132905,(C056:0.0068361,C093:0.00630395)N70:0.00699816)N62:0.000614525)N56:1.199e-05)N49:0.00804961)N43:0.00594414)N37:0.00158365,(C373:0.0174046,C382:0.0154129)N38:0.00976102)N31:0.00353089,((((C408:0.0121938,C533:0.0108567)N50:0.00635865,C872:0.0138205)N44:0.0044298,(C554:0.0128995,C552:0.0140432)N45:0.00610672)N39:0.00570481,C559:0.0216849)N32:0.0035472)N25:0.00636359)N20:0.00580869)N15:0.00582276)N11:0.00767319)N8:0.0088454)N6:0.0263617)N5:0.0138781)N2:0.0466179)N0.

If I only care about the N53 ancestor, is there a parameter that lets me skip preparing the other orthologyGroups.%s.list.bz2 and genes.%s.list.bz2 files?

muffato commented 2 years ago

Hi @zhangzhiyangcs

Generally speaking, the more species you give AGORA, the better the reconstructions will be. If you only want to reconstruct the N53 ancestor, add -target==N53 to the command line (make sure you run the latest dev branch, as I have fixed a bug in this option: #8).

Best, Matthieu
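For anyone who wants to check which species sit below a given ancestor such as N53 before running a targeted reconstruction, here is a small Python sketch. It is not part of AGORA: `parse_newick`, `leaves` and `leaves_under` are hypothetical helper names, and the parser only handles plain Newick with optional branch lengths (no quoted labels or comments). It only lists the descendants of the named node and says nothing about which input files AGORA itself requires.

```python
def parse_newick(text):
    """Parse a plain Newick string into nested (name, children) tuples."""
    text = text.strip().rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # skip "("
            children.append(node())
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(node())
            pos += 1  # skip ")"
        # Read the label, e.g. "N1:0.5", and drop the branch length.
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        name = text[start:pos].split(":")[0]
        return name, children

    return node()


def leaves(tree):
    """Return the leaf names of a parsed tree, left to right."""
    name, children = tree
    if not children:
        return [name]
    return [leaf for child in children for leaf in leaves(child)]


def leaves_under(tree, target):
    """Return the leaves below the node named `target`, or None if absent."""
    name, children = tree
    if name == target:
        return leaves(tree)
    for child in children:
        found = leaves_under(child, target)
        if found is not None:
            return found
    return None


# The small tree from the first comment of this thread.
tree = parse_newick("(A:0.5,(B:1,C:1)N1:0.5)N0;")
print(leaves_under(tree, "N1"))
```

With the full species tree from this thread, `leaves_under(tree, "N53")` would list the species descending from N53.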