yukiregista / ConsensusProj

MIT License

Annotating multiple support values #20

Open yukiregista opened 7 months ago

yukiregista commented 7 months ago
atsuhmd commented 7 months ago

Regarding the newick output described in the ETE3 documentation: it may not support multiple support values. http://etetoolkit.org/docs/latest/tutorial/tutorial_trees.html?highlight=newick#reading-and-writing-newick-trees

Wikipedia mentions that MrBayes and BEAST extend the format to record probabilities, so that might be usable. https://en.wikipedia.org/wiki/Newick_format

yukiregista commented 7 months ago

@atsuhmd Thank you! We don't necessarily have to stick to newick (or rather, annotation may be difficult with newick?), so,

atsuhmd commented 7 months ago

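(I may not get much done this week because of my JSPS fellowship renewal and similar tasks.) So the requirement is that we can output a tree carrying multiple support values, and that the output can be read back in as input. I'll look into it a bit and report back. I haven't researched this at all, so it's only intuition, but I feel it would be good to also be able to output newick or NEXUS after selecting which information to keep.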

atsuhmd commented 6 months ago

Input and output in NeXML format appears to be possible with both DendroPy and ETE. https://dendropy.readthedocs.io/en/main/primer/working_with_metadata_annotations.html http://etetoolkit.org/docs/latest/tutorial/tutorial_nexml.html

TODO:
1. Try both and see which is better.
2. Look into how to output newick or NEXUS by narrowing the support down to a single value.

atsuhmd commented 6 months ago

On the 19-plotの改善 branch, I added NeXML output and reading. Writing with DendroPy looked simple and adequate, so I implemented it that way. (The reading and output parts do not depend on ETE3.)

atsuhmd commented 6 months ago

2. Look into how to output newick or NEXUS by narrowing the support down to a single value.

A hack such as assigning the support to the edge length and then writing newick seems feasible:

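# Hack: overwrite each edge length with that edge's branch support,
# so that a plain newick writer emits the support in the length slot.
for edge in majority.postorder_edge_iter():
    edge.length = majority.branch_support[int(edge.bipartition)]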

ETE3's newick writer handles up to one support value cleanly, writing it on internal nodes in the form support:length.
((D:0.723274,F:0.567784)1.000000:0.067192,(B:0.279326,H:0.756049)1.000000:0.807788);
http://etetoolkit.org/docs/latest/tutorial/tutorial_trees.html#reading-and-writing-newick-trees
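To illustrate TODO item 2, here is a minimal sketch of keeping only one of several support values and writing it in the support:length form shown above. The `Node` class, the `to_newick` helper, and the support values 0.828 and 0.901 are hypothetical and only for illustration; they are not part of DendroPy, ETE3, or this project.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical minimal tree node (not a DendroPy or ETE3 class)."""
    name: str = ""
    length: Optional[float] = None
    supports: Dict[str, float] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


def to_newick(root: Node, support_key: str) -> str:
    """Write plain newick, keeping only the support named by support_key."""
    def fmt(node: Node) -> str:
        if node.children:
            text = "(" + ",".join(fmt(child) for child in node.children) + ")"
            # Internal node label: the single selected support value.
            if support_key in node.supports:
                text += f"{node.supports[support_key]:.6f}"
        else:
            text = node.name
        if node.length is not None:
            text += f":{node.length:.6f}"
        return text
    return fmt(root) + ";"


# Leaf names and lengths follow the ETE3 example; supports are made up.
tree = Node(children=[
    Node(children=[Node("D", 0.723274), Node("F", 0.567784)],
         length=0.067192,
         supports={"branch_support": 0.828, "transfer_support": 0.901}),
    Node(children=[Node("B", 0.279326), Node("H", 0.756049)],
         length=0.807788,
         supports={"branch_support": 1.0, "transfer_support": 1.0}),
])

print(to_newick(tree, "branch_support"))
print(to_newick(tree, "transfer_support"))
```

Calling the writer once per support name gives one plain newick string per support type, which avoids relying on reader-specific comment extensions at the cost of producing several files.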

atsuhmd commented 6 months ago

Writing the supports into newick with DendroPy is possible, but we need to investigate which software can read this kind of newick.
majority.as_string(schema='newick',suppress_annotations=False)

'[&U] (S1:1.0[&branch_support=1.0,transfer_support=1.0],S2:1.0[&branch_support=1.0,transfer_support=1.0],(S3:1.0[&branch_support=1.0,transfer_support=1.0],((S7:1.0[&branch_support=1.0,transfer_support=1.0],S8:1.0[&branch_support=1.0,transfer_support=1.0],((S11:1.0[&branch_support=1.0,transfer_support=1.0],S12:1.0[&branch_support=1.0,transfer_support=1.0]):0.989[&branch_support=0.989,transfer_support=0.989],(S9:1.0[&branch_support=1.0,transfer_support=1.0],S10:1.0[&branch_support=1.0,transfer_support=1.0]):0.885[&branch_support=0.885,transfer_support=0.885]):0.828[&branch_support=0.828,transfer_support=0.9013333333333341],(S4:1.0[&branch_support=1.0,transfer_support=1.0],S5:1.0[&branch_support=1.0,transfer_support=1.0],S6:1.0[&branch_support=1.0,transfer_support=1.0]):0.844[&branch_support=0.844,transfer_support=0.912]):1.0[&branch_support=1.0,transfer_support=1.0],((S35:1.0[&branch_support=1.0,transfer_support=1.0],((S91:1.0[&branch_support=1.0,transfer_support=1.0],S92:1.0[&branch_support=1.0,transfer_support=1.0]):1.0[&branch_support=1.0,transfer_support=1.0],(S36:1.0[&branch_support=1.0,transfer_support=1.0],S37:1.0[&branch_support=1.0,transfer_support=1.0],S40:1.0[&branch_support=1.0,transfer_support=1.0],S41:1.0[&branch_support=1.0,transfer_support=1.0],S42:1.0[&branch_support=1.0,transfer_support=1.0],S43:1.0[&branch_support=1.0,transfer_support=1.0],S44:1.0[&branch_support=1.0,transfer_support=1.0],S45:1.0[&branch_support=1.0,transfer_support=1.0],S48:1.0[&branch_support=1.0,transfer_support=1.0],(S38:1.0[&branch_support=1.0,transfer_support=1.0],S39:1.0[&branch_support=1.0,transfer_support=1.0]):0.951[&branch_support=0.951,transfer_support=0.951],(S61:1.0[&branch_support=1.0,transfer_support=1.0],S62:1.0[&branch_support=1.0,transfer_support=1.0],S63:1.0[&branch_support=1.0,transfer_support=1.0],S64:1.0[&branch_support=1.0,transfer_support=1.0],S65:1.0[&branch_support=1.0,transfer_support=1.0],S66:1.0[&branch_support=1.0,transfer_support=1.0],S67:1.0[&branch_
support=1.0,transfer_support=1.0],S68:1.0[&branch_support=1.0,transfer_support=1.0]):1.0[&branch_support=1.0,transfer_support=1.0],(S69:1.0[&branch_support=1.0,transfer_support=1.0],S70:1.0[&branch_support=1.0,transfer_support=1.0],S71:1.0[&branch_support=1.0,transfer_support=1.0],S72:1.0[&branch_support=1.0,transfer_support=1.0],(S73:1.0[&branch_support=1.0,transfer_support=1.0],(S74:1.0[&branch_support=1.0,transfer_support=1.0],S75:1.0[&branch_support=1.0,transfer_support=1.0]):0.935[&branch_support=0.935,transfer_support=0.935]):0.99[&branch_support=0.99,transfer_support=0.995],((S84:1.0[&branch_support=1.0,transfer_support=1.0],S85:1.0[&branch_support=1.0,transfer_support=1.0],(S81:1.0[&branch_support=1.0,transfer_support=1.0],S82:1.0[&branch_support=1.0,transfer_support=1.0],S83:1.0[&branch_support=1.0,transfer_support=1.0]):0.987[&branch_support=0.987,transfer_support=0.9915]):0.98[&branch_support=0.98,transfer_support=0.99125],(S86:1.0[&branch_support=1.0,transfer_support=1.0],S87:1.0[&branch_support=1.0,transfer_support=1.0],S88:1.0[&branch_support=1.0,transfer_support=1.0],(S89:1.0[&branch_support=1.0,transfer_support=1.0],S90:1.0[&branch_support=1.0,transfer_support=1.0]):1.0[&branch_support=1.0,transfer_support=1.0]):0.958[&branch_support=0.958,transfer_support=0.9775]):0.966[&branch_support=0.966,transfer_support=0.9864444444444448],(S80:1.0[&branch_support=1.0,transfer_support=1.0],(S76:1.0[&branch_support=1.0,transfer_support=1.0],S77:1.0[&branch_support=1.0,transfer_support=1.0]):1.0[&branch_support=1.0,transfer_support=1.0],(S78:1.0[&branch_support=1.0,transfer_support=1.0],S79:1.0[&branch_support=1.0,transfer_support=1.0]):0.847[&branch_support=0.847,transfer_support=0.847]):0.596[&branch_support=0.596,transfer_support=0.84575]):0.701[&branch_support=0.701,transfer_support=0.8560000000000021],(S46:1.0[&branch_support=1.0,transfer_support=1.0],S47:1.0[&branch_support=1.0,transfer_support=1.0]):1.0[&branch_support=1.0,transfer_support=1.0],(((S52:1.0[
&branch_support=1.0,transfer_support=1.0],S53:1.0[&branch_support=1.0,transfer_support=1.0]):0.984[&branch_support=0.984,transfer_support=0.984],(S49:1.0[&branch_support=1.0,transfer_support=1.0],S50:1.0[&branch_support=1.0,transfer_support=1.0],S51:1.0[&branch_support=1.0,transfer_support=1.0]):1.0[&branch_support=1.0,transfer_support=1.0]):0.772[&branch_support=0.772,transfer_support=0.8875],(S54:1.0[&branch_support=1.0,transfer_support=1.0],S55:1.0[&branch_support=1.0,transfer_support=1.0],S56:1.0[&branch_support=1.0,transfer_support=1.0],S57:1.0[&branch_support=1.0,transfer_support=1.0],S58:1.0[&branch_support=1.0,transfer_support=1.0],S59:1.0[&branch_support=1.0,transfer_support=1.0],S60:1.0[&branch_support=1.0,transfer_support=1.0]):0.762[&branch_support=0.762,transfer_support=0.8851666666666674]):0.575[&branch_support=0.575,transfer_support=0.8012727272727272]):1.0[&branch_support=1.0,transfer_support=1.0]):0.813[&branch_support=0.813,transfer_support=0.9913809523809544],((S95:1.0[&branch_support=1.0,transfer_support=1.0],S96:1.0[&branch_support=1.0,transfer_support=1.0]):1.0[&branch_support=1.0,transfer_support=1.0],((S97:1.0[&branch_support=1.0,transfer_support=1.0],(S93:1.0[&branch_support=1.0,transfer_support=1.0],S94:1.0[&branch_support=1.0,transfer_support=1.0]):1.0[&branch_support=1.0,transfer_support=1.0]):0.938[&branch_support=0.938,transfer_support=0.969],(S98:1.0[&branch_support=1.0,transfer_support=1.0],(S99:1.0[&branch_support=1.0,transfer_support=1.0],S100:1.0[&branch_support=1.0,transfer_support=1.0]):0.946[&branch_support=0.946,transfer_support=0.946]):0.593[&branch_support=0.593,transfer_support=0.793]):0.672[&branch_support=0.672,transfer_support=0.8820000000000039]):1.0[&branch_support=1.0,transfer_support=1.0]):0.924[&branch_support=0.924,transfer_support=0.9898787878787889],(S22:1.0[&branch_support=1.0,transfer_support=1.0],S27:1.0[&branch_support=1.0,transfer_support=1.0],(S13:1.0[&branch_support=1.0,transfer_support=1.0],S14:1.0[&branch
_support=1.0,transfer_support=1.0],S15:1.0[&branch_support=1.0,transfer_support=1.0],S18:1.0[&branch_support=1.0,transfer_support=1.0],S19:1.0[&branch_support=1.0,transfer_support=1.0],(S16:1.0[&branch_support=1.0,transfer_support=1.0],S17:1.0[&branch_support=1.0,transfer_support=1.0]):0.924[&branch_support=0.924,transfer_support=0.924]):0.919[&branch_support=0.919,transfer_support=0.9538333333333336],(S34:1.0[&branch_support=1.0,transfer_support=1.0],(S28:1.0[&branch_support=1.0,transfer_support=1.0],S29:1.0[&branch_support=1.0,transfer_support=1.0],S30:1.0[&branch_support=1.0,transfer_support=1.0],S31:1.0[&branch_support=1.0,transfer_support=1.0],S32:1.0[&branch_support=1.0,transfer_support=1.0],S33:1.0[&branch_support=1.0,transfer_support=1.0]):0.66[&branch_support=0.66,transfer_support=0.9319999999999945]):1.0[&branch_support=1.0,transfer_support=1.0],(S23:1.0[&branch_support=1.0,transfer_support=1.0],S24:1.0[&branch_support=1.0,transfer_support=1.0],S25:1.0[&branch_support=1.0,transfer_support=1.0],S26:1.0[&branch_support=1.0,transfer_support=1.0]):0.56[&branch_support=0.56,transfer_support=0.7783333333333323],(S20:1.0[&branch_support=1.0,transfer_support=1.0],S21:1.0[&branch_support=1.0,transfer_support=1.0]):0.834[&branch_support=0.834,transfer_support=0.834]):1.0[&branch_support=1.0,transfer_support=1.0]):0.978[&branch_support=0.978,transfer_support=0.9940000000000004]):0.982[&branch_support=0.982,transfer_support=0.991]):1.0[&branch_support=1.0,transfer_support=1.0]):1.0[&branch_support=1.0,transfer_support=1.0];\n'
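As a rough check that the annotated output above can be read back in, the sketch below pulls the `[&key=value,...]` comments out of such a string using only the standard library. The `parse_annotations` helper and the shortened `sample` string are assumptions made for illustration; the code does not parse the tree topology, and it does not handle values that themselves contain commas.

```python
import re
from typing import Dict, List

# Matches a comment such as [&branch_support=0.989,transfer_support=0.989]
ANNOTATION = re.compile(r"\[&([^\]]*)\]")


def parse_annotations(newick: str) -> List[Dict[str, float]]:
    """Return one dict per [&key=value,...] comment, in order of appearance."""
    result = []
    for body in ANNOTATION.findall(newick):
        pairs = [item.split("=", 1) for item in body.split(",") if "=" in item]
        if not pairs:
            # Skip comments without key=value pairs, e.g. the [&U] rooting token.
            continue
        result.append({key: float(value) for key, value in pairs})
    return result


# Shortened sample in the same style as the output above.
sample = (
    "[&U] ((S11:1.0[&branch_support=1.0,transfer_support=1.0],"
    "S12:1.0[&branch_support=1.0,transfer_support=1.0])"
    ":0.989[&branch_support=0.989,transfer_support=0.989]);"
)

for annotation in parse_annotations(sample):
    print(annotation)
```

This only shows that both support values are recoverable from the text; whether a given tree viewer or library attaches them to the right edges still has to be tested per tool.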