lexibank / lsi

CLDF dataset derived from Grierson's "Linguistic Survey of India" from 1928
https://lsi.clld.org
Creative Commons Attribution 4.0 International

Reference Tree compilation #18

Closed by PhyloStar 3 years ago

PhyloStar commented 4 years ago

@xrotwang and @lingulist: as of now, the tip labels in the NEXUS files come from the slugged language names in the CLDF code. The Indo-European NEXUS file is here:

https://github.com/lexibank/lsi/blob/master/computed/Indo-European_lexstat_infomap.paps.nex

Is there any way to obtain the reference trees (using the pyglottolog package) with tip labels that match the slugged language names? That would let me compare trees directly, without editing the reference tree manually. Hand-editing the tree file for 100 glottocodes would be tedious.

xrotwang commented 3 years ago

I'll have a look. Should be easy enough. So you'd want the Glottolog tree pruned to the tips in LSI, with LSI names as tip labels, correct?

PhyloStar commented 3 years ago

Yes. That is right.
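
For illustration, the pruning and relabeling agreed on above could be sketched roughly as follows. This is not the code used in the repository: it works on a toy tree of nested tuples rather than on pyglottolog objects, the glottocodes and names are made-up placeholders, and it assumes every dataset language is a leaf of the classification (in Glottolog a language can also be an internal node).

```python
# Toy classification as nested tuples: (label, [children]).
# All codes and names below are hypothetical placeholders.
tree = (
    "fam",
    [
        ("grpA", [("aaaa1111", []), ("bbbb2222", [])]),
        ("grpB", [("cccc3333", []), ("dddd4444", [])]),
    ],
)

# Mapping from glottocode to the slugged dataset name used as tip label.
code_to_name = {"aaaa1111": "lang_a", "bbbb2222": "lang_b", "cccc3333": "lang_c"}


def prune(node, keep):
    """Return a copy of `node` restricted to leaves in `keep`, or None."""
    label, children = node
    if not children:
        return (label, []) if label in keep else None
    kept = [c for c in (prune(child, keep) for child in children) if c is not None]
    if not kept:
        return None
    if len(kept) == 1:
        # Collapse internal nodes left with a single child.
        return kept[0]
    return (label, kept)


def to_newick(node, rename):
    """Serialize the tree, writing renamed tips and unlabeled internal nodes."""
    label, children = node
    if not children:
        return rename[label]
    return "(" + ",".join(to_newick(child, rename) for child in children) + ")"


def reference_tree(tree, code_to_name):
    """Prune to the dataset's glottocodes and relabel tips with dataset names."""
    pruned = prune(tree, set(code_to_name))
    return (to_newick(pruned, code_to_name) if pruned else "") + ";"


print(reference_tree(tree, code_to_name))
```

Collapsing single-child nodes keeps the pruned tree free of redundant internal nodes, which tree-comparison tools generally expect.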

xrotwang commented 3 years ago

glottolog_reference_tree.txt

xrotwang commented 3 years ago

@PhyloStar note that I had to replace braces in LSI names with underscores, see https://github.com/lexibank/lsi/blob/3072a0ba766caf22633e285597c218af340cd113/lsicommands/reference_tree.py#L28
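
A minimal sketch of this kind of label cleanup, for readers who need to apply the same change to their own tip labels. The function name `newick_safe` and the exact set of characters are assumptions made here; the linked line in `reference_tree.py` is the authoritative version. Brackets have to go because Newick uses parentheses for tree structure and square brackets for comments.

```python
import re


def newick_safe(name):
    """Replace bracket characters with underscores (hypothetical helper).

    The character set is an assumption for illustration; check the linked
    source line for what the actual command replaces.
    """
    return re.sub(r"[()\[\]{}]", "_", name)


# Hypothetical language name, not taken from the dataset.
print(newick_safe("name(variant)"))
```

The same replacement has to be applied to the tip labels on the NEXUS side, otherwise the two label sets will not match.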

PhyloStar commented 3 years ago

Okay. Thanks. I will try to use it and report any issues. I think things are moving fast.