23andMe / yhaplo

Identifying Y-chromosome haplogroups in arbitrarily large samples of sequenced or genotyped men

Updating ISOGG input #20

Closed anninakarolina closed 2 years ago

anninakarolina commented 2 years ago

Hi, thanks for the useful tool! I was wondering whether you have considered updating the ISOGG input to match the newest Y-haplogroup tree (2019-2020). The newest release has a huge increase in markers and updated haplogroups compared to the 2016 version.

dpoznik commented 2 years ago

Hi Annina, I'm glad you're finding yhaplo useful!

I would, eventually, like to update the input files to use a more recent ISOGG release. Constructing these files should be straightforward, but I'm not sure when I'll have the bandwidth to conduct a validation analysis. In the meantime, I do have a branch with some partial work identifying and correcting some allele errors in the 2016 ISOGG release, and I hope to find time to complete that at some point this year.

Herst commented 2 years ago

Any updates? I see the other branch hasn't been updated for more than a year.

(BTW, if a clear classification cannot be made, how about returning a list of possible candidates instead?)

dpoznik commented 2 years ago

> Any updates? I see the other branch hasn't been updated for more than a year.

@Herst, thanks for your interest in yhaplo! Unfortunately, I haven't yet had a chance to finish off the work identifying and correcting the allele errors from the 2016 ISOGG release. I am hoping to do so at some point in the next two to three months.

> (BTW, if a clear classification cannot be made, how about returning a list of possible candidates instead?)

If there isn't enough information to progress past a particular node of the haplogroup tree, the haplogroup corresponding to that node is returned. All haplogroups descending from this internal node are possibilities. One could use the tree-traversal utilities in tree.py to list these out. Hope that helps.
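
To illustrate the idea of listing the candidates below an internal node, here is a minimal, self-contained sketch. It does not use yhaplo's actual classes or the utilities in tree.py: `Node`, `iter_descendants`, and `candidate_haplogroups` are hypothetical names, and the toy tree uses made-up labels rather than real ISOGG haplogroups.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """Hypothetical stand-in for a haplogroup tree node (not yhaplo's class)."""

    haplogroup: str
    children: List["Node"] = field(default_factory=list)


def iter_descendants(node: Node) -> Iterator[Node]:
    """Yield every node below `node` in depth-first pre-order."""
    for child in node.children:
        yield child
        yield from iter_descendants(child)


def candidate_haplogroups(node: Node) -> List[str]:
    """Return the node's own haplogroup followed by all descendant haplogroups."""
    return [node.haplogroup] + [d.haplogroup for d in iter_descendants(node)]


# Toy tree with made-up labels: X has children X1 and X2; X1 has X1a and X1b.
toy = Node("X", [Node("X1", [Node("X1a"), Node("X1b")]), Node("X2")])

# If classification stopped at X1, the possibilities are X1 and everything below it.
print(candidate_haplogroups(toy.children[0]))  # ['X1', 'X1a', 'X1b']
```

In practice, one would start from the node that yhaplo returned for a sample and walk its subtree with whatever traversal tree.py provides; the recursion above only shows the general shape of such a walk.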