avaughn271 / AdmixtureBayes

GNU General Public License v3.0

Population Name Format Issue #1

Closed: AlecJacobsen closed this issue 2 years ago

AlecJacobsen commented 2 years ago

Hello,

Thanks for coming up with AdmixtureBayes! It's great to see a Bayesian approach to inferring admixture graphs.

The analyzeSamples.py script seems unable to handle special characters in population names, though. When my population names include a hyphen, I get the error "AssertionError: Either the outgroup name was not specified or something is seriously wrong because the number of nodes did not match the size of the trees". When I removed the hyphens, the script ran perfectly.

Thanks!
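
For anyone hitting the same error, the workaround described above (removing the hyphens) can be scripted. The following is only a minimal sketch: `sanitize_population_names` is a hypothetical helper, not part of AdmixtureBayes, and it assumes the population names have already been read into a Python list. It strips every character that is not an ASCII letter or digit and refuses to continue if two names would become identical.

```python
import re


def sanitize_population_names(names):
    """Map each original name to a version with only ASCII letters and digits.

    Hypothetical helper for illustration; not part of AdmixtureBayes.
    """
    mapping = {}
    used = set()
    for name in names:
        # Drop hyphens, underscores, commas, and any other non-alphanumeric character.
        cleaned = re.sub(r"[^A-Za-z0-9]", "", name)
        if not cleaned:
            raise ValueError(f"{name!r} contains no alphanumeric characters")
        if cleaned in used:
            # Two distinct inputs must not collapse onto the same cleaned name.
            raise ValueError(f"{name!r} collides with another name after cleaning")
        used.add(cleaned)
        mapping[name] = cleaned
    return mapping


if __name__ == "__main__":
    print(sanitize_population_names(["Pop-A", "Pop_B", "out"]))
```

Keeping the mapping from old to new names makes it easy to relabel the results afterwards.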

avaughn271 commented 2 years ago

Hi Alec,

Yes, AdmixtureBayes requires all population names to consist only of alphanumeric characters, so no hyphens, commas, etc. This is because special characters like hyphens are used internally as delimiters. I have a comment in the README about this under the "Input file" heading. I understand how this can be confusing, as the problem does not show up until the 2nd step of the algorithm, analyzeSamples, even though it is caused by improper input to the 1st step, runMCMC. I have just added a check on the population names in the 1st step with a helpful error message. Hopefully this will make the problem less confusing in the future. Let me know if you encounter any other issues with AdmixtureBayes. I'm always happy to make it easier to use.

-Andrew
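
As an illustration of the kind of early check described above, a validation step might look like the sketch below. This is not the actual check added to runMCMC: the function name `check_population_names` and the wording of the error message are assumptions made for the example. The idea is simply to reject any name containing a non-alphanumeric character before sampling starts, so the failure appears in the first step rather than in analyzeSamples.

```python
def check_population_names(names):
    """Raise ValueError if any name contains a non-alphanumeric character.

    Hypothetical sketch; the real check in AdmixtureBayes may differ.
    """
    # isascii() excludes accented letters, which isalnum() alone would accept.
    bad = [n for n in names if not (n.isascii() and n.isalnum())]
    if bad:
        raise ValueError(
            "Population names must contain only letters and digits; "
            "offending names: " + ", ".join(repr(n) for n in bad)
        )


if __name__ == "__main__":
    check_population_names(["PopA", "PopB", "out"])  # passes silently
    try:
        check_population_names(["Pop-A", "PopB"])
    except ValueError as err:
        print(err)
```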

AlecJacobsen commented 2 years ago

Thank you very much!