biocore / empress

A fast and scalable phylogenetic tree viewer for microbiome data analysis
BSD 3-Clause "New" or "Revised" License

Error when running with greengenes taxonomy files, integer ID field causes type issue #531

Open mstapylton opened 3 years ago

mstapylton commented 3 years ago

An ID column containing only integers causes the dataframe index to be of int type, which in turn causes the index.intersection code in tool.py to fail:

[Screenshot: Screen Shot 2021-07-01 at 3 40 03 PM]

Attachment: reproduce-files.zip

Here's the fix I put in, where I cast the index to type string:

[Screenshot: Screenshot from 2021-07-13 10-39-40]
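For readers without access to the screenshots, here is a minimal sketch of the kind of mismatch described above. It is not the code from tool.py or from the screenshots: the variable names, the ID values, and the taxonomy strings are all made up for illustration, and the only assumption about Empress is the one stated in the report, namely that an integer-typed index gets intersected with string names.

```python
import pandas as pd

# Hypothetical feature metadata whose ID column holds only integers
# (made-up IDs); the resulting index has an integer dtype.
taxonomy = pd.DataFrame(
    {"Taxon": ["k__Bacteria", "k__Archaea"]},
    index=pd.Index([4479944, 1050608], name="Feature ID"),
)

# Hypothetical node names from a tree; these are strings.
tree_names = pd.Index(["4479944", "1050608", "229854"])

# Integer labels never equal string labels, so no IDs are shared.
# The except branch is a precaution in case a pandas version raises
# on mixed dtypes instead of returning an empty index.
try:
    shared_before = len(taxonomy.index.intersection(tree_names))
except TypeError:
    shared_before = 0

# The fix described in the report: cast the index to strings first.
taxonomy.index = taxonomy.index.astype(str)
shared_after = taxonomy.index.intersection(tree_names)

print(shared_before)
print(sorted(shared_after))
```

Casting with `astype(str)` after loading is one option; reading the ID column as strings in the first place would avoid the integer index altogether, though which approach suits Empress is not something this thread settles.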

ElDeveloper commented 3 years ago

Thanks @mstapylton, would you be able to submit a pull request with these changes and a new unit test?

mstapylton commented 3 years ago

Sure, I think I can carve out some time next week to put together a PR. I work in the Clemente lab by the way.

ElDeveloper commented 3 years ago

That is absolutely wonderful to hear. Thanks so much, let us know if you need any help, and say hi to Jose! 👋


fedarko commented 3 years ago

Thank you for the detailed information and bug report, @mstapylton!

I downloaded the files you uploaded and tried running Empress with them. I don't know if this will help with the PR / testing, but the behavior I get (using the latest version of Empress as of today, on my system) seems a bit different from the error message in your screenshot. I'm recording these differences below, just in case it's helpful to you or to future users.

Anyway -- it's not a huge deal or anything, but hopefully this provides some extra context if these problems were coming up during creation of the PR / testing. Thank you again for raising this issue.