Closed hagenjp closed 9 months ago
Hey @hagenjp - couple of things came up while I was testing. Quick overview below, but let's meet on Monday to work on this together!
1. When running `core-metrics` using the steps listed in the original issue, the following error is raised: (screenshot)
2. When running `core-metrics-phylogenetic`, the `ignore_missing_samples` parameter is not recognized.
Offhand, I'm not sure why 1. is happening - but I know why 2. is happening. These will be good things to pair on, and this is a great example of why we always test locally before merging - all of the existing tests pass, but this functionality isn't working like we want it to 🙃
Okay so for posterity - @hagenjp and I met to discuss the `index cannot be a set` error shown in the screenshot above.
The issue seems to be coming from the `_validate_metadata` helper step within emperor, specifically when `index` is assigned to the `difference` variable, which is assumed to be a set. pandas >=1.5 does not allow sets to be used in indices or columns, since they are not 'array-like' - which was a breaking change.
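For anyone following along, here is a rough sketch of the pattern involved. It is not emperor's actual code: `missing_ids` and `placeholder_frame` are hypothetical names made up for illustration. The idea is just that a set difference gets turned into a list before it is handed to pandas as an index.

```python
def missing_ids(expected_ids, observed_ids):
    """Return the IDs present in expected_ids but absent from observed_ids.

    The set difference is converted to a sorted list so the result is
    array-like and can be used as a pandas index.
    """
    difference = set(expected_ids) - set(observed_ids)
    return sorted(difference)


def placeholder_frame(expected_ids, observed_ids, columns):
    """Build an empty DataFrame indexed by the missing IDs (hypothetical helper)."""
    import pandas as pd  # imported here so missing_ids() works without pandas

    # Passing the raw set as index= is what newer pandas rejects;
    # passing the list form works on older and newer versions alike.
    return pd.DataFrame(
        index=missing_ids(expected_ids, observed_ids),
        columns=list(columns),
    )
```

Calling `placeholder_frame(["s1", "s2", "s3"], ["s2"], ["treatment"])` should give a frame indexed by the two missing IDs. Sorting is only there to make the order deterministic; a plain `list(difference)` would satisfy pandas just as well.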
We tested that the error was related to this change in pandas 1.5 by creating a QIIME 2 core 2022.8 environment (our latest release prior to upgrading to pandas 1.5). One side note is that we had to set the conda `--solver` flag to `classic` in order to utilize our 2022.8 environment file (the libmamba solver couldn't solve the environment). That's a separate issue to discuss, but adding here for the record.
Once in a 2022.8 environment, we created a local branch for q2-diversity based off of the 2022.8 release commit and then cherry-picked @hagenjp's commits from this PR to ensure that we were working in a 2022.8 version of q2-diversity that also included the changes from this PR. We then re-ran `core-metrics` with the steps described in the associated issue, which were successful.
Next steps are to determine if it's possible to get around this issue in q2-emperor, or if we need to hold off on this PR until the necessary changes are made in emperor (if this is the case, we'll open an issue there).
@hagenjp I'd recommend reviewing everything we did above, and seeing if you can replicate all of this on your end without me. This is a great learning experience with traceback reading, conda, and git 🙂
Okay so final update on this:
We cannot work around this in q2-emperor, and the culprit lies here within emperor. However, a fix has been rolled out here so that the `difference` object is converted to a list before being passed into the `index` param. These changes won't be available until the next version of emperor is released, which (from this commit) looks to be slated for July 2024. So we'll have to wait until we can upgrade to this version of emperor before this PR will pass.
We'll leave this open and marked as blocked until we upgrade to the next version of emperor.
@lizgehret thanks for tracking this down. I just noticed I never pushed a release to GitHub when I published 1.0.4 to PyPI (oops!). This is fixed, and I've pushed a pull request to q2-emperor:
Thanks @ElDeveloper! We'll hold off merging that PR until emperor 1.0.4 is available on conda - you're expecting that to be sometime in July of this year?
@lizgehret Better than that, 1.0.4 has been available via conda-forge for the past 6 months: https://anaconda.org/conda-forge/emperor/files
🎉 🐧
@ElDeveloper oh wow, that didn't come up when I searched for available emperor builds on anaconda - in any case, this is great! Thanks! I'll get this PR merged 🙂
EDIT: Okay so additional follow up re: emperor 1.0.4 - I created a 2023.9 amplicon environment with @ElDeveloper's PR branch installed locally (which updates emperor within q2-emperor's recipe to 1.0.4). Conda can solve the environment, so this should be compatible with the amplicon distro. q2-emperor's tests all pass locally with this version, and all tests pass in q2-diversity as well (the only other plugin with an indirect dependency on emperor via q2-emperor). The functionality in this PR works as expected with the updated version of emperor. We should be good to merge the version pin update PR as well as this one!
Okay @hagenjp, I've gone ahead and merged the version pin update PR in q2-emperor. So remaining to-do's here are:
- The `ignore_missing_samples` parameter is not recognized. This parameter is missing somewhere within the `core-metrics-phylogenetic` pipeline - take a look in there and add that where it's needed 🙂 Make sure to pull down and install the latest version of q2-emperor with the updated pin in your development environment, and `conda install -c conda-forge emperor=1.0.4` in that same environment. You should then be able to effectively test all of this locally! Lmk if you have any questions 🙂
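In case it helps while hunting for the missing parameter, here is a plain-Python sketch of the general failure mode. It is not q2-diversity's code: the function names, the `"no metadata"` placeholder, and the behavior of the inner step are all made up for illustration. The point is only that a wrapper has to both declare a keyword and forward it, otherwise callers get an "unexpected keyword argument" style failure.

```python
def inner_step(samples, metadata, ignore_missing_samples=False):
    """Hypothetical inner step that already understands the flag."""
    missing = sorted(set(samples) - set(metadata))
    if missing and not ignore_missing_samples:
        raise ValueError(f"samples missing from metadata: {missing}")
    # Placeholder text is invented for this sketch.
    return {s: metadata.get(s, "no metadata") for s in samples}


def wrapper_without_flag(samples, metadata):
    """Wrapper that never declares the flag, so callers cannot pass it."""
    return inner_step(samples, metadata)


def wrapper_with_flag(samples, metadata, ignore_missing_samples=False):
    """Wrapper that declares the flag and forwards it to the inner step."""
    return inner_step(
        samples, metadata, ignore_missing_samples=ignore_missing_samples
    )
```

Passing `ignore_missing_samples=True` to `wrapper_without_flag` raises a `TypeError`, while `wrapper_with_flag` accepts it and hands it through. In a real pipeline the parameter would presumably also need to be declared wherever the pipeline's parameters are registered, which is an assumption on my part to verify against the plugin code.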
@lizgehret If this passes after the build on Sunday it will be ready for your review! Thanks :)
@lizgehret just waiting on CI but this is ready for your review