qiime2 / q2-diversity

BSD 3-Clause "New" or "Revised" License
4 stars 45 forks source link

Exposing unifrac.meta #292

Closed wasade closed 2 years ago

wasade commented 4 years ago

This pull request is dependent on qiime2/q2-diversity-lib#26.

The known TODO items are:

wasade commented 4 years ago

Brief note, I'm refactoring some of the beta_phylogenetic tests so they can also use this method. Hoping to push next commit tomorrow

wasade commented 4 years ago

Expecting to push a commit or two today

wasade commented 4 years ago

Specifying an alpha for generalized unifrac, to use with meta, is not currently supported due to biocore/unifrac#117. This is at the moment reflected in the test code, though it may be necessary to do something in the library.

This is a bug but I'm not sure if I can do a minor unifrac release quickly enough for this. Possibly... the bug is simple.

wasade commented 4 years ago

Just had another reasonable item pop up on unifrac for some performance related commits that aren't in conda, so will try to get that release out today

wasade commented 4 years ago

@thermokarst, I think the build is not getting qiime2/q2-diversity-lib#26? I may be looking at the failures incorrectly though

thermokarst commented 4 years ago

@thermokarst, I think the build is not getting qiime2/q2-diversity-lib#26?

That is correct! BW is churning away now, hopefully we'll have new env files in the next few hrs. Worth noting, DockerHub's new rate limiting scheme is putting a damper on BW right now, so we might have some CI downtime while that is addressed. 🤞

wasade commented 4 years ago

Rate limits were meant to be broken...? 🚀

ChrisKeefe commented 4 years ago

Super excited to see this getting added. Just wondering whether there's a reason we went with table and phylogeny instead of tables and phylogenies if users are likely to pass Lists of them. Same question applies to https://github.com/qiime2/q2-diversity-lib/pull/26/

ebolyen commented 4 years ago

I would also second @ChrisKeefe's point on plural for the inputs

wasade commented 4 years ago

For the inputs, the user is entering --i-table <path> --i-table <path> ... as I don't think there is a good way to describe multiple entries for a single argument right now. I think table makes sense here as other actions which take, for example, multiple metadata files I don't believe rely on the plural form

thermokarst commented 4 years ago

I agree with @ChrisKeefe & @ebolyen, I think tables and phylogenies would be more clear (and if we update, we should fix that over in q2-div-lib before the release ships).

--i-tables table1.qza table2.qza table3.qza and --i-phylogenies tree1.qza tree2.qza tree3.qza looks pretty nice to me.

ebolyen commented 4 years ago

You can actually do this now:

--i-tables <path1> <path2> <path3>

(I got tired of not being able to do data/* as input...)

Other list-likes use plural as well.

ebolyen commented 4 years ago

@thermokarst jinx!

wasade commented 4 years ago

Oh, nice!!! I think that's new from when we discussed a few months back? I agree much cleaner, and agree with using the plural form

thermokarst commented 4 years ago

if users are likely to pass Lists of them

I think that that is the point of the method, right @wasade? If that's the case, do we need to validate the length? Does it make sense to run only on a single table/tree pair?

thermokarst commented 4 years ago

I think that's new from when we discussed a few months back?

@wasade, this functionality has been around since mid-2019. I think what we were discussing a few months back was a syntax for making these pairs full-fledged tuples, explicitly saying "this table belongs with this tree."

wasade commented 4 years ago

@thermokarst, ah okay. I thought what I put in here was what was the recommendation from that chat :) but I must have misread. The "most correct" thing would be to validate that each tree/table in a common index position correspond to each other or if QIIME2 was able to ensure the paired relationships. Operating on a single tree/table is fine and is identifical to using regular unifrac

thermokarst commented 4 years ago

Discussed out of band with @wasade & @ebolyen, going to postpone this q2-diversity integration until a future release. In the meantime, the new unifac.meta functionality can be found in q2-diversity-lib (and will be published in the 2020.11 core distribution).

Some remaining questions to solve: do we expose this meta functionality in its own pipeline, do we expose it via the existing beta-phylogenetic pipeline, or do we only leave it exposed through q2-diversity-lib?

lizgehret commented 2 years ago

Hey @ebolyen circling back here - should we revisit this?

ebolyen commented 2 years ago

This one's definitely quite old. @wasade do you or your team have any preferences on the matter?

Rereading the thread, it seems like this is all already implemented in q2-div-lib, so this PR is mostly just copying it into the "diversity" namespace so to speak.

I'm inclined to close it, since it hasn't really come up since and it may make more sense to modify beta-phylogenetic at some point instead. Please feel free to re-open at any point :)

wasade commented 2 years ago

It would be very exciting for meta to be available to the user community. Is it now possible for people to use it? If not, then this should be reopened and we should work toward merge.

ebolyen commented 2 years ago

Yep! It's already available in the default install as qiime diversity-lib beta-phylogenetic-meta-passthrough. It's perhaps not ideally named, but it's definitely been around for a while now.

If we'd like it in this particular diversity plugin it may make more sense to put it in beta-phylogenetic outright, since that's probably what a user would expect. Although the inputs might get tricky to explain.

wasade commented 2 years ago

Is that easy to do?

ebolyen commented 2 years ago

My hunch is no, because we would need to account for the different input type signatures here, list vs singular.

I know we can't take a union of List[X] | X so either that changes, or typemap accidentally does let this work. But then we'd have to double check that the interfaces can tolerate that.

Alternatively, we make the breaking change for beta-phylogenetic to the input parameters and update various call-sites like core-metrics-phylogenetic.


What if we rename it in diversity-lib so that it doesn't have "passthrough", which isn't particularly illuminating for end-users? That would be fast and a breaking change that is unlikely to rile anyone up.

wasade commented 2 years ago

I'm in favor of easy :) We have students actively exploring this method right now directly via the unifrac API, and its historical precedence is impressive, so simplifying use is I think a bit + for the user base

ebolyen commented 2 years ago

Would one of them be able to create a PR to update the name (dropping 'passthrough') and fix this issue: https://github.com/qiime2/q2-diversity-lib/issues/30?

This could make a good first-PR for someone.

wasade commented 2 years ago

@ahdilmore, would you be interested in sorting this out? Basically, this would be to help make UniFrac's meta method more readily accessible via QIIME 2