bokulich-lab / q2-moshpit

MOdular SHotgun metagenome Pipelines with Integrated provenance Tracking
BSD 3-Clause "New" or "Revised" License
3 stars 11 forks source link

MAG workflow requires one or more missing actions to generate a DistanceMatrix #87

Open colinvwood opened 8 months ago

colinvwood commented 8 months ago

Currently, the dereplicate-mags action needs a distance matrix to use when dereplicating, but there is no action available in the environment to generate one from SampleData[MAGs]. Thus everything downstream of this action (classification and feature table creation) is unreachable.

@gregcaporaso suggested using actions in the sourmash plugin to generate the distance matrix.

Thus either sourmash needs to be included in the shotgun environment, or we need to borrow the necessary actions and put them here.

gregcaporaso commented 8 months ago

I'm in favor of adding q2-sourmash to the shotgun distro. What do others think?

nbokulich commented 8 months ago

I would be in favor, but @misialq what has been your experience — are there any dependency issues or other issues we should be aware of?

Another issue is that q2-sourmash is not "owned" by us so we have little control and would run into issues if any dependency conflicts or bugs emerge later on. @gregcaporaso did you ever hear back from Titus about this?

btw, long-term the goal is to also add other actions/plugins that can generate such a distance matrix, e.g., with ANI or other methods. So sourmash would not be a strict dependency here (though it effectively is right now).

misialq commented 8 months ago

I support the idea of adding sourmash to the distro - the only reason why I haven't done it is that it doesn't belong to us, as @nbokulich pointed out.

And yes, we would probably add more actions that can generate those + hopefully wrap some existing tools which are used for dereplication (needs review as I remember there were some dependency issues when I tried that originally).

gregcaporaso commented 8 months ago

Update: @ctb added @nbokulich and I as Owners on q2-sourmash, so we should be all set to integrate in the shotgun distro. Let me know what you need to move this forward!

misialq commented 3 months ago

Hey @gregcaporaso, I'm guessing we can proceed with including this in the distro? Can you guys check what is required for that to happen? (I'm assuming the CI on q2-sourmash's side will need an update? )

gregcaporaso commented 3 months ago

@misialq, we should be able to do that for 2024.5 - checking with @ebolyen on that now.

lizgehret commented 1 month ago

hey circling back on this since i'm closing out the 2024.5 project board - should we plan on adding this for 2024.10? cc @gregcaporaso @misialq

gregcaporaso commented 1 month ago

Ah, I lost track of this, sorry! Yes, we should put it on 2024.10, but get it done ASAP so it's in the dev builds. @lizgehret, can you tell me what needs to be done to get this into the distro?

lizgehret commented 1 month ago

The process for including q2-sourmash in the metagenome distro prepares is straightforward (just adding it into the data.yaml file in distributions), we'll just have to see if there are any dependency conflicts that arise from adding it. I can add this next week (after we've run through the first round of prepares for 2024.10.dev0) and keep everyone posted on the process.

lizgehret commented 1 month ago

Ah, so I already see something that will be problematic - in sourmash's recipe file, the python version is pinned at 3.5*. This will need to be updated to a min-pin so that it will be compatible with python 3.9.

cc: @misialq @nbokulich

ctb commented 1 month ago

sourmash v4.8.8 works for python3.10 and up, so maybe it's just a matter of updating the plugin?

lizgehret commented 1 month ago

PR cut in sourmash to update this to a min pin: https://github.com/dib-lab/q2-sourmash/pull/9

gregcaporaso commented 1 month ago

Thanks @ctb and @lizgehret! @ctb, unless you want someone on your end to do it, I'll review that and the other open PR. I'm also going to update the install instructions and automated testing in another PR.

ctb commented 1 month ago

I've got plenty to do 😆 but tag me in to any specific PR if you want input.