nextstrain / public

repo to house broad planning issues and other cross-repo concerns; see also https://github.com/nextstrain/private

Remove pathogen-specific tools from base runtimes #7

Open victorlin opened 3 days ago

victorlin commented 3 days ago

This applies to docker-base and conda-base.

Context

Our base image has accumulated various pathogen-specific tools over time, some of which contribute significantly to build time and image size. By removing these pathogen-specific tools, we can ensure the base image/environment reflects a continually updated version of Nextstrain tools and their dependencies. Using fauna as an example, more detailed reasoning is in https://github.com/nextstrain/fauna/issues/170.

Candidates

[!NOTE] This seems like the right move for Fauna, but I'm not sure how far we want to take it. As we expand the number of core pathogens that rely on runtimes, the common base will only get smaller.

Tasks

For each pathogen/project that relies on tools that may be removed, create and use a custom runtime that installs the tools. Right now the process may be more involved than it should be, and we should provide a good path for extending the base runtimes (examples: docker, conda).
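As a rough illustration of what extending the Docker base runtime could look like, here is a minimal sketch in Python that renders a Dockerfile layering extra pip packages on top of the base image. The helper name `render_dockerfile` and the package name are hypothetical, and it assumes the base image is `nextstrain/base` with `pip3` available; it is not an existing Nextstrain interface.

```python
def render_dockerfile(base_image: str, pip_packages: list[str]) -> str:
    """Render Dockerfile text that adds pip packages on top of a base image."""
    lines = [f"FROM {base_image}"]
    if pip_packages:
        # Sort for a stable layer definition.
        # Assumption: pip3 is on PATH inside the base image.
        lines.append(
            "RUN pip3 install --no-cache-dir " + " ".join(sorted(pip_packages))
        )
    return "\n".join(lines) + "\n"


# "example-pathogen-tool" is a hypothetical package name.
dockerfile = render_dockerfile("nextstrain/base", ["example-pathogen-tool"])
print(dockerfile, end="")
```

Because the pathogen-specific image starts `FROM` the shared base, the base layers would only need to be stored once on disk even when several pathogen images are installed.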

huddlej commented 2 days ago

Generally, I would love to have a way to use pathogen-specific Docker images in our workflows! That's been my dream since we added pango-learn to the base image back in the early pandemic.

For the specific candidates you mentioned for removal, I can make some specific notes:

I would also recommend removing the pango-learn package and its binary dependencies, gofasta and minimap2, since all of our pango annotations come from Nextclade now.

jameshadfield commented 2 days ago

For each pathogen/project that relies on tools that may be removed, create and use a custom runtime that installs the tools. Right now the process may be more involved than it should be, and we should provide a good path for extending the base runtimes (examples: docker, conda).

This is the other half of workflows as programs, namely "the artifacts/bundling (keyword: buildpacks) side of things", no?

(And yes, we should totally do this if it's at all feasible -- there are a number of times I haven't done something because I know it's going to be such a hassle / burden to make the needed dependency available to our runtimes.)

tsibley commented 2 days ago

This is the other half of workflows as programs, namely "the artifacts/bundling (keyword: buildpacks) side of things", no?

Yes, precisely. The whole idea there is that instead of having runtimes and pathogens separately, we have pathogens that are (or contain) their runtimes. We want to avoid having N pathogens and N×M pathogen-runtimes and making the user match them.

The implementation examples Victor gave (and things like ncov-ingest's image) are coming at this from what I'd call a more ad-hoc approach, and I do not think we should go down that path as a way to get to custom runtimes per pathogen. That way lies ecosystem fragmentation and incurs significant usability costs (to both users and developers, us and others).

There are lots of considerations in this work. For example, our runtimes are not small when installed on disk. We're going to want to be able to share a concrete, installed base across pathogens (not just a conceptual base).

We'll also want to consider the costs vs. benefits of moving something out of the base runtimes; it will have non-trivial overhead (both conceptual and actual) and we should only do it when it's worth it. I'm not convinced many of the candidates given above meet that threshold? What concretely are we gaining with the removal of each?

tsibley commented 2 days ago

@jameshadfield

there are a number of times I haven't done something because I know it's going to be such a hassle / burden to make the needed dependency available to our runtimes

Do you have examples? They would be very helpful both to guide eventual work on this topic and to suggest pain points we might be able to alleviate now with the current base runtimes.

jameshadfield commented 2 days ago

Do you have examples?

The one I was reminded of with this week's avian-flu work is https://github.com/nextstrain/avian-flu/issues/80. There have been a bunch of others along the lines of "can't use this pip dependency, not in our runtimes" but I managed to find an alternate solution so it wasn't a dealbreaker.