prefix-dev / pixi

Package management made easy
https://pixi.sh
BSD 3-Clause "New" or "Revised" License
3.23k stars 178 forks

Add support for a common environment with curated, centrally managed dependencies shared on a multi-user system #949

Open whlavina opened 7 months ago

whlavina commented 7 months ago

Problem description

I work in Bioinformatics at the federal government, where we need to support a large staff of scientific researchers who use High-Performance Computing (HPC) in a regulatory setting with Application Security (AppSec) constraints. I think Pixi is very close to providing a solution that could replace existing tools and practices for making scientific software available to the user base.

I would love to see support for a common environment with curated, centrally managed dependencies shared on a multi-user system. The requirements as I see them in our organization:

  1. Essential: Ability for a system administrator to define a restricted set of software versions, such that resolution of dependencies by users will only consider the curated set of versions, so that AppSec policies of software review are honored.
  2. Desirable: Ability for a system administrator to install a set of software versions in a shared network location, so that disk space is conserved across the large staff (hundreds of users).
  3. Nice to have: Ability for Pixi to support modules (applications, perhaps also libraries) pre-installed via other means, such as an OS package manager, such that a dependency expressed by a user may be satisfied by the existing software installation, so that one may benefit from declarative dependencies in projects while allowing administrators to customize how some software is built and installed, outside of the Conda ecosystem.
  4. Essential: Ability for a system administrator to upgrade the shared environment and have those updated packages reflected in all environments that use it, so that the maintenance effort of staying current for AppSec compliance can be the responsibility of the system administrators, reducing the burden on the main users.
  5. Nice to have: Ability for workgroups other than the system administrators to provide all of the above, so that one may define team-level dependencies or shared dependencies across a pool of related projects.
  6. Essential: Ensuring that stale software versions are promptly evicted from any disk caches, so that AppSec filesystem security scans are not triggered. This requirement is automatically addressed for the packages from the common environment, if (2) and (3) are supported, since the packages would be installed on the shared disk storage.
  7. Desirable: System logging that includes at least the creation of new environments, so that AppSec may monitor software installations.
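To make requirement (1) concrete, here is a minimal sketch in Python of what "only consider the curated set of versions" could mean. It is not how Pixi resolves dependencies today; the allowlist structure, the package name, the version strings, and the function name are all hypothetical and used purely for illustration.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical allowlist maintained by the system administrator:
# package name -> set of reviewed (AppSec-approved) version strings.
CURATED: Dict[str, Set[str]] = {
    "example-aligner": {"1.19", "1.20"},
}


def filter_candidates(
    name: str,
    candidates: Iterable[str],
    curated: Dict[str, Set[str]] = CURATED,
) -> List[str]:
    """Keep only the candidate versions that appear in the curated set.

    A package missing from the allowlist has no approved versions, so
    every candidate is rejected and the result is an empty list.
    """
    allowed = curated.get(name, set())
    return [version for version in candidates if version in allowed]


if __name__ == "__main__":
    # Versions offered upstream (illustrative), before curation is applied.
    upstream = ["1.18", "1.19", "1.20", "1.21"]
    print(filter_candidates("example-aligner", upstream))
```

A solver would then choose only among the filtered candidates, so a user's request for an unreviewed version would fail to resolve instead of silently installing it.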

Giving some ideas for implementation:

I encourage looking at the solutions offered by these 2 tools, common in HPC environments:

For another take on the use case, see:

ruben-arts commented 7 months ago

Hi @whlavina,

Thank you for this informative write-up. We've noticed more interest from the HPC community, which we love.

These are some really organization-specific requests, and as mentioned, we are looking for partners to design with. If you would be open to a call, we would happily discuss a partnership where we could help each other. Email us at hi@prefix.dev, or join our Discord and send me a personal message.

whlavina commented 7 months ago

Thank you for the prompt and optimistic reply! I hope you don't mind that I edited my comment to add an item (7) for logging. :-)

Let me discuss within our organization a little more before the next step of reaching out to your team: I opened this issue given clear blockers that I saw for our adoption of Pixi, and your reply will help me promote the idea and gain support.

ruben-arts commented 7 months ago

Great, I responded to (7) in my initial reply for completeness. Looking forward to your next steps ;). If you need any more material or answers to questions that come up, let us know!

chebee7i commented 4 months ago

Hi, another related use case. We generally provide centrally managed environments which 99% of users use by default (stored on a shared filesystem).

pixi seems to make strong assumptions about the conda environment living inside of a project, which is not how we prefer to install our environments. Is there any way that we can use pixi, or are we stuck on mamba? Is there a way to "activate" one of the environments, so that stuff just works without having to do pixi run?

tdejager commented 4 months ago

> Hi, another related use case. We generally provide centrally managed environments which 99% of users use by default (stored on a shared filesystem).
>
> pixi seems to make strong assumptions about the conda environment living inside of a project, which is not how we prefer to install our environments. Is there any way that we can use pixi, or are we stuck on mamba? Is there a way to "activate" one of the environments, so that stuff just works without having to do pixi run?

I'm not sure about the use case exactly, but given a pixi.toml, you can activate a shell using pixi shell or pixi shell -e <env>. When you navigate to other directories, even ones containing another pixi.toml, the original should be respected. This even works for tasks, although if you are using an activated shell in a directory with a different pixi manifest, you will get a warning.

It's simple to add a command that uses this feature with --manifest-path to get the kind of activation that you are used to. However, we do want to support a different workflow with pixi, so if you want the more vanilla conda experience, conda or mamba are your best bet!
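As a rough sketch of such a wrapper, the Python helper below only builds the pixi shell command line for a centrally managed manifest and does not run it. The shared manifest path and the function name are hypothetical. The only pixi options used are --manifest-path and -e, both mentioned above.

```python
import os
import shlex
from typing import List, Optional

# Hypothetical location of an administrator-managed manifest on shared storage.
SHARED_MANIFEST = "/shared/envs/base/pixi.toml"


def central_shell_command(
    manifest: str = SHARED_MANIFEST,
    environment: Optional[str] = None,
) -> List[str]:
    """Build the argv for `pixi shell` pointed at a central manifest.

    The command is returned as a list and is not executed here, so the
    caller decides how and when to start the shell.
    """
    argv = ["pixi", "shell", "--manifest-path", os.fspath(manifest)]
    if environment is not None:
        # Select a named environment from the manifest, as in `pixi shell -e <env>`.
        argv += ["-e", environment]
    return argv


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a terminal.
    print(shlex.join(central_shell_command(environment="default")))
```

A user-facing "activate" script could pass the returned list to os.execvp or subprocess.run. Whether this fits a given site depends on how the shared manifest and its environments are installed and who may write to them.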