cheeseman-lab / OpticalPooledScreens

Examples and code for pooled optical screens
MIT License

Environment Not Easily Reproducible on Other Machines #5

Open roshankern opened 1 week ago

roshankern commented 1 week ago

We currently use the packages specified in requirements.txt to create the virtual environment for running all OPS code. This works on the Whitehead Institute (WI) HPC, which already has some packages and programs initialized, but I have been unable to easily recreate and use this virtual environment on two other machines outside of the WI HPC.
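One way to see where an environment on another machine diverges is to compare the installed package versions against requirements.txt. The following is a minimal sketch, not code from this repo. It assumes requirements.txt contains plain `name==version` pins (extras and URL requirements are skipped or handled only loosely), and the function names `parse_pins`, `installed_version`, `find_mismatches`, and `report` are hypothetical.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Return {package: version} for lines of the form 'name==version'."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0]  # drop trailing comments
        line = line.split(";", 1)[0].strip()  # drop environment markers
        if "==" not in line:
            continue  # skip blank lines, unpinned names, and options like '-r'
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name):
    """Look up the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {package: (pinned, installed)} for every pin that is not met."""
    mismatches = {}
    for name, pinned in pins.items():
        found = lookup(name)
        if found != pinned:
            mismatches[name] = (pinned, found)
    return mismatches


def report(path="requirements.txt"):
    """Print one line per package whose installed version differs from its pin."""
    with open(path) as handle:
        pins = parse_pins(handle.read())
    for name, (pinned, found) in sorted(find_mismatches(pins).items()):
        print(f"{name}: pinned {pinned}, installed {found or 'missing'}")
```

Calling `report()` from the repo root inside the activated environment lists the packages that are missing or at a different version. It does not detect system-level programs that the WI HPC provides outside of pip.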

To make this software usable and developable on other machines, we should consider either updating the venv or moving to a different environment system that makes setup and use of the environment easy elsewhere.

Let's use this thread to track our approach to this problem!

roshankern commented 1 week ago

This blog post discusses when one would benefit from using different package management solutions.

Conclusion

Choosing the right environment management tool depends on your needs. If you need a simple, easy-to-use tool, venv might be the best choice. If you’re dealing with complex dependencies, Conda env is the way to go. If you need to switch between different Python versions, consider pyenv or virtualenv.

The best tool is the one that fits your project’s needs and makes your development process smoother. Experiment with these tools and choose the one that suits you best.

Given that this project will likely involve complex dependencies, conda seems like a good option for our environment management. I was able to create an environment file for conda that makes this repo usable on other machines in https://github.com/cheeseman-lab/OpticalPooledScreens/pull/4.

roshankern commented 1 week ago

The Snakemake documentation also mentions that conda is the recommended way to install Snakemake, because it also enables Snakemake to handle software dependencies of the workflow. This would be particularly helpful if we have multiple workflows that require different environments.

roshankern commented 1 week ago

Added an ops_dev_requirements.yaml file in #4 that I was able to use to run the 1_sbs example code on a machine with a fresh conda install. This environment will live in the dev branch and be updated as we refactor code. Hopefully, the work to migrate the code to Python 3.11 is minimal and changing package versions does not cause major issues.
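For anyone trying this on a new machine, the setup could be scripted roughly as follows. This is a sketch under assumptions, not part of #4: the environment name `ops_dev` and the helper names are hypothetical, and it assumes `conda` is on the PATH and that the environment targets Python 3.11 as mentioned above.

```python
import subprocess
import sys

ENV_FILE = "ops_dev_requirements.yaml"  # file name taken from this thread
ENV_NAME = "ops_dev"  # hypothetical environment name, not from the repo


def conda_create_command(env_file=ENV_FILE, env_name=ENV_NAME):
    """Build the `conda env create` call as an argument list."""
    return ["conda", "env", "create", "--file", env_file, "--name", env_name]


def python_matches(required=(3, 11), current=None):
    """Return True if the interpreter has the required major.minor version."""
    if current is None:
        current = sys.version_info[:2]
    return tuple(current) == tuple(required)


def create_env():
    """Run conda; raises CalledProcessError if the solve or install fails."""
    subprocess.run(conda_create_command(), check=True)
```

`create_env()` would be run once from the repo root. `python_matches()` can then be called from inside the activated environment as a quick check that the interpreter is the expected 3.11.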