nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0

Nextflow installation improvements #4380

Open benjaminbrumbaugh opened 1 year ago

benjaminbrumbaugh commented 1 year ago

Five years ago there was a closed bug that was asking for Nextflow to be incorporated into Homebrew. After watching my team of mixed skillsets, scientists and engineers, struggle through a series of installation issues, I think it might be worthwhile to revisit this.

Nextflow has dependencies, and some of them are unusual like specific implementations of OpenJDK. A package manager will manage these dependencies, giving them executable access, and linking them.

The current installation strategy is to use wget (not installed by default on macOS) to download a bash script with suppressed output, which runs a Nextflow installer script with debug messaging turned off. When this runs, the console hangs, and because of underlying bugs in other dependencies (Capsule), it may never return or print anything. I think this is not the right installation strategy for Nextflow anymore. Folks that tried using SDKMAN! (also installed with arbitrary downloaded code execution) ran into linking issues where their PATH didn't have Java linked up or JAVA_HOME set.

I think `brew install nextflow`, `yum install nextflow`, and `apt-get install nextflow` would all be very useful depending on the platforms folks are working on. If that's too much, the folks on non-Mac Unix variants probably have the required skillset to work around the issues they are likely to run into, but scientists working on MacBooks are less likely to be able to navigate this. So if just Homebrew was supported, I think that would make a huge difference.

Thank you, Ben

bentsherman commented 1 year ago

We are looking into ways to improve the installation process, see #2028 and #2951

benjaminbrumbaugh commented 1 year ago

Thank you @bentsherman. Is it possible to keep this perspective rather than closing this as a duplicate?

bentsherman commented 1 year ago

Sure

pditommaso commented 1 year ago

Adding Nextflow to Homebrew should be relatively easy. We would welcome this if anybody in the community is willing to maintain it.

stevekm commented 1 year ago

fwiw currently I use conda for this, my general purpose Nextflow conda env recipe is here, but I also wish the process could be a little easier. At least in the latest conda Nextflow package nextflow=23.04.1 it seems like the JDK issues might be resolved enough for it to be more reliable than previous conda packages might have been

also worth noting that there are docs https://www.nextflow.io/docs/latest/getstarted.html#requirements about how to install Nextflow best, but honestly I am not sure how realistic it is to expect users to install yet another package manager (sdk) just for the sake of installing Nextflow.

benjaminbrumbaugh commented 1 year ago

SDKMAN! Was much disappoint. It's a tenuous ask if it were awesome, but I'd be won over. It's not that awesome though (currently, anyway).

Conda is an interesting thought. The issue there is that Conda never made the transition from a python package manager to the python package manager. The PIP venv thing has not been super great, and containerization somewhat ate its lunch on top of that. Last time I did a Conda project, sometime early last year, I remember being kind of annoyed by it. Perhaps one of the sub-variants is more streamlined. I can't recall what it was that stopped it from being everything I hoped for, but that doesn't really matter. Bundling with Conda puts Nextflow into a position of forcing another package manager installation, which might be better to avoid. It also makes Nextflow opinionated on python package management and makes it harder to integrate with some existing python projects, including existing Nextflow projects.

It's a good thought and I'm not saying it's the wrong direction, but it has its downsides.

bentsherman commented 1 year ago

What issues have you had with sdkman? It's very lightweight, and in my experience it's the most reliable way to install java across platforms. I would use the nextflow conda package since I already use conda, but a lot of people have issues with the java distribution in conda.

benjaminbrumbaugh commented 1 year ago

I wish I had better feedback. I didn't actually run into any issues with SDKMAN! on my machine. I pulled it up just now and it still is working fine for me. I can't quite recall what issues others were running into. I remember folks having partially installed versions of Java without a $JAVA_HOME or java --version being correct. It's possible folks skipped the source command instruction that followed the installation and then ran into issues.
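For anyone trying to narrow down that kind of half-installed state, here is a minimal sketch of a check for the two symptoms mentioned above: `java` missing from PATH, and `JAVA_HOME` unset or pointing somewhere without a `bin/java`. The function name `diagnose_java` is made up for illustration, and the `bin/java` layout assumes a Unix-like JDK install; it is not part of Nextflow or SDKMAN!.

```python
import os
import shutil


def diagnose_java(env=None, which=shutil.which):
    """Return a list of human-readable problems with the Java setup.

    `env` and `which` are injectable so the check can be exercised
    without touching the real environment (hypothetical helper).
    """
    env = os.environ if env is None else env
    problems = []

    # Symptom 1: the shell cannot find a `java` executable at all.
    if which("java") is None:
        problems.append("java is not on PATH")

    # Symptom 2: JAVA_HOME is missing, or does not contain bin/java
    # (assumes a Unix-like JDK layout).
    java_home = env.get("JAVA_HOME")
    if not java_home:
        problems.append("JAVA_HOME is not set")
    elif not os.path.isfile(os.path.join(java_home, "bin", "java")):
        problems.append(f"JAVA_HOME={java_home} has no bin/java")

    return problems


if __name__ == "__main__":
    issues = diagnose_java()
    if issues:
        for issue in issues:
            print("problem:", issue)
    else:
        print("java is on PATH and JAVA_HOME looks consistent")
```

If this reports problems right after an SDKMAN! install, opening a new shell (or running the source command the installer prints) and re-running it would be the first thing to try.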

However, I don't love the wget arbitrary web code execution thing. Maybe with a hash validation it could be better, though it's not an uncommon pattern for package managers. Some folks are currently running particularly persnickety Linux distros, like CentOS, and perhaps they ran into issues. CentOS users always run into issues, and I don't think Nextflow should be overly concerned with that distro. If you run CentOS (please don't) you're constantly working around these kinds of things and you probably have built the chops to resolve the issues you'll have.
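To make the hash validation idea concrete, here is a rough sketch of downloading an installer and refusing to use it unless its SHA-256 matches a known value. It assumes the expected checksum is published through some separate, trusted channel; this thread does not say whether one exists for the installer script, so the URL and checksum are left as parameters. `verify_sha256` and `fetch_verified` are hypothetical names, not an existing tool.

```python
import hashlib
import hmac
import urllib.request


def verify_sha256(payload: bytes, expected_hex: str) -> bool:
    """Return True only if the SHA-256 of `payload` matches `expected_hex`."""
    actual = hashlib.sha256(payload).hexdigest()
    # Normalize whitespace and case, then use a constant-time comparison.
    return hmac.compare_digest(actual, expected_hex.strip().lower())


def fetch_verified(url: str, expected_hex: str) -> bytes:
    """Download `url` and return its bytes only if the checksum matches.

    Both arguments are supplied by the caller; nothing here is specific
    to any real installer.
    """
    with urllib.request.urlopen(url) as response:
        payload = response.read()
    if not verify_sha256(payload, expected_hex):
        raise ValueError("checksum mismatch; refusing to run installer")
    return payload
```

The point of the design is that the script is saved and checked before anything is executed, rather than being piped straight into bash.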

I do find the theme grating, and I'm worried that means I'm becoming an old man. I've been out of the Java world for a couple of years, and given the splintering of Java into a ton of distros after Oracle started the squeeze (as is their business model; do not get in bed with Oracle ever. Ever.), maybe my opinion on not needing a specialized package manager is outdated. Am I an old man shouting into the wind? No. It's the children who are wrong.

stale[bot] commented 8 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

aiqc commented 3 days ago

The most important thing is that the tutorial install process is done right in a simple way for researchers.

I think https://github.com/nextflow-io/nextflow/issues/2028 has the right approach to overcome the need for figuring out nextflow's $PATH and the separate java installation.

Advanced users will figure out how to manage environments on their own. However, I would recommend adding a paragraph to the installation documentation that makes them aware that Nextflow is officially distributed on docker and bioconda.

### How should Nextflow be installed?

There is a precedent for installing Docker and Conda at the system-level. Why? Because you use the same version of these central utilities with all of your projects. If you try to coerce Docker into something like homebrew, then I guarantee that you will run into complications within 1-2 years. What about Conda? Although conda could make sense for a binary like nextflow, [and nextflow integrates with conda](https://www.nextflow.io/docs/latest/conda.html), adding conda installation and activation to the docs is too complicated.

### [Without jpackage] how should Java be installed?

If SDKMAN! is the go-to java manager and it can be accomplished with a one-liner, then that's what should be used. For example, I wouldn't recommend installing Python via homebrew, I would recommend conda. There may be issues with the java version that is used by different bioinformatics tools (e.g. [GATK](https://gatk.broadinstitute.org/hc/en-us/articles/360035889531-What-are-the-requirements-for-running-GATK)), but those would likely be dockerized by the time they are incorporated into a production workflow. I wouldn't be comfortable relying on conda to install java.

### Should the dependencies be bundled?

jpackage sounds like an exciting solution. Could you point people toward the bundled language and tool with [Docker](https://hub.docker.com/r/nextflow/nextflow)? Sure, but, again, then you complicate the install process by including Docker. More importantly, you will need to coach researchers on how to map volumes, ports, etc. There is no database to manage. Where bundling starts to become more attractive is nf-tower, but that's beyond the scope of the tutorial docs.