hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Java driver supports multiple JVMs on same client #17966

Open rt232 opened 1 year ago

rt232 commented 1 year ago

Proposal

Java support was a big win when we selected the platform. Not having to containerise, and being able to run Java services as-is, meant it was easy to go from fixed VMs to flexible Nomad in one step. Java used to be slow-moving but has shifted to a more frequent release cycle over the years, and as a result we now have different versions running in the organisation: 8, 11 and some 17. The Java task driver takes a very simple approach: find a JVM, and that's the one for the client. So, to run our collection of different versions, we need one set of clients for 8, another for 11 and another for 17. As an enterprise, we need fault tolerance and spread across our data centre AZs, so that means a proliferation of clients, some of which run very little. With a large fleet of clients, running multiple versions this way isn't an issue, but for an organisation with a small client count, the Java driver's one-JVM-per-client behaviour causes cost escalation and maintenance overhead.

Use-cases

The Java driver therefore needs to support multiple JVMs so that:

apollo13 commented 1 year ago

I guess this could also be solved by being able to deploy multiple java drivers like suggested for docker: https://github.com/hashicorp/nomad/issues/6554#issuecomment-1050906269

schmichael commented 1 year ago

What @apollo13 mentioned does seem like a good fit for this use case as well. Unfortunately, the scheduler does not have visibility into the task.config block, so solutions using that would require users to manually add constraints to Java jobs to select the right version, based on new fingerprinting logic in the Client.

While having to manually define multiple Java plugins is also a bit of work, hopefully it can be done when installing the JVM itself (probably as part of a golden image or config management run). So while it's extra work, it at least aligns with work you're already doing (installing the JVM).

This approach could have beneficial side effects as well: perhaps you want to enable an optional feature in a new jvm release. This could be modeled as a plugin config parameter like:

plugin "openjdk18" {
  backend = "java"
  config {
    jvm_path    = "/opt/openjdk18/bin/java"
    jvm_options = ["-Dfile.encoding=COMPAT"]
  }
}

plugin "java" {
  config {
    jvm_path = "/usr/bin/java"
  }
}

^ original 2023 post

2024-02-16 Update: Internally I've been referring to this as "plugin aliasing", so I wanted to mention the words alias and aliasing here so it shows up in my GitHub searches. :sweat_smile: The functionality is the same, but I think I like this terminology better:

plugin "java" {
  alias = "openjdk18"
  config {
    jvm_path = "/opt/openjdk18/bin/java"
  }
}

plugin "java" {
  alias = "jre"
  config {
    jvm_path = "/opt/jre_8u401/bin/java"
  }
}

This maintains the existing meaning of the plugin block and only adds an alias parameter for renaming it.
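
For illustration, a job would then presumably pick a JVM by naming the alias as its task driver. This is only a sketch of the proposal: alias and jvm_path are proposed parameters rather than existing Nomad options, and the job name, datacenter, and jar path are made up. jar_path and jvm_options are existing Java driver task options.

```hcl
# Hypothetical job spec: assumes the proposed "openjdk18" alias is defined
# in the client's agent configuration as shown above.
job "billing-api" {
  datacenters = ["dc1"]

  group "app" {
    task "server" {
      # Refers to the alias, not the built-in "java" driver name.
      driver = "openjdk18"

      config {
        # Assumes the jar was fetched into local/ by an artifact block,
        # omitted here for brevity.
        jar_path    = "local/billing-api.jar"
        jvm_options = ["-Xmx512m"]
      }
    }
  }
}
```

If aliases behave like regular driver names, the scheduler could place this task only on clients that expose a driver called openjdk18, without any manual constraint in the job.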

LWW (last write wins) resolution for multiple drivers with the same name or alias seems appropriate, assuming built-in plugins are loaded first.

That would allow plugin "podman" { alias = "docker" } to automatically use the podman driver instead of docker without having to change any jobspec.

That sort of transparent switch would not support existing local tasks, so it would only be valid on new or drained Client agents. I think that's a reasonable restriction that could be easily documented.
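
As a rough sketch of the last-write-wins idea described above, consider two blocks that claim the same alias. Both alias and jvm_path are proposed parameters, not existing Nomad options, and the second path is invented for the example.

```hcl
# Hypothetical agent configuration: two plugin blocks claim the alias "jre".
plugin "java" {
  alias = "jre"
  config {
    jvm_path = "/opt/jre_8u401/bin/java"
  }
}

# Under last-write-wins, this later block would replace the one above,
# so tasks using driver = "jre" would run on this JVM.
plugin "java" {
  alias = "jre"
  config {
    jvm_path = "/opt/jre_8u411/bin/java" # hypothetical path
  }
}
```

The same rule would let an aliased plugin shadow a built-in driver of the same name, provided built-in plugins are registered before the plugin blocks are processed.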

the-maldridge commented 1 year ago

Just another data point: the inability to select between multiple concurrent JVMs on a machine is the primary impediment to my organization adopting Nomad.

116davinder commented 9 months ago

This would also be nice for my org, since we run a lot of Java-based big data applications. Currently, we use constraints in the job spec to run different applications on different machines with different JVMs. It would be great to have this feature.

This seems like a similar request: https://github.com/hashicorp/nomad/issues/12941
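
For reference, the constraint-based workaround mentioned above might look roughly like the following. It relies on the driver.java.version attribute that the Java driver fingerprints on each client. The job name, datacenter, jar path, and regular expression are illustrative assumptions, and the exact version string depends on the installed JVM.

```hcl
# Sketch of today's workaround: one JVM per client, selected by constraint.
job "spark-etl" {
  datacenters = ["dc1"]

  group "app" {
    # Only place this group on clients whose fingerprinted JVM is Java 11.
    # Java 8 typically reports a version like 1.8.0_xxx, so a matching
    # pattern for it would be "^1\\.8\\." instead.
    constraint {
      attribute = "${driver.java.version}"
      operator  = "regexp"
      value     = "^11\\."
    }

    task "etl" {
      driver = "java"

      config {
        # Assumes the jar was fetched into local/ by an artifact block,
        # omitted here for brevity.
        jar_path = "local/spark-etl.jar"
      }
    }
  }
}
```

This works, but it still requires a separate pool of clients per JVM version, which is the overhead this issue asks to remove.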