mikelangelo-project / capstan

Capstan, a tool for packaging and running your application on OSv.
http://osv.io/capstan/

Recursive meta/run.yaml proposal #36

Closed · miha-plesko closed this 6 years ago

miha-plesko commented 7 years ago

With a recent update of the OSv core we are able to boot a unikernel with a runscript based on the current environment variables, i.e. the command is built dynamically when the runscript is invoked.

Below is an example of how the apache.spark package works:

meta/run.yaml in openjdk8-zulu-compact3-with-java-beans

runtime: native
config_set:
  java:
    bootcmd: /java.so -Xms$XMS -Xmx$XMX -cp $CLASSPATH $JVM_ARGS $MAIN $ARGS
    env:
      XMS: 512m
      XMX: 512m
      CLASSPATH: /
      JVM_ARGS: -Duser.dir=/
      MAIN: main.Hello
      ARGS:
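
For illustration (a sketch assuming plain shell-style substitution, with the empty ARGS simply dropped), the default env values above would expand the bootcmd template into:

/java.so -Xms512m -Xmx512m -cp / -Duser.dir=/ main.Hello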

meta/run.yaml in apache.spark

runtime: native
config_set:
  master:
    bootcmd: /java.so -Duser.dir=/spark -Xms$XMS -Xmx$XMX -cp /spark/conf:/spark/jars/* -Dscala.usejavacp=true org.apache.spark.deploy.master.Master --host $HOST --port $PORT --webui-port $UIPORT
    env:
      XMS: 512m
      XMX: 512m
      HOST: 0.0.0.0
      PORT: 7077
      UIPORT: 8080
  worker:
    bootcmd: /java.so -Duser.dir=/spark -Xms$XMS -Xmx$XMX -cp /spark/conf:/spark/jars/* -Dscala.usejavacp=true org.apache.spark.deploy.worker.Worker $MASTER
    env:
      XMS: 512m
      XMX: 512m
      MASTER: localhost:7077
config_set_default: worker

Notice how the bootcmd is actually a copy-paste from the openjdk8-zulu-compact3-with-java-beans package, only with different environment variables set for it. This duplication is what the proposal below aims to resolve.

meta/run.yaml in my package that uses apache.spark

One cannot make use of meta/run.yaml here, but simply runs the unikernel with:

$ capstan run --boot worker --env MASTER=172.16.122.3:7077
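
For illustration, assuming plain variable substitution at boot, this expands the worker bootcmd into something like:

/java.so -Duser.dir=/spark -Xms512m -Xmx512m -cp /spark/conf:/spark/jars/* -Dscala.usejavacp=true org.apache.spark.deploy.worker.Worker 172.16.122.3:7077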

Proposal

I suggest that we support recursive meta/run.yaml so that the user is able to make use of an arbitrary config_set from an arbitrary package and only provide environment variables for it. This would result in apache.spark turning into this:

meta/run.yaml in openjdk8-zulu-compact3-with-java-beans

runtime: native
config_set:
  java:
    bootcmd: /java.so -Xms$XMS -Xmx$XMX -cp $CLASSPATH $JVM_ARGS $MAIN $ARGS
    env:
      XMS: 512m
      XMX: 512m
      CLASSPATH: /
      JVM_ARGS: -Duser.dir=/
      MAIN: main.Hello
      ARGS:

(same as before)

meta/run.yaml in apache.spark

runtime: native
config_set:
  master:
    base: "openjdk8-zulu-compact3-with-java-beans:java"
    env:
      JVM_ARGS: -Duser.dir=/spark -Dscala.usejavacp=true
      MAIN: org.apache.spark.deploy.master.Master
      ARGS: --host $HOST --port $PORT --webui-port $UIPORT
      CLASSPATH: /spark/conf:/spark/jars/*
      HOST: 0.0.0.0
      PORT: 7077
      UIPORT: 8080
  worker:
    base: "openjdk8-zulu-compact3-with-java-beans:java"
    env:
      JVM_ARGS: -Duser.dir=/spark -Dscala.usejavacp=true
      MAIN: org.apache.spark.deploy.worker.Worker
      ARGS: $MASTER
      CLASSPATH: /spark/conf:/spark/jars/*
      MASTER: localhost:7077
config_set_default: worker

Notice how we introduce base: "<package>:<config_set>" and then only contextualize that specific config_set with our own environment variables.
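
To make the intended semantics concrete: I would expect the resolver to take the bootcmd template from openjdk8-zulu-compact3-with-java-beans:java, merge the env maps with the referencing package taking precedence, and expand nested variables (ARGS itself references $HOST, $PORT, $UIPORT) in a second pass. For master this should yield a command equivalent to the original hand-written bootcmd:

/java.so -Xms512m -Xmx512m -cp /spark/conf:/spark/jars/* -Duser.dir=/spark -Dscala.usejavacp=true org.apache.spark.deploy.master.Master --host 0.0.0.0 --port 7077 --webui-port 8080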

meta/run.yaml in my package that uses apache.spark

runtime: native
config_set:
  master:
    base: "apache.spark:master"
    env:
      PORT: 9000
      UIPORT: 9001
  worker:
    base: "apache.spark:worker"
    env:
      MASTER: 172.16.122.3:7077
config_set_default: worker

Notice how we are now able to use apache.spark on our custom ports 9000 and 9001 by default:

$ capstan run demo --boot master  # will run on 9000 and 9001

or specify them ourselves (this is already supported, so nothing surprising):

$ capstan run demo --boot master --env PORT=9998 --env UIPORT=9999
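
Under the same merge semantics as above (each level only overriding env values), the first command would resolve through apache.spark:master down to the java base and boot something like:

/java.so -Xms512m -Xmx512m -cp /spark/conf:/spark/jars/* -Duser.dir=/spark -Dscala.usejavacp=true org.apache.spark.deploy.master.Master --host 0.0.0.0 --port 9000 --webui-port 9001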

gberginc commented 7 years ago

One cannot make use of meta/run.yaml here, but simply runs the unikernel with:

What does this mean?

gberginc commented 7 years ago

Generally, I like the proposal. In a way it removes the necessity for the different runtime types (java, node, native), because the base packages can provide the template for the commands.

One thing that slightly worries me is env vars coming from different packages. For example, I can imagine, e.g. for debugging, that one would compose a unikernel with Spark and OSv's HTTP server, both having the PORT env var, which would render this impractical.

Should such a problem be left to the user/package maintainer? Namely, should package maintainers ensure that variables are named distinctly, i.e. OSv's HTTP server would have OSV_HTTP_PORT and Spark would have SPARK_PORT and SPARK_UI_PORT?

miha-plesko commented 7 years ago

One cannot make use of meta/run.yaml here, but simply runs the unikernel with:

Oh, that's really strangely written; one can of course use meta/run.yaml. I just wanted to emphasize that inside it she cannot refer to master or worker, but has to copy-paste the bootcmd for master/worker even if she only wants to modify the port.

Re the environment variable clash: yes, I'd leave it to the package maintainer. In other words, I wouldn't worry too much about it, since for debugging purposes we usually set the bootcmd directly using --execute.

gberginc commented 7 years ago

Ok, let's leave it for now. However, I suggest you name env variables verbosely in the capstan-packages repo. This will establish a nice best practice. Please proceed with the implementation.
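
For example, a hypothetical sketch of such verbose naming applied to the master config_set (the SPARK_* names are invented here for illustration, not taken from the actual capstan-packages repo):

runtime: native
config_set:
  master:
    base: "openjdk8-zulu-compact3-with-java-beans:java"
    env:
      JVM_ARGS: -Duser.dir=/spark -Dscala.usejavacp=true
      MAIN: org.apache.spark.deploy.master.Master
      ARGS: --host $SPARK_HOST --port $SPARK_PORT --webui-port $SPARK_UI_PORT
      CLASSPATH: /spark/conf:/spark/jars/*
      SPARK_HOST: 0.0.0.0
      SPARK_PORT: 7077
      SPARK_UI_PORT: 8080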

miha-plesko commented 6 years ago

Closing since it's already implemented.