Closed miha-plesko closed 7 years ago
I miss a bit what are you actually trying to do. For a), probably not running python in interactive mode. BTW, this would work, and 'ARGS=-' could be default.
./scripts/run.py -e '--env=ARGS=- runscript /run/python'
Maybe your goal can be achieved without modifying commands.cc.
I didn't review the argument substitution code carefully enough, so I forgot if the variable substitution happens before the command is parsed dividing it on spaces, or after. If it's after, it needs a special case for substituting an empty variable, because in that case all shells I know (sh, bash and zsh) make sure to drop this token - not leave behind an empty parameter, so it makes sense we do too. But as Justin said, without even fixing that bug, you can get your stuff to work by using "-" (a minus) instead of an empty string.
For your second question,our parser (commands.cc) supports a quoted string with double quotes, NOT with a single quote. So instead of
capstan run --execute "--env=ARGS='-3 -v /demo.py' runscript /run/python"
I think this would work, with reversed single and double quotes:
capstan run --execute '--env=ARGS="-3 -v /demo.py" runscript /run/python'
Thanks @justinc1 for suggesting -
which resolves issue a) for python package 👍 . However, I was hoping for a more general solution. Let me demonstrate also on openjdk8-zulu-compact1
package what I'm trying to reach. There we put following cmd into /run/java file:
/java.so $JVM_ARGS $MAIN $ARGS
When user runs such unikernel, she is expected to provide environment variable $MAIN
while $JVM_ARGS
and $ARGS
should be optional. Below I list usecases that I would like to support:
1 --execute "--env=MAIN=package1.Main runscript /run/java"
EQUIVALENT: /java.so package1.Main
2 --execute "--env=JVM_ARGS=-Xms512m --env=MAIN=package1.Main runscript /run/java"
EQUIVALENT: /java.so -Xms512m package1.Main
3 --execute "--env=MAIN=package1.Main --env=ARGS=192.168.0.1 runscript /run/java"
EQUIVALENT: /java.so package1.Main 192.168.0.1
4 --execute "--env=JVM_ARGS=-Xms512m --env=MAIN=package1.Main --env=ARGS=192.168.0.1 runscript /run/java"
EQUIVALENT: /java.so -Xms512m package1.Main 192.168.0.1
As far as I can understand, the runscript
needs to handle empty strings correctly when they come from environment variables (like @nyh mentioned above). Based on the following example I'm assuming that this problem is limited to expanding environment variables when we build boot command:
/java.so $ARGS
--execute "--env=ARGS=-version"
The example above works no matter how many spaces I put between /java.so and $ARGS. However, as soon as I put another variable there, the problem arises:
/java.so $EMPTY $ARGS
--execute "--env=ARGS=-version"
The example above fails with error "Error: Could not find or load main class".
@nyh spaces seems to not be supported regardless what quotes we use. My guess is that they are not handled correctly in runscript
. Probably I should open a separate issue for this although there is a chance that fixing problem a) will result in problem b) being fixed as a consequence. Anyway, I'm posting two usecases how I would like to use spaces:
5 --execute '--env=JVM_ARGS="-Xms512m -Xmx512m" --env=MAIN=package1.Main runscript /run/java'
EQUIVALENT: /java.so -Xms512m -Xmx512m package1.Main
6 --execute '--env=ARGS="192.168.0.1 8000" --env=MAIN=package1.Main runscript /run/java'
EQUIVALENT: /java.so package1.Main 192.168.0.1 8000
Regarding the spaces, it is a bit tricky, I guess. I've made a small modification in main
in loader.cc
to print all args that are stored in av
and I get this
lemmy@mike-dev:~/dev/osv⟫ ./scripts/run.py --execute="--env=ARGS='haha hoho' /cli/cli.so"
OSv v0.24-425-g77b1f05
--env=ARGS='haha
hoho'
/cli/cli.so
So the problem is that the ARGS='haha hoho'
part is not available to the boost:program options
as a whole, but already separated so there is no way for it to parse it properly.
Unfortunately, I don't know what is responsible for parsing the boot command and starting the main, but @nyh should probably be able to shed some light whether this can be resolved in some way.
main() in loader.cc already receives ac parsed on spaces, with escaping/quoting ignored. https://github.com/cloudius-systems/osv/blob/master/arch/x64/boot.S#L104-L107
argc and argv are just linker symbols?
nm build/last/loader.elf | grep -e __argv -e __argc
0000000001471170 B __argc
0000000001471178 B __argv
The command line starts as a single string, and the code dividing it at space boundary is in core/commands.cc. As you can see in the "command" class, the string is broken on spaces (boost::spirit::ascii::space) but the double quotes "..." can be used used to avoid breaking on spaces.
Thanks @nyh for pointing out the commands
, but I am either doing something wrong or this is not actually the reason for this. Again, I have resorted to making some printouts in various parts of the loader.cc and the commands.cc files. What I currently understand is that the parse_options
from the loader.cc
is responsible to first extract the options that should apply to the kernel. Afterwards, the application command is parsed.
After putting some more logs, I get the following
1 lemmy@mike-dev:~/dev/osv⟫ ./scripts/run.py --execute '--env=ARGS="mike.Main 42" /cli/cli.so "A B" C'
2 OSv v0.24-425-g77b1f05
3 MIKE: [loader] parse_options start
4 --env=ARGS="mike.Main
5 42"
6 /cli/cli.so
7 "A
8 B"
9 C
10 MIKE: [loader] parse_options done
11 eth0: 192.168.122.15
12 MIKE: [loader] parse_commands from 42" /cli/cli.so "A B" C
13 MIKE: [commands]parse_command_line
14 MIKE: [commands]42" /cli/cli.so "A B" C
15 MIKE: [loader] parse_commands done 42" /cli/cli.so "A B" C
16 MIKE: [commands]parse_command_line
17 MIKE: [commands]/libhttpserver.so &!
The log shows in lines 2-9 that parse_options
already gets the argv
parameter separated by the spaces, despite the fact that I set the --env=ARGS="mike.Main 42"
. I currently do not care about lines 6-9, which are probably handled by the commands
class you mentioned. I am only interested in the way those --env
options are processed and to me it looks like the argv
that is provided to the main
function already splits everyting on the spaces.
@justinc1 I think you may be right, however I have no idea how to verify this 😄. Since simple std::cout
at some places prevent the instance from booting (in some cases, I just get numerous OSv v0.24-425-g77b1f05
messages), I tried setting a breakpoint with gdb, but unfortunately, I don't know how to use it properly. My session was like
connect
osv syms
hbreak parse_options:189
monitor system_reset
c
The instance reboots, but no breakpoint is hit. After hbreak
I get the following info:
Hardware assisted breakpoint 2 at 0x2111f0: file loader.cc, line 148.
Any obvious mistake I am making?
The patch I sent to osv-dev solves one part of problem.
The second part is that args with spaces are not correctly passed to application. Due to https://github.com/cloudius-systems/osv/blob/master/loader.cc#L317-L320, (line += std::string(av[i]) + " ";) - this one should quote argv if it contains space.
For the loader options (--env=ARGS="mike.Main 42"), I forgot that the main thing we complain about :/ To be done.
For debug printf, I used either debug_early("blah"), or plain printf. Both worked for me, also in early boot stages (say in premain()).
but no breakpoint is hit
I'm confused. But as hbreak is limited to 4 breakpoints, I sometimes define mybreak() function, and than I put breakpoint into it. Sometimes I add sleep(5) into mybreak() for first invocation, so I don't need to monitor system_reset.
With above mentioned patch ()
./scripts/run.py -Ve '--env=AAA="aaa bbb" /hello_env.so aa bb'
OSv v0.24-432-g8388445
DBG av[0] = '--verbose'
DBG av[1] = '--env=AAA=aaa bbb'
DBG av[2] = 'asdf'
4 CPUs detected
Firmware vendor: SeaBIOS
Setting in environment: AAA=aaa bbb
...
# this part is environ as seen by /hello_env.so
ENV /*--------------------------
ENV 00 AAA=aaa bbb
ENV 01 OSV_VERSION=v0.24-432-g8388445
ENV 02 OSV_CPUS=4
ENV 03 OSV_IP=192.168.122.15
ENV --------------------------*/
So environ variables with spaces should work.
Adding --env=BBB=
is also OK:
Setting in environment: AAA=aaa bbb
Setting in environment: BBB=
...
ENV /*--------------------------
ENV 00 AAA=aaa bbb
ENV 01 BBB=
@justinc1 I'm not sure what I'm doing wrong, but this doesn't seem to work for me. I've applied the patch from osv-dev but when I run unikernel it still crashes because of spaces:
$ capstan run demo --execute '--env=XYZ="one two" /python /script.py'
Command line will be set based on --run parameter
Created instance: demo
Setting cmdline: --env=XYZ="one two" /python /script.py
OSv v0.24-424-g3e34265
eth0: 192.168.122.15
Failed to load object: two". Powering off.
Am I missing something?
must be runscript vs "plain" cmdline difference, I guess. I will check.
PS: I think github issues just do require sub-issues feature :)
Sadly, there is another uglieness in the command line parsing beyond what I described in my mail - in parse_options (loader.cc) you can see that "due to ... 6991" the code looks for the end of options and handles the first half differently (option parsing like --env) than the second (command line). Sadly, this code (see "find_if" in loader.cc) breaks on spaces. So the code is convinced that the command to run starts on the "two"....
We're using
runscript
to equip Capstan packages with boot command that is very easy to use. In case of python package, for example, we write following boot command into/run/python
file:User then runs unikernel with following boot command:
And this works as expected 👍 However, there are some usecases that we would like to support:
a) empty value
Here we would expect that python interpreter is started just like it's started if we put only
--env=PYTHONHOME?=/pyenv /python
into/run/python
file (so without the$ARGS
part).b) value containing spaces
Is this doable?
/cc @justinc1 @gberginc