Closed: nlf closed this issue 8 years ago
I would add `sudo nfsd stop` after editing `/etc/exports`.

Some issues:
```
❯ docker ps
Error: request returned Service Unavailable for API route and version http://%2Fvar%2Frun%2Fdocker.sock/v1.21/containers/json, check if the server supports the requested API version
```
This was in my dlite-err.log:

```
operation not supported by device
rdmsr to register 0x34 on vcpu 1
```
it's not encoded, that's just what the cli returns. give it a minute and try again and it should work
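For reference, the `%2F` runs in that error are just the unix socket path percent-encoded so it can sit in the URL. A throwaway sketch (the `sed` call is purely illustrative, not what the cli does internally):

```shell
# Each "/" in /var/run/docker.sock becomes "%2F" when the docker cli
# embeds the socket path in the error URL; this one-liner mimics that.
printf '/var/run/docker.sock' | sed 's,/,%2F,g'
# prints: %2Fvar%2Frun%2Fdocker.sock
```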
Looks like the docker service isn't starting inside the VM.
can you ssh into the VM? see if it's running
yep, it isn't running - trying to debug the rc script
let me know what you find
I can't resolve dns inside the vm: `wget: unable to resolve host address 'get.docker.io'`
I think it's pointing to my previous VM:

```
$ cat /etc/resolv.conf
192.168.64.1
nameserver 192.168.64.1 # eth0
```
what happens if you delete the top line?
you can also try specifying a dns server in the config: run `dlite config` and set `dns_server` to `8.8.8.8`. it'll stop and restart the daemon for you. the missing "nameserver" at the beginning of the string is a bug though, working on that right now (though i'm curious what else you find is broken)
Using `dlite config`:

```
8.8.8.8
nameserver 192.168.64.1 # eth0
```

Setting the nameserver manually to 8.8.8.8 worked for me. What's running on 192.168.64.1?
that should be your host machine, which is why dhcp gives that for the eth0 device. interesting that it works for me (and a few other people i had test) but not for you.. can you ping 192.168.64.1 from inside the vm?
Yep, it's reachable from inside the VM.
I'm guessing if you run `nslookup google.com 192.168.64.1` from inside your vm it fails?
Yep, it fails. Could be something on my host machine, but I can correctly resolve dns on it.
that's really weird.. is your firewall turned on? do you have any adblockers that work on the dns level? or things like hands off installed?
maybe i should put the default back to google's dns server
Yay, it looks like my dnsmasq service was interfering. I've stopped it just to remove it from the equation and restarted the VM. It looks good now.
Found two more issues. When editing with `dlite config`, it often outputs this:

```
virtio_net: Could not create vmnet interface, permission denied or no entitlement?
```

Apparently it likes `sudo`.
The other issue is with extra args. I think it's not passing them to /proc/cmdline:

```
console=ttyS0 hostname=dlite uuid=7647d5a7-e32b-11e5-b178-c42c032ff040 dns_server=192.168.64.1 user_name=<username> user_id=501 docker_version=1.10.2 docker_extra=
```
it shouldn't need sudo, maybe it just needs a delay between stopping and starting or something. it runs the same function that `dlite start` does, which calls `launchctl stop local.dlite` without sudo.
that's strange about your extra args, it's definitely passing them to xhyve, maybe something else is dropping them. i'll poke around a bit
what did you try to pass as extra arguments?
Yeah, I checked the code and it seems fine. There's something sketchy about stopping/starting dlite so it could just be the machine not picking up the new config because it's not getting restarted.
i did just find a small bug with extra parameters in the init scripts, i'll have a new test build of dhyve-os up in a few minutes
Ah yeah, got it :) anything after a space will be ignored.
What about keeping `cat /proc/cmdline | sed -n 's/^.*docker_extra=\(.*\)$/\1/p'`?
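As a sanity check of that sed extraction, here's a self-contained sketch (the cmdline contents and the extra arguments are made up):

```shell
# Simulated /proc/cmdline; because docker_extra is the last parameter,
# the greedy capture keeps everything after "docker_extra=", spaces included.
cmdline='console=ttyS0 hostname=dlite docker_extra=--insecure-registry my.registry:5000'
echo "$cmdline" | sed -n 's/^.*docker_extra=\(.*\)$/\1/p'
# prints: --insecure-registry my.registry:5000
```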
Found two more warnings. One on the host:

```
XHYVE: vlapic callout at 0x15948.0x597a03ab6634e9a1, expected at 0x15948.#59a953c7eb92360f
```

Another inside the VM:

```
time="2016-03-06T00:17:44.710000000Z" level=warning msg="Your kernel does not support cgroup blkio throttle.write_bps_device"
time="2016-03-06T00:17:44.710000000Z" level=warning msg="Your kernel does not support cgroup blkio throttle.read_iops_device"
time="2016-03-06T00:17:44.710000000Z" level=warning msg="Your kernel does not support cgroup blkio throttle.write_iops_device"
```
that's what i'm doing with the regex.
as for the warnings, the top one isn't anything to worry about. it's just a warning from xhyve which still outputs a fair amount of debugging information.
the warnings from docker are interesting though, i'll take a look now. could be i goofed something in the kernel config.
The config command is having difficulty restarting the agent; the `virtio_net` message is always output to the log.
try manually stopping the agent and waiting a minute or so and then start it back up, maybe we just need to introduce a delay
`dlite stop` is not working, although no output is shown. Should `/usr/local/bin/dlite` run as root?
not for the stop command, no. running `dlite stop` with sudo will definitely fail
Starting and stopping with patience (watching the logs in these cases) works reliably, always without sudo. I think `dlite config` needs a delay or a more reliable way of finding out when the agent is stopped.
stop the service and run `dlite update -v v3.0.0` to pull the latest OS i just pushed; it resolves the nameserver issue, the docker extra args issue, and the cgroup warnings from the docker logs.
yeah i agree, i'll work something up that checks the output of `ps` to make sure the daemon is really stopped all the way.
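A minimal sketch of what that check could look like — the function name, process name, and timeout here are assumptions for illustration, not dlite's actual implementation:

```shell
# wait_for_exit NAME TIMEOUT: poll once a second until no process named
# NAME is left, or give up after TIMEOUT seconds (default 30).
wait_for_exit() {
  name=$1; timeout=${2:-30}; i=0
  while pgrep -x "$name" >/dev/null 2>&1; do
    i=$((i + 1))
    [ "$i" -ge "$timeout" ] && return 1
    sleep 1
  done
  return 0
}

# e.g. after `launchctl stop local.dlite`, wait before starting again:
wait_for_exit dlite 30 && echo "daemon fully stopped"
```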
Looks good! Took me some tries to get the VM to update to the latest 3.0.0 commit.
Here's how resolv.conf appears:

```
$ cat /var/log/resolv.conf
nameserver 192.168.64.1
nameserver 192.168.64.1 # eth0
```
The extra args are parsing correctly now.
If you can build from master, try what I just pushed. I put in a loop with a delay to make sure the agent is stopped before StopAgent() returns, let's see if that fixes the config command for you.
hi @nlf! thanks so much for dlite and for 2.0 — super excited for virtfs.
after `dlite stop`, `dlite install 3.0.0`, `dlite update -v v3.0.0` and `dlite start`, `docker ps` seems to hang even after waiting for quite a bit for things to come up.
is sshing into the vm as straightforward as before? it now prompts for a password, and the `docker` password from the dhyve-os README doesn't seem to do the trick for either the `docker` user or `root`.
@wbinnssmith did you remove your old installation first?
the upgrade steps should be something like:

```
dlite stop
sudo dlite uninstall
sudo nfsd stop
sudo dlite install -v v3.0.0
dlite start
```
ah yep, didn't get the prerelease binary from releases. derp. check back in soon :)
Building from master - I can reliably edit the config (stop works very well) but occasionally the `virtio_net` message appears and starting the machine takes longer (15 seconds vs 2-3 seconds). Any idea?
how occasionally? every other time? every third time? when it happens does the machine boot anyway or do you have to run `dlite start` again?
I can't seem to find a pattern yet. Sometimes the machine takes longer to boot and eventually boots, but sometimes I have to `dlite start` it again. I'll see if I can replicate it more consistently.
also see if stopping it manually and waiting a while before starting it again helps
@nlf just wanted to let you know that after installing the prerelease binary, all is great. Love the addition to the ssh config, and p9 is working great with volume mounts :smile:
fantastic, thanks for letting me know @wbinnssmith
Should `dlite uninstall` stop the daemon, or isn't that really required? What about cleaning up `/var/db/dhcpd_leases`?
`~/.ssh/config` is being written as root but should be kept owned by the current user.
Is `/sbin/nfsd` still necessary in the sudoers file?
Technically, `sudo dlite install` does not touch `~/.ssh/config`, only start does.
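A sketch of how the ownership issue could be handled when the writer runs under sudo — the function and variable handling are illustrative, not dlite's actual code:

```shell
# restore_owner FILE: if running under sudo, hand FILE back to the
# invoking user recorded in SUDO_USER (set by sudo itself), so it
# doesn't stay owned by root.
restore_owner() {
  file=$1
  if [ -n "$SUDO_USER" ] && [ -e "$file" ]; then
    chown "$SUDO_USER" "$file"
  fi
}

# e.g. after writing the ssh config as root:
restore_owner "$HOME/.ssh/config"
```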
Having some issues when pulling/extracting images via `docker-compose`:
```
Traceback (most recent call last):
  File "<string>", line 3, in <module>
  File "compose/cli/main.py", line 56, in main
  File "compose/cli/docopt_command.py", line 23, in sys_dispatch
  File "compose/cli/docopt_command.py", line 26, in dispatch
  File "compose/cli/main.py", line 191, in perform_command
  File "compose/cli/main.py", line 524, in run
  File "compose/cli/main.py", line 711, in run_one_off_container
  File "compose/project.py", line 316, in up
  File "compose/service.py", line 352, in execute_convergence_plan
  File "compose/service.py", line 253, in create_container
  File "compose/service.py", line 280, in ensure_image_exists
  File "compose/service.py", line 771, in pull
  File "compose/progress_stream.py", line 18, in stream_output
  File "compose/utils.py", line 63, in split_buffer
  File "json/decoder.py", line 369, in decode
ValueError: Extra data: line 3 column 1 - line 238 column 1 (char 190 - 43311)
docker-compose returned -1
```
With `docker pull`, everything works fine (workaround for now).
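A sketch of automating that workaround by pre-pulling every image a compose file references — the function, the compose file contents, and the `sed` parse are invented for illustration, and the parse only handles simple one-line `image:` entries:

```shell
# prepull: read image names out of a compose file and pull each one with
# plain `docker pull`, so docker-compose doesn't have to parse the pull
# output stream itself.
prepull() {
  compose_file=${1:-docker-compose.yml}
  sed -n 's/^[[:space:]]*image:[[:space:]]*//p' "$compose_file" |
  while read -r img; do
    docker pull "$img"
  done
}

# usage: prepull docker-compose.yml && docker-compose up -d
```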
Hey, I am trying to get it to work. One thing I found weird: since `docker` is now installed by a script, running `docker ps` inside the VM prints nothing if the docker download failed. Is there a way to have docker pre-installed as before?
Edit: I guess it doesn't really matter to me whether it comes pre-installed or not, I'm just wondering what the benefits of not having it pre-installed are.
Edit: Comment deleted.
This release, as previous ones, mounts volumes with the wrong permissions (`501:dialout`).
Version 2.0.0 of DLite is ready for testing!
Here's what you can do to help:

1. First, remove your old installation of DLite. You'll also want to edit `/etc/exports` and remove the entry that DLite created.
2. Build from the latest code in the `master` branch (copy the binary to your path if you want; if you installed with homebrew before, you'll want to `brew uninstall dlite` first), or download the latest pre-release binary on the releases page and install passing the `-v` flag like so:
3. After the installation completes, run `dlite start` and wait a minute or so. If your internet connection is slow and the version of docker requested in your config is not 1.10.2, it will take longer, since on the first boot the docker binary gets downloaded and it's over 30MB. You'll know it's done and running when `docker ps` works.

Please report any issues you have here and I'll work to get them fixed up before the official release. Thanks!
Edit: You're also welcome to join the gitter for questions or just to say hi