Closed: nlf closed this issue 8 years ago
@nlf Yes there is
./dlite stop
Stopping the agent: done
~/tmp
❯ hdiutil info
framework : 417.1
driver : 10.11v417.1
================================================
image-path : /Users/antonio/.dlite/disk.sparseimage
image-alias : /Users/antonio/.dlite/disk.sparseimage
shadow-path : <none>
icon-path : /System/Library/PrivateFrameworks/DiskImages.framework/Resources/CDiskImage.icns
image-type : sparse disk image
system-image : false
blockcount : 41943040
blocksize : 512
writeable : TRUE
autodiskmount : false
removable : TRUE
image-encrypted : false
mounting user : root
mounting mode : <unknown>
process ID : 491
/dev/disk2 FDisk_partition_scheme
/dev/disk2s1 Linux
/dev/disk2s2 Linux_Swap
/dev/disk2s3 Linux
@antoniocanas is it still there after a `dlite rebuild -d 100`?
Yes
ah ha :) for some reason detaching the volume isn't working correctly, what do you get if you run `hdiutil detach /dev/disk2`?
❯ hdiutil detach /dev/disk2
"disk2" unmounted.
hdiutil: couldn't eject "disk2" - Resource busy
and you're sure dlite has stopped? check `ps aux | grep dlite`
This is strange...
❯ ./dlite stop
Stopping the agent: ERROR!
The agent is already stopped
~/tmp
❯ ps aux | grep dlite
root 483 3.1 0.3 575542528 28784 ?? S 6:43PM 3:07.08 /Users/antonio/tmp/dlite daemon
antonio 4699 0.0 0.0 2423976 292 s001 R+ 8:16PM 0:00.00 grep --color=auto dlite
ahh, it didn't die for some reason.. you can do `sudo kill 483` and wait a minute to see if it dies, if it doesn't do `sudo kill -9 483` to kill it forcibly
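The escalation described above (a polite signal first, a forced kill only if the process lingers) can be sketched in Python. This is a minimal illustration, not something dlite ships: `stop_process`, `pid_alive`, and the 60-second grace period are names and values assumed for the example, and stopping a root-owned daemon such as PID 483 would still require running it with `sudo`.

```python
import os
import signal
import time


def pid_alive(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user (e.g. root)
    return True


def stop_process(pid, grace_seconds=60.0, poll_interval=1.0):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL."""
    try:
        os.kill(pid, signal.SIGTERM)  # equivalent of `kill <pid>`
    except ProcessLookupError:
        return "not running"
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if not pid_alive(pid):
            return "terminated"
        time.sleep(poll_interval)
    try:
        os.kill(pid, signal.SIGKILL)  # equivalent of `kill -9 <pid>`
    except ProcessLookupError:
        return "terminated"
    return "killed"
```

Calling `stop_process(483)` as root would mirror the two manual commands, with the wait in between handled by the polling loop.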
Right, now it's working, Thanks a lot
:clap: works perfectly, I can now boot my mariadb container. I'll test some more on my day-to-day setup and see if I can get into any more trouble :)
I have an issue with the dhyve-os image (v3.0.0-beta4).
Docker does not start at all, so when I ssh to the machine I can see it is not running. When I try to run the init.d script it says ok, but does not start. After some digging I found out that the docker binary has size 0:
$ ls -lA /usr/bin/docker
-rwxr-xr-x 1 root root 0 Mar 15 18:59 /usr/bin/docker
I tried to fix that by downloading a docker release, but the internet does not work inside the machine.
$ ping google.com
ping: bad address 'google.com'
$ cat /etc/resolv.conf
nameserver 192.168.64.1
nameserver 192.168.64.1 # eth0
Where 192.168.64.1 is my machine. But I guess it does not run a DNS server?
$ dig google.com @192.168.64.1
; <<>> DiG 9.8.3-P1 <<>> google.com @192.168.64.1
;; global options: +cmd
;; connection timed out; no servers could be reached
Fixed that by reinstalling dlite with `--dns-server=8.8.8.8`.
But then dlite does not really start. On first attempt I can't ssh to it and on second it keeps starting the agent. And I could find in the logs:
virtio_net: Could not create vmnet interface, permission denied or no entitlement?
Looks like I forgot to reinstall it with `-v 3.0.0-beta4`. After doing that and using `--dns-server=8.8.8.8` it works correctly!
@mikz good to know! i'm still not sure why dns doesn't work correctly out of the box for some people, that's why the --dns-server option exists :) i should probably document that better though
AFAIK OSX will start a DNS server if you have Internet Sharing enabled in System Preferences > Sharing.
I don't have that enabled and dns works fine for me. I do know that some people have had dnsmasq running on their host system and that has caused issues. Still haven't been able to see if there are other situations that cause a failure there.
I still have problems with PHP's Composer:
- Installing symfony/console (v3.0.3)
Downloading: 100%
Failed to download symfony/console from dist: Could not delete /app/lumen/vendor/composer/254938c2:
Now trying to download from source
- Installing symfony/console (v3.0.3)
Cloning 2ed5e2706ce92313d120b8fe50d1063bcfd12e04
The disk hosting /app/lumen/vendor is full, this may be the cause of the following exception
That's it! I have dnsmasq running on `::1:53` and `127.0.0.1:53`. I don't really want to run it on all the interfaces.
I guess a DNS proxy running on that interface on a random port would make sense, like http://pow.cx/docs/dns_server.html. But initially dlite could check whether a DNS server is already running and print a warning or something.
@antoniocanas can you verify with `df -h` from an ssh session to the vm that there is disk space? if there is, can you reproduce this in a small test scenario that you can send me so i can try to figure out what's going on?
@mikz excellent! i'll have to see what i can do. if i can detect a process listening on port 53 i can print a warning that should help other users not have to deal with this
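One way such a check might look, sketched in Python rather than in dlite's own code: try to bind the port yourself and treat "address already in use" as a sign that something like dnsmasq is already listening there. `port_in_use` is a hypothetical helper, binding port 53 usually needs root, and the probe only covers one IPv4 address at a time.

```python
import errno
import socket


def port_in_use(host, port, sock_type=socket.SOCK_DGRAM):
    """Return True if another socket already holds host:port.

    Works by attempting to bind the address ourselves; the probe socket
    is closed again immediately. Any error other than EADDRINUSE
    (for example a permission error on a low port) is re-raised.
    """
    with socket.socket(socket.AF_INET, sock_type) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise
    return False
```

A warning could then be printed when `port_in_use("127.0.0.1", 53)` returns `True`; passing `socket.SOCK_STREAM` checks the TCP side as well.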
@nlf There is enough space. You could test it by just running:
docker run --rm -v $(pwd):/app composer/composer require symfony/console
that doesn't work for me, i get an error when it tries to clone https://github.com/symfony/console.git
oh. nevermind, i see. that's the error you're getting too :) i just noticed the disk full error at the top
Yep. It works if you remove the -v argument, so it's somehow related to the volume
it also works if you run the container with bash as the entrypoint and clone the repos in your home directory, then copy them to the /app/vendor directory. it appears something about composer doesn't play nicely with 9pfs
ok, looks like i may not be correctly reporting space. i'm going to look into the 9p spec and see if i can figure out what i need to change to support this.
@nlf It's weird that only some PHP packages fail. Also, composer has a 'diagnose' argument that shows: Checking disk free space: OK
Maybe the clue is in this line of the error:
Failed to download symfony/console from dist: Could not delete /app/lumen/vendor/composer/254938c2
it's all a side effect of the 9p server not reporting the free disk space correctly. i'm reading through protocol docs to try to find a fix
Anyone seeing their containers self-destruct on reboot/hard shutdown with 2.0.0? My Macbook has a "cold bug" (shuts down below an internal temp of roughly 20 degrees Celsius (yes, really)) and dlite appears to just rebuild when the system comes back up. Not a big deal since my containers are small, but could be a bigger pain in the future.
@STRML i have not seen that.. i do know that a hard shut down can cause corruption on the vm's disk (just like it can on your host's disk, ouch) but i've definitely never seen it just clear out everything
Yeah, daily kernel panics are not fun - out of warranty. I'll keep an eye on it, sounds like it's not something dlite could reasonably handle. Thanks.
just so everyone is aware, the free disk space issue seems to be inherent with 9p2000 protocol shares as there's no "statfs" message sent to the server, which is what would inform the guest of free disk space. there's a 9p2010 spec that adds this, but it looks like support in the kernel is lacking so i'm at a bit of an impasse.
my current plan is to evaluate embedding a usermode nfs server into dlite, with some changes to allow for better user mapping from container/vm to host (basically just using extended attributes like my 9p changes did). it may take a while and i'm not even sure if it'll work, but it's worth taking a look to see if it's even feasible.
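For context, the free-space figure that tools like `df` show comes from a statfs-style query against the mounted filesystem. Below is a small Python sketch of that query using the standard `os.statvfs` call; the `free_space` helper name is made up for illustration. If the server behind a share has no way to answer this request, numbers like these are presumably what end up wrong inside the guest.

```python
import os


def free_space(path):
    """Return total and available bytes for the filesystem holding path."""
    st = os.statvfs(path)
    return {
        # f_frsize is the fragment size that the block counts are expressed in
        "total_bytes": st.f_frsize * st.f_blocks,
        # f_bavail counts blocks available to unprivileged users
        "free_bytes": st.f_frsize * st.f_bavail,
    }


if __name__ == "__main__":
    info = free_space(".")
    print(f"{info['free_bytes']} of {info['total_bytes']} bytes free")
```

Running it against a path on the volume and against a path inside the container's own filesystem would show whether the two report plausible values.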
with official node:
npm ERR! Linux 4.4.3-dhyve
npm ERR! argv "/usr/local/bin/node" "/usr/local/bin/npm" "install" "--save" "hapi"
npm ERR! node v5.4.0
npm ERR! npm v3.3.12
npm ERR! path /app/node_modules/.staging/hoek-4b4bf751ce5c2289f71352c6623520d9
npm ERR! code EXDEV
npm ERR! errno -18
npm ERR! syscall rename
npm ERR! EXDEV: cross-device link not permitted, rename '/app/node_modules/.staging/hoek-4b4bf751ce5c2289f71352c6623520d9' -> '/app/node_modules/hapi/node_modules/hoek'
yeah the 9p share has all kinds of issues unfortunately. it's gonna be going away before 2.0.0 is officially released
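As background on the `EXDEV` error in the npm log: `rename` only works within a single filesystem, so tools that move files often fall back to copy-and-delete when they see that error. The sketch below shows the general pattern in Python. `move_path` is a hypothetical helper, and this is not a claim about what npm itself does or about why the 9p share returns `EXDEV` here.

```python
import errno
import os
import shutil


def move_path(src, dst):
    """Move src to dst, copying and deleting if rename crosses devices."""
    try:
        os.rename(src, dst)
        return "renamed"
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise  # only the cross-device case is handled here
    # Fallback: copy the data to the target, then remove the original.
    if os.path.isdir(src):
        shutil.copytree(src, dst, symlinks=True)
        shutil.rmtree(src)
    else:
        shutil.copy2(src, dst)
        os.remove(src)
    return "copied"
```

Python's own `shutil.move` applies a similar fallback, so in real code that would usually be the simpler choice.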
For those running `2.0.0-beta5` with the firewall on, you will need to allow connections to bootp in order for the vm to start up
About that - did the 2.0.0-beta5 tag on https://github.com/nlf/dhyve-os/releases disappear a while back? Just attempted to do a re-install and it failed hard when it could not find the release :/
if you mean 3.0.0-beta5, then yes. i removed it because it had a severe bug. 3.0.0-beta4 is what you should be using
What was the severe bug?
to be completely honest, i don't remember. i do recall that i pulled the release within an hour or so of pushing it, though
ah yes, 3.0.0-beta5 - no problem then :)
Been running on 3.0.0-beta4 for a while now, it's pretty stable. My main issue currently is filesystem performance. I'm mainly using Docker for Drupal development, and Drupal has a habit of doing a lot of file-stat operations. As they seem to be slow even running through 9p, the bulk of the execution of a page request ends up being php checking whether files exist.
Looking forward to seeing if you come up with something clever for (dlite) 2.0, and/or whether this mythical docker for Mac I'm signed up for (and almost nobody seems to have had access to?) has cracked the problem.
i'm working on implementing something similar to 9p but purpose built for dlite to ensure correct permissions. i have something that functions right now and am working on increasing performance. any examples you can provide of things that perform poorly with 9p would be appreciated, so i can test them against what i'm working on
Sure thing, I'll see if I can whip something quick (slow that is) up
@danquah I have some ideas for improving file system performance, which I think will help. I've been stuck in proof of concept mode though, as I've been really short on time the last month. No ETA right now, but I'll try to make some time.
I've done a simple test, packaged into an image that should be quite easy to run. The test just creates a lot of 1KB files, runs a stat on them, and reads them back in. You should be able to run a test that uses a data volume like this:
docker run --rm -v ~/storage:/storage danquah/php-fileperf
You can find a simple script for running the test in the repo
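For readers who want to try the same three phases without the image, here is a rough Python approximation. It is a stand-in built on assumptions, not the script from the repo: `run_file_benchmark`, the file naming, and the timing method are all invented for this example, so absolute numbers will not be comparable with the results posted in this thread.

```python
import os
import time


def run_file_benchmark(directory, count=10000, size=1024):
    """Create, stat, and read back `count` files of `size` bytes each."""
    payload = b"x" * size
    paths = [os.path.join(directory, f"file_{i}.dat") for i in range(count)]
    timings = {}

    start = time.perf_counter()
    for path in paths:
        with open(path, "wb") as handle:
            handle.write(payload)
    timings["create_seconds"] = time.perf_counter() - start

    start = time.perf_counter()
    for path in paths:
        os.stat(path)
    timings["stat_seconds"] = time.perf_counter() - start

    start = time.perf_counter()
    bytes_read = 0
    for path in paths:
        with open(path, "rb") as handle:
            bytes_read += len(handle.read())
    timings["read_seconds"] = time.perf_counter() - start

    timings["bytes_read"] = bytes_read
    return timings


if __name__ == "__main__":
    import tempfile

    # Point this at a mounted volume (e.g. /storage) to measure the share instead.
    with tempfile.TemporaryDirectory() as scratch:
        print(run_file_benchmark(scratch, count=1000))
```

Running it once against a directory on the shared volume and once against a directory inside the container gives the same kind of with/without-volume comparison.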
The test took about two and a half times as long to run under dlite2:
Dlite1:
Creating 10000 files 1KB each in /storage, total size 10,000KB
Created 10000 files in 11.88 ms
Got status of 10000 files in 1.08 ms
Read 10000 files in 2.49 ms
** Total duration: 15.45 **
Dlite 2:
Creating 10000 files 1KB each in /storage, total size 10,000KB
Created 10000 files in 16.21 ms
Got status of 10000 files in 4.25 ms
Read 10000 files in 17.4 ms
** Total duration: 37.86 **
And just as a baseline - if I let the file-creation happen inside the containers without using volumes we have no problem:
Dlite 1
Creating 10000 files 1KB each in /storage, total size 10,000KB
Created 10000 files in 0.5 ms
Got status of 10000 files in 0.04 ms
Read 10000 files in 0.16 ms
** Total duration: 0.7 **
Dlite 2
Creating 10000 files 1KB each in /storage, total size 10,000KB
Created 10000 files in 0.76 ms
Got status of 10000 files in 0.04 ms
Read 10000 files in 0.1 ms
** Total duration: 0.9 **
I'll also try to do a test with a drupal-install - but in my first attempt I ran into permission issues with dlite1 (drupal fails at deleting some files during install) - I'll be back :)
Got the Drupal install-test working. Runscript: https://raw.githubusercontent.com/danquah/docker-drupal-perf-test/master/run.sh
Results:
Dlite1, without volume
real 0m53.420s
user 0m4.200s
sys 0m1.330s
With volume
real 0m41.150s
user 0m5.220s
sys 0m33.560s
In other words - using a volume actually sped up the install - weird.
Dlite 2 without volume
real 0m6.634s
user 0m3.900s
sys 0m0.660s
With a volume
real 0m16.701s
user 0m4.610s
sys 0m0.880s
So all of a sudden dlite2 is faster, which is even weirder when taking the previous test into consideration, but obviously the installation is not hitting the same bottleneck as the basic file-creation test.
I have followed the instructions in this issue and I am getting a different error when trying out master of dlite. When I ssh into the vm I see this:
DhyveOS version 3.0.0
-sh: /etc/profile.d/dhyve.sh: docker: Text file busy
Not sure what to do about it, the docker command is completely unresponsive and the logs don't really show anything interesting.
2016/04/11 14:37:50 http: response.WriteHeader on hijacked connection
operation not supported by device
rdmsr to register 0x34 on vcpu 1
that message is because docker hasn't finished downloading in the vm. give it a minute and check again
Yes that was it, I wasn't patient enough. Thanks for all of your hard work on dlite!!!
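Rather than re-checking by hand, one could poll until a command such as `docker ps` succeeds. The following Python sketch does that under a few assumptions: `wait_for_command` is a hypothetical helper, the timeout and interval are arbitrary, and it treats a missing or not-yet-executable binary (which is how a "Text file busy" failure surfaces when launching a process) as "not ready yet".

```python
import subprocess
import time


def wait_for_command(argv, timeout=120.0, interval=2.0):
    """Re-run argv until it exits with status 0 or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = subprocess.run(
                argv,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
            if result.returncode == 0:
                return True
        except OSError:
            # The binary is missing or cannot be executed yet; keep waiting.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With this in place, `wait_for_command(["docker", "ps"])` returns `True` once the daemon answers and `False` if it never does within the timeout.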
this thread is getting suuuuuuper long so i'm going to close it. let's just open new issues to discuss bugs, and i'll open new issues when i add new features as well
Version 2.0.0 of DLite is ready for testing!
Here's what you can do to help:
First, remove your old installation of DLite
You'll also want to edit `/etc/exports` and remove the entry that DLite created
Build from the latest code in the `master` branch (copy the binary to your path if you want. if you installed with homebrew before you'll want to `brew uninstall dlite` first) or download the latest pre-release binary on the releases page and install passing the `-v` flag like so:
After the installation completes, run `dlite start` and wait a minute or so. If your internet connection is slow and the version of docker requested in your config is not 1.10.2 it will take longer since on the first boot the docker binary gets downloaded, and it's over 30MB. You'll know it's done and running when `docker ps` works.
Please report any issues you have here and I'll work to get them fixed up before the official release. Thanks!
Edit: You're also welcome to join the gitter for questions or just to say hi