nlf / dlite

The simplest way to use Docker on OS X
MIT License
Version 2.0.0 #135

Closed nlf closed 8 years ago

nlf commented 8 years ago

Version 2.0.0 of DLite is ready for testing!

Here's what you can do to help:

First, remove your old installation of DLite

dlite stop
sudo dlite uninstall
sudo nfsd stop #unless you're using nfsd outside of DLite

You'll also want to edit /etc/exports and remove the entry that DLite created

Build from the latest code in the master branch (copy the binary to your path if you want; if you installed with homebrew before, you'll want to brew uninstall dlite first) or download the latest pre-release binary from the releases page. Then install, passing the -v flag like so:

sudo dlite install -v 3.0.0-beta4

After the installation completes, run dlite start and wait a minute or so. If your internet connection is slow and the version of docker requested in your config is not 1.10.2 it will take longer since on the first boot the docker binary gets downloaded, and it's over 30MB. You'll know it's done and running when docker ps works.
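If you'd rather not retry docker ps by hand, a small polling loop can do the waiting. This is only a sketch: the wait_until helper is made up for illustration and is not part of DLite, and the 120 second limit in the usage line is an arbitrary assumption.

```shell
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Retries COMMAND roughly once per second until it succeeds.
# Returns 0 on success, 1 if TIMEOUT_SECONDS pass without a success.
wait_until() {
  _timeout=$1
  shift
  _elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$_elapsed" -ge "$_timeout" ]; then
      return 1
    fi
    sleep 1
    _elapsed=$((_elapsed + 1))
  done
  return 0
}

# Usage:
#   wait_until 120 docker ps && echo "docker is up"
```

The command's output is discarded while polling, so run docker ps again afterwards if you want to see the container list.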

Please report any issues you have here and I'll work to get them fixed up before the official release. Thanks!

Edit: You're also welcome to join the gitter for questions or just to say hi

bpinto commented 8 years ago

I'm also experiencing problems with permissions, but it's different from when we moved to root. The file permissions are set to my user, bpinto:staff, but the docker user is not able to write to some files.

nlf commented 8 years ago

@antoniocanas can you tell me your user id and the group id of the 'staff' group from your osx host?

to get the group id, you should be able to run dscl . read /Groups/staff and see output similar to this

dscl . read /Groups/staff
AppleMetaNodeLocation: /Local/Default
GeneratedUID: ABCDEFAB-CDEF-ABCD-EFAB-CDEF00000014
GroupMembers: FFFFEEEE-DDDD-CCCC-BBBB-AAAA00000000
GroupMembership: root
Password: *
PrimaryGroupID: 20
RealName: Staff
RecordName: staff BUILTIN\Users
RecordType: dsRecTypeStandard:Groups
SMBSID: S-1-5-32-545

as for the user id, echo $UID should show that
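both numbers can also be printed in one go with the standard id utility. a minimal sketch (the host_ids name is made up for illustration):

```shell
# Print the numeric user id and primary group id of the current user.
# `id -u` and `id -g` are standard POSIX options.
host_ids() {
  printf 'uid=%s gid=%s\n' "$(id -u)" "$(id -g)"
}

host_ids
```

on a default OS X account the primary group is usually staff, so the gid printed here should match the PrimaryGroupID from dscl.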

nlf commented 8 years ago

@bpinto hmm.. i see the problem. let me see if i can come up with something to resolve that

nlf commented 8 years ago

@bpinto actually.. i haven't been able to replicate that, can you help me figure out how to replicate your issue so i can find a fix for it? i just tested files from my share that have 600 permissions meaning only my user (on the OSX host) has permissions to read or write it, and had no issues at all. what are you trying to do that's failing for you?

bpinto commented 8 years ago

@nlf I will try to create a reproducible scenario. I am running a rails server; if you have rails installed and could run a server, you might see the same error.

bpinto commented 8 years ago

@nlf

~ ◦ $ la log/
total 8
drwxr-xr-x   3 bpinto  staff   102B Mar  7 17:16 .
drwxr-xr-x  42 bpinto  staff   1.4K Mar  5 19:24 ..
-rw-r--r--   1 bpinto  staff     2B Mar  7 17:16 development.log

~ ◦ $ docker run -t -v $PWD/log:/log ruby:2.1.6 /bin/bash -lc "ls -lah log; echo a >> log/development.log"
total 1.0K
drwxr-xr-x 1  501 dialout 102 Mar  7 17:16 .
drwxr-xr-x 1 root root    180 Mar  7 17:16 ..
-rw-r--r-- 1  501 dialout   2 Mar  7 17:16 development.log
/bin/bash: log/development.log: Operation not permitted

Oh 501:dialout... I hadn't noticed from inside the container the files had different permissions... Weird...

nlf commented 8 years ago

501:dialout is because while i map the user id between osx and the vm, nothing maps the user between the vm and the containers

nlf commented 8 years ago

ok, i'm able to reproduce it now. i'll see if i can come up with a fix

nlf commented 8 years ago

ok. so, we're now back to a point where write permissions within the container are a non issue. new files created in mounted volumes are owned by root in the host operating system, however. i'm still investigating to see if there's a better way to resolve that.

the docker binary is back to having the latest version pre-installed which makes for a faster initial boot unless you're requesting something other than the latest version of docker (or if a new docker release comes out and the image hasn't been rebuilt yet, i'm looking in to setting up a hook so the image will be kept up to date as well as possible)

also increased the size of the socket buffers, which should improve speed for things like docker logs

antoniocanas commented 8 years ago

@nlf It still shows drwxr-xr-x 1 501 dialout 170 Mar 8 15:46 app

The official mysql image, for example, will not work:

db_1 | 2016-03-09T11:12:19.857765Z 0 [ERROR] InnoDB: Operating system error number 1 in a file operation.
db_1 | 2016-03-09T11:12:19.857883Z 0 [ERROR] InnoDB: Error number 1 means 'Operation not permitted'
db_1 | 2016-03-09T11:12:19.857909Z 0 [ERROR] InnoDB: File ./ibdata1: 'stat' returned OS error 101.
db_1 | 2016-03-09T11:12:19.857922Z 0 [ERROR] InnoDB: os_file_get_status() failed on './ibdata1'. Can't determine file permissions

danquah commented 8 years ago

Yup, I'm getting the same error with the official mariadb image, though it does seem it's hitting a slightly different problem:

Directly from osx:

$ ls -ln /Users
total 0
drwxr-xr-x+  11 201 201  374 Nov  9 07:53 Guest
drwxrwxrwt    7   0   0  238 Nov  9 13:25 Shared
drwxr-xr-x+ 101 501  20 3434 Mar  9 15:25 danquah

From inside dhyveos as the docker user

$ ls -ln /Users
ls: /Users/danquah: Operation not permitted
total 0

And as root

$ sudo ls -ln /Users
total 4
drwxr-xr-x    1 501      20            3434 Mar  9 14:25 danquah

How do the permissions around the mountpoint work? It seems the docker user (uid 1000) is not even allowed to list the mounted volume?

nlf commented 8 years ago

this file permissions thing is a real pain. i'm still trying to figure out the best way to resolve this. please do keep reporting specific use cases that fail, it gives me more things to test against while i try to find a solution.

danquah commented 8 years ago

@nlf just out of curiosity, what problem did the access=any + uname settings in https://github.com/nlf/dhyve-os/commit/0b1f7233e68d7e0cd5671c36b6e6801c05cabe19#diff-37aa5e55ffbda42c8c4d48f5feb1be5fL50 cause? I just remounted with those options and I'm getting a bit further. Files created from within dhyveos are now created as my user, but files created by a running container (even deeper within dhyveos I guess) now appear with uids from the container (in this case uid=999), which causes new kinds of problems.

nlf commented 8 years ago

it fixed some issues where permissions were being denied. i still haven't found a way to make everything happy.

danquah commented 8 years ago

hm, might be the problems I'm seeing now: while the mariadb container is able to create its mysql-directory with uid=999, any attempt at doing anything inside that directory fails. It seems that the uid=999 comes from explicit chowns done inside the container to make sure the mysql data-directory is owned by the mysql-user - and those ids make it all the way back to macos.

Files created without trying to force a uid/gid seem to be created as my local mac-user - so that part at least works.

Any idea where the mapping between the hostos' file-system uids/gids and the fs presented to the container happens?

nlf commented 8 years ago

right now there's no mapping. i'm still trying to figure out if there's a safe way of forcibly mapping everyone to the host's user

ruimarinho commented 8 years ago

What about --userns-remap? Would that be of any help here? E.g. https://github.com/boot2docker/boot2docker/pull/1142

danquah commented 8 years ago

As I understand docker and user namespaces, it won't help here. Namespacing allows the docker daemon to map uids/gids used inside a container to another set of ids outside the container. E.g., for container 1, uid 0 is mapped to 10000 in the hostos (uid 1000 to 11000 and so forth), and for container 2, uid 0 is mapped to 20000 and uid 1000 to 21000.
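The arithmetic behind that example is just a fixed offset per container. A sketch of the simplified model (remap_uid is a made-up helper, and the bases 10000 and 20000 are the illustrative numbers from this comment; a real daemon takes its ranges from /etc/subuid and /etc/subgid):

```shell
# remap_uid BASE CONTAINER_UID
# Simplified model of user-namespace remapping: the id seen outside the
# container is the container's id plus a per-container base offset.
remap_uid() {
  echo $(( $1 + $2 ))
}

remap_uid 10000 0      # container 1, uid 0
remap_uid 10000 1000   # container 1, uid 1000
remap_uid 20000 1000   # container 2, uid 1000
```

Every remapped id lands in a range that belongs to no real user on the host, which is the opposite of what is needed here: files that end up owned by the one macos user.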

What would be nice in the dlite-case (as I understand it) would be to squash everything to the macos user running dlite. The problem then is the world view being presented to the container. It would probably see uid/gids from the hostos instead of its own, and that might cause problems if it tries to enforce a specific uid/gid.

nlf commented 8 years ago

correct, user namespaces won't resolve the issue here

nlf commented 8 years ago

and double correct, squashing all permissions to the host uid/gid is problematic

antoniocanas commented 8 years ago

So, right now, there's no way to use Docker on OSX, is there? dinghy, docker-machine-nfs and every other solution have the same permissions problem.

nlf commented 8 years ago

i just spent a bunch of time hacking up the 9p library to allow for different ownership on the host than there is in the guest, and it seems to be working.

by that, i mean i was able to run the mysql container pointing to a directory on my host. files were all created on the host as my user. files in the container were owned by mysql:staff which seemed to be good enough to make mysql happy.

i'll have a new test build up soon

danquah commented 8 years ago

Awesome! Be happy to test it of course :)

nlf commented 8 years ago

ok folks, beta4 of 2.0.0 is up and should resolve at least some of the permissions issues between host and guest.

what i did was hack the 9p filesystem support in xhyve to do a few things differently:

this release also takes a shot at resolving the weird buffering issues for some commands, like logs -f

as a heads up, in my above example where chown mysql:mysql /var/lib/mysql is done, the ownership in the guest will appear as mysql:staff. this is because 9p doesn't request a GID when the filesystem is attached so i have no way of determining the correct group to map to, which means i fall back to the primary group of the host's user. so far from my limited testing this is only cosmetic as the majority of services don't particularly care about group permissions.

as always, please post feedback you have whether it's positive or negative. it's all helpful :)

antoniocanas commented 8 years ago

Thanks for your effort :)

Testing it, docker is not starting:

~/tmp
❯ sudo ./dlite uninstall
Removing launchd agent: done
Removing files: done

~/tmp
❯ sudo ./dlite install -v 3.0.0-beta5
...
Would you like to continue? (Y/n):
Building disk image: done
Downloading OS: done
Generating SSH key: done
Writing configuration: done
Creating launchd agent: done

~/tmp 40s
❯ ./dlite start
Starting the agent: done
The VM may take some additional time to fully boot

~/tmp
❯ date
Mon Mar 14 21:52:20 CET 2016

~/tmp
❯ docker ps
Cannot connect to the Docker daemon. Is the docker daemon running on this host?

~/tmp
❯ date
Mon Mar 14 21:57:08 CET 2016

~/tmp
❯ docker ps
Cannot connect to the Docker daemon. Is the docker daemon running on this host?

nlf commented 8 years ago

yeah dhyve-os beta5 was a bust, sorry about that. you should be able to use beta4 and i already took down beta5

antoniocanas commented 8 years ago

Same problem with beta4

nlf commented 8 years ago

you did a full uninstall reinstall cycle? let me see if i can replicate here

nlf commented 8 years ago

i just did the following:

dlite stop
sudo dlite uninstall
sudo dlite install -v v3.0.0-beta4
dlite start
# wait ~30 seconds
docker ps

with success, maybe you're having another issue. are you able to ssh to the vm with dlite ssh?

antoniocanas commented 8 years ago

Edit: reinstalled (after reboot) again and now it's working :)

nlf commented 8 years ago

are you able to ssh to the vm? dlite ssh should get you in

antoniocanas commented 8 years ago

Yeah it's working now, sorry. Again, same permissions problem.

From the osx host:

drwxr-xr-x 23 antonio staff 782B Mar 14 22:31 app

From the container:

drwxr-xr-x 1 501 dialout 782 Mar 14 21:31 app

danquah commented 8 years ago

Think I had the same problem, I fixed it by doing a dlite config and changing the dns_server to 8.8.8.8. For some reason dhyveos could not do any lookups, and it resulted in a 0-byte /usr/bin/docker

Would it be possible to run some kind of diagnostics (e.g. "dlite verify") after the daemon is up and running that could check things like this?
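As a rough idea of what one such check could look like, here is a sketch that catches the 0-byte binary case. The binary_looks_installed helper is hypothetical; no "dlite verify" command exists at this point.

```shell
# binary_looks_installed PATH
# Succeeds only if PATH is a regular file with a size greater than zero.
# A failed download that leaves a 0-byte file behind fails this check.
binary_looks_installed() {
  [ -f "$1" ] && [ -s "$1" ]
}

# Usage inside the VM (e.g. after `dlite ssh`):
#   binary_looks_installed /usr/bin/docker \
#     || echo "/usr/bin/docker is missing or empty, check dns_server"
```

It only inspects the file, so it says nothing about whether the daemon is actually running.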

I have docker up and running now, and the mariadb container I'm fighting with is now creating files and directories owned by my mac user, and from inside the container the same files show up owned by mysql:mysql.

It still runs into some permission problems though, I'll debug some more.

nlf commented 8 years ago

it will always show 501 dialout as the permissions for files in your user directory, because that's who they belong to. there's no user with id 501 in the container, so that gets displayed numerically, however there is a group with id 20 named dialout so you see that one as a string.
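to make that concrete, here is a rough sketch of the lookup ls does when it labels an owner. the owner_label helper and the sample file contents are assumptions for illustration, trimmed down from what a container's /etc/passwd and /etc/group would hold:

```shell
# owner_label ID DBFILE
# Mimics how `ls -l` labels an owner: print the name whose numeric id
# (third colon-separated field of a passwd/group style file) equals ID,
# or fall back to the raw number when no entry matches.
owner_label() {
  awk -F: -v id="$1" '
    $3 == id { print $1; found = 1; exit }
    END      { if (!found) print id }
  ' "$2"
}

# Sample data standing in for the container's user and group databases.
demo_dir=$(mktemp -d)
printf 'root:x:0:0:root:/root:/bin/bash\n' > "$demo_dir/passwd"
printf 'root:x:0:\ndialout:x:20:\n' > "$demo_dir/group"

owner_label 501 "$demo_dir/passwd"   # no user with uid 501, so the number is printed
owner_label 20 "$demo_dir/group"     # gid 20 has a name here, so dialout is printed
```

the ids on disk are the same on both sides, only the names attached to them differ.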

the changes i made allow things like chown to work correctly as far as the vm is concerned. for example, in your container chown root:root a file at random and you'll see in the container it shows as owned by root (though the group will still be dialout), however on your host it will still be owned by your user

nlf commented 8 years ago

@danquah yeah a verify or troubleshoot command or something is a good idea

what permission problems are you seeing it run into? i was able to run a mariadb container though admittedly i never actually created a table or wrote any data to it.

antoniocanas commented 8 years ago

So do I need to extend each image's Dockerfile and 'chown' the volume?

nlf commented 8 years ago

no, you just need to use the files. the display of 501 dialout is only cosmetic. your containers should all work fine.

antoniocanas commented 8 years ago

I see, the UID is correct. Thanks a lot :)

danquah commented 8 years ago

@nlf it gets started, populates its datadir with some basic files, and then fails.

To reproduce:

mkdir ~/dbtest
docker run -i --name mariadb-test -v ~/dbtest:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=my-secret-pw -d mariadb:5
docker logs mariadb-test 

Where I amongst other messages found

Initializing database
160314 22:39:14 [Note] /usr/sbin/mysqld (mysqld 5.5.48-MariaDB-1~wheezy) starting as process 59 ...
ERROR: 1005  Can't create table 'db' (errno: 13)
160314 22:39:14 [ERROR] Aborting

Permissions from a container:

$ docker run -it -v ~/dbtest:/var/lib/mysql mariadb:5 ls -l /var/lib/mysql
total 17
-rw-r----- 1 mysql dialout 16384 Mar 14 22:39 aria_log.00000001
-rw-r----- 1 mysql dialout    52 Mar 14 22:39 aria_log_control
drwx------ 1 root  dialout    68 Mar 14 22:39 mysql

And from osx

$ ls -l ~/dbtest
total 20
-rw-r----- 1 danquah staff 16384 Mar 14 23:39 aria_log.00000001
-rw-r----- 1 danquah staff    52 Mar 14 23:39 aria_log_control
drwx------ 2 danquah staff    68 Mar 14 23:39 mysql

Don't know yet whether that mysql-directory is supposed to be owned by root, might be the problem ....

nlf commented 8 years ago

yeah i kind of suspect that's the case.. i'm not sure why mariadb wouldn't chown the directory itself though. i'll take a look.

danquah commented 8 years ago

Ok, this is weird, it seems that I can't fix the ownership manually:

docker run -i --rm -v ~/dbtest:/var/lib/mysql mariadb:5 /bin/bash -c "ls -l /var/lib/mysql; chown -v mysql /var/lib/mysql/mysql; ls -l /var/lib/mysql"

total 17
-rw-r----- 1 mysql dialout 16384 Mar 15 08:40 aria_log.00000001
-rw-r----- 1 mysql dialout    52 Mar 15 08:40 aria_log_control
drwx------ 1 root  dialout    68 Mar 15 08:40 mysql
changed ownership of `/var/lib/mysql/mysql' from root to mysql
total 17
-rw-r----- 1 mysql dialout 16384 Mar 15 08:40 aria_log.00000001
-rw-r----- 1 mysql dialout    52 Mar 15 08:40 aria_log_control
drwx------ 1 root  dialout    68 Mar 15 08:40 mysql

Quick script to replicate:

#!/usr/bin/env bash
# trace each command; set here because a shebang with extra arguments is not portable
set -x
docker rm mariadb-5-init

rm -fr ~/dbtest
mkdir ~/dbtest
docker run -i --name mariadb-5-init -v ~/dbtest:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=my-secret-pw mariadb:5
docker run -i --rm -v ~/dbtest:/var/lib/mysql mariadb:5 /bin/bash -c "ls -l /var/lib/mysql; chown -v mysql /var/lib/mysql/mysql; ls -l /var/lib/mysql"

antoniocanas commented 8 years ago

Didn't look inside the container, but the official mysql is working out of the box for me

danquah commented 8 years ago

@antoniocanas - good point, I just did the same test with mysql:latest and it does indeed work. Directory-listing after the container came up

-rw-r----- 1 mysql mysql         56 Mar 15 09:20 auto.cnf
-rw-r----- 1 mysql mysql       1319 Mar 15 09:21 ib_buffer_pool
-rw-r----- 1 mysql mysql   50331648 Mar 15 09:21 ib_logfile0
-rw-r----- 1 mysql mysql   50331648 Mar 15 09:20 ib_logfile1
-rw-r----- 1 mysql mysql   79691776 Mar 15 09:21 ibdata1
-rw-r----- 1 mysql dialout 12582912 Mar 15 09:21 ibtmp1
drwxr-x--- 1 mysql mysql       2618 Mar 15 09:20 mysql
drwxr-x--- 1 mysql mysql       3060 Mar 15 09:20 performance_schema
drwxr-x--- 1 mysql mysql       3672 Mar 15 09:20 sys

The problem seems to be that once a file has a ownership, it cannot be changed:

root@5a9a1c0d470f:/var/lib/mysql# touch /var/lib/test_outside_volume

root@5a9a1c0d470f:/var/lib/mysql# ls -l /var/lib/test_outside_volume
-rw-r--r-- 1 root root 0 Mar 15 09:26 /var/lib/test_outside_volume

root@5a9a1c0d470f:/var/lib/mysql# chown -v mysql /var/lib/test_outside_volume
changed ownership of '/var/lib/test_outside_volume' from root to mysql

root@5a9a1c0d470f:/var/lib/mysql# ls -l /var/lib/test_outside_volume
-rw-r--r-- 1 mysql root 0 Mar 15 09:26 /var/lib/test_outside_volume

root@5a9a1c0d470f:/var/lib/mysql# touch /var/lib/mysql/test_inside_volume

root@5a9a1c0d470f:/var/lib/mysql# ls -l /var/lib/mysql/test_inside_volume
-rw-r--r-- 1 root dialout 0 Mar 15 09:31 /var/lib/mysql/test_inside_volume

root@5a9a1c0d470f:/var/lib/mysql# chown -v mysql /var/lib/mysql/test_inside_volume
changed ownership of '/var/lib/mysql/test_inside_volume' from root to mysql

root@5a9a1c0d470f:/var/lib/mysql# ls -l /var/lib/mysql/test_inside_volume
-rw-r--r-- 1 root dialout 0 Mar 15 09:31 /var/lib/mysql/test_inside_volume

Did the same test for a file owned by mysql, trying to change the ownership back to root: same result.
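Since chown -v reports success either way, the only reliable signal is to read the owner back afterwards. A sketch of that check (chown_took_effect is a made-up helper; it relies on GNU or busybox stat -c, so it is meant for inside the container, and BSD stat on osx would need stat -f %u instead):

```shell
# chown_took_effect FILE UID
# Runs chown with a numeric uid, then re-reads the numeric owner.
# Returns 0 if the owner now matches, 1 if chown "succeeded" without
# changing anything, 2 if chown itself reported an error.
chown_took_effect() {
  chown "$2" "$1" || return 2
  [ "$(stat -c %u "$1")" = "$2" ]
}

# Usage inside the container (path taken from the session above):
#   touch /var/lib/mysql/test_inside_volume
#   chown_took_effect /var/lib/mysql/test_inside_volume "$(id -u mysql)" \
#     || echo "chown reported success but the owner did not change"
```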

antoniocanas commented 8 years ago

The disk size argument is not working:

~/tmp
❯ sudo ./dlite install -v v3.0.0-beta4 --disk=100
Password:
The install command will make the following changes to your system:
- Create a '.dlite' directory in your home
- Create a 100 GiB sparse disk image in the '.dlite' directory
...

~/tmp 16s
❯ ./dlite start
Starting the agent: done
The VM may take some additional time to fully boot

~/tmp
❯ ssh docker@local.docker
Warning: Permanently added '192.168.64.9' (ECDSA) to the list of known hosts.
DhyveOS version 3.0.0
Docker version 1.10.3, build 20f81dd
$ df -h
Filesystem                Size      Used Available Use% Mounted on
devtmpfs                995.2M         0    995.2M   0% /dev
tmpfs                  1001.9M         0   1001.9M   0% /dev/shm
tmpfs                  1001.9M     40.0K   1001.9M   0% /tmp
tmpfs                  1001.9M     36.0K   1001.9M   0% /run
/dev/sda1                49.1M     34.2M     11.1M  75% /mnt/overlay
overlay                  49.1M     34.2M     11.1M  75% /etc
overlay                  49.1M     34.2M     11.1M  75% /usr/bin
/dev/sda3                19.0G   1003.7M     17.8G   5% /var/lib/docker
none                   1001.9M         0   1001.9M   0% /sys/fs/cgroup
/dev/sda3                19.0G   1003.7M     17.8G   5% /var/lib/docker/btrfs

I've tried to create a bigger disk because PHP's Composer throws this: "The disk hosting /app/vendor is full, this may be the cause of the following exception"

I guess space is not the problem, because I can dd a large zero-filled file. Probably PHP reads the amount of free space and that check fails for a sparse disk.

It's a volume shared with the host, my docker-compose.yml:

app:
  image: tianon/true
  volumes:
    - ./app:/app

nlf commented 8 years ago

@antoniocanas hmm.. i'm unable to reproduce this one. does it work for you if you do:

dlite stop
dlite rebuild -d 100
dlite start

nlf commented 8 years ago

@danquah checking it out to see if i can reproduce it on my end, thanks for the detailed example

nlf commented 8 years ago

@danquah so here's a weird thing.. chown mysql /var/lib/mysql/mysql fails, but chown mysql:mysql /var/lib/mysql/mysql works fine. i'm adding some debugging on my end so i can try to figure out what mariadb is doing to see if i can fix it

antoniocanas commented 8 years ago

@nlf I also did that and same problem. I'll try at home later :)

nlf commented 8 years ago

when you get home, after you run dlite stop but before you try to rebuild, check the output of hdiutil info and see if there's an entry for disk.sparseimage in there. i know the rebuild doesn't work correctly if the old disk didn't get unmounted somehow, so possibly there's a bug there
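something like this would do the check, as a sketch. image_still_attached is a made-up helper that just searches whatever hdiutil info prints for the image name:

```shell
# image_still_attached
# Reads `hdiutil info` output on stdin and succeeds if any line
# mentions disk.sparseimage.
image_still_attached() {
  grep -q 'disk\.sparseimage'
}

# Usage on the OS X host, after `dlite stop`:
#   hdiutil info | image_still_attached \
#     && echo "disk.sparseimage is still attached"
```

if it does show up, the old disk never got unmounted, which would explain the rebuild not taking effect.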

nlf commented 8 years ago

@danquah fixed :) check out 2.0.0-beta5