microsoft / mssql-docker

Official Microsoft repository for SQL Server in Docker resources
MIT License
1.71k stars 756 forks source link

ZFS Support #13

Open au-phiware opened 7 years ago

au-phiware commented 7 years ago

Upon executing docker run -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=<YourStrong!Passw0rd>' -p 1433:1433 microsoft/mssql-server-linux, I encounter the following message:

Configuring Microsoft(R) SQL Server(R)...
Microsoft(R) SQL Server(R) setup failed with error code 1. Please check the setup log in /var/opt/mssql/log for more information.

The setup log contains:

This is an evaluation version.  There are [154] days left in the evaluation period.
2017-01-10 23:55:24.82 Server      Setup step is copying system data file 'C:\templatedata\master.mdf' to 'C:\var\opt\mssql\data\master.mdf'.
2017-01-10 23:55:24.90 Server      Setup step is copying system data file 'C:\templatedata\mastlog.ldf' to 'C:\var\opt\mssql\data\mastlog.ldf'.
2017-01-10 23:55:24.91 Server      Setup step is copying system data file 'C:\templatedata\model.mdf' to 'C:\var\opt\mssql\data\model.mdf'.
2017-01-10 23:55:24.92 Server      Setup step is copying system data file 'C:\templatedata\modellog.ldf' to 'C:\var\opt\mssql\data\modellog.ldf'.
2017-01-10 23:55:24.94 Server      Setup step is copying system data file 'C:\templatedata\msdbdata.mdf' to 'C:\var\opt\mssql\data\msdbdata.mdf'.
2017-01-10 23:55:24.95 Server      Setup step is copying system data file 'C:\templatedata\msdblog.ldf' to 'C:\var\opt\mssql\data\msdblog.ldf'.
2017-01-10 23:55:25.02 Server      Microsoft SQL Server vNext (CTP1.1) - 14.0.100.187 (X64) 
    Dec 10 2016 02:51:11 
    Copyright (C) 2016 Microsoft Corporation. All rights reserved.
    on Linux (Ubuntu 16.04.1 LTS)
2017-01-10 23:55:25.02 Server      UTC adjustment: 0:00
2017-01-10 23:55:25.02 Server      (c) Microsoft Corporation.
2017-01-10 23:55:25.03 Server      All rights reserved.
2017-01-10 23:55:25.03 Server      Server process ID is 4116.
2017-01-10 23:55:25.03 Server      Logging SQL Server messages in file 'C:\var\opt\mssql\log\errorlog'.
2017-01-10 23:55:25.03 Server      Registry startup parameters: 
     -d C:\var\opt\mssql\data\master.mdf
     -l C:\var\opt\mssql\data\mastlog.ldf
     -e C:\var\opt\mssql\log\errorlog
2017-01-10 23:55:25.03 Server      Command Line Startup Parameters:
     --setup
     --sa-password
2017-01-10 23:55:25.04 Server      Error: 17113, Severity: 16, State: 1.
2017-01-10 23:55:25.04 Server      Error 87(The parameter is incorrect.) occurred while opening file 'C:\var\opt\mssql\data\master.mdf' to obtain configuration information at startup. An invalid startup option might have caused the error. Verify your startup options, and correct or remove them if necessary.
2017-01-10 23:55:25.05 Server      SQL Server shutdown has been initiated

The /var/opt/mssql/data directory contains:

-rw-r----- 1 root root  4194304 Jan 11 10:55 master.mdf
-rw-r----- 1 root root   524288 Jan 11 10:55 mastlog.ldf
-rw-r----- 1 root root  8388608 Jan 11 10:55 modellog.ldf
-rw-r----- 1 root root  8388608 Jan 11 10:55 model.mdf
-rw-r----- 1 root root 13959168 Jan 11 10:55 msdbdata.mdf
-rw-r----- 1 root root   524288 Jan 11 10:55 msdblog.ldf

Please advise.

au-phiware commented 7 years ago
uname -a
Linux tecknack-corin 4.6.5-stable #54 SMP Tue Sep 13 11:33:24 AEST 2016 x86_64 Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz GenuineIntel GNU/Linux
docker version
Client:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        
 OS/Arch:      linux/amd64
twright-msft commented 7 years ago

@au-phiware I tried running the exact command that you provided and it worked ok for me. In my case I was using Docker for Windows, but I've done essentially the same thing recently on RHEL and Docker for Mac. I'm wondering - is that the exact command you used or did you possibly try mounting a volume using docker run -v?

au-phiware commented 7 years ago

Hi @twright-msft, you are right I did run it with -v initially (that's how I got the log contents) but before posting I made sure to run that exact command. I tried a number of variations, all had exactly the same output, and I'm assuming the setup log is the same too. So, I could be wrong, is it likely that the log would be different? I could copy the file out if you suspect there's something in that...?

au-phiware commented 7 years ago

I have no doubt that there must be something particular about my kernel or filesystem. I use Gentoo and build my own kernel from git (the stable branch) but I'm not doing anything special there. I use zfs and so the docker storage driver is zfs too, could it be that there's an issue with accessing metadata or something else to do with the filesystem?

au-phiware commented 7 years ago

Just to be sure I tried on my home machine (also Gentoo and ZFS) and used docker cp to retrieve the errorlog file. Here's what happened:

$ docker run -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=<YourStrong!Passw0rd>' -p 1433:1433 microsoft/mssql-server-linux
Unable to find image 'microsoft/mssql-server-linux:latest' locally
latest: Pulling from microsoft/mssql-server-linux
aed15891ba52: Pull complete 
773ae8583d14: Pull complete 
d1d48771f782: Pull complete 
cd3d6cd6c0cf: Pull complete 
8ff6f8a9120c: Pull complete 
1fd7e8b10447: Pull complete 
bd485157db89: Pull complete 
273a1970ce9c: Pull complete 
d2d0b4dc209f: Pull complete 
Digest: sha256:f17b5a6d31f2e863581ba91169641a21f05584d9e5f82e43ee066c307e5550af
Status: Downloaded newer image for microsoft/mssql-server-linux:latest
Configuring Microsoft(R) SQL Server(R)...
Microsoft(R) SQL Server(R) setup failed with error code 1. Please check the setup log in /var/opt/mssql/log for more information.
$ docker ps -a | head -n 2
CONTAINER ID        IMAGE                               COMMAND                  CREATED             STATUS                      PORTS               NAMES
d66b9878ea41        microsoft/mssql-server-linux        "/bin/sh -c /opt/m..."   14 minutes ago      Exited (1) 14 minutes ago                       nervous_engelbart
$ docker cp d66b9878ea41:/var/opt/mssql/log/errorlog .
$ cat errorlog
ÿþ2017-01-11 20:59:36.77 Server      Microsoft SQL Server vNext (CTP1.1) - 14.0.100.187 (X64) 
    Dec 10 2016 02:51:11 
    Copyright (C) 2016 Microsoft Corporation. All rights reserved.
    on Linux (Ubuntu 16.04.1 LTS)
2017-01-11 20:59:36.78 Server      UTC adjustment: 0:00
2017-01-11 20:59:36.78 Server      (c) Microsoft Corporation.
2017-01-11 20:59:36.78 Server      All rights reserved.
2017-01-11 20:59:36.78 Server      Server process ID is 4116.
2017-01-11 20:59:36.78 Server      Logging SQL Server messages in file 'C:\var\opt\mssql\log\errorlog'.
2017-01-11 20:59:36.78 Server      Registry startup parameters: 
     -d C:\var\opt\mssql\data\master.mdf
     -l C:\var\opt\mssql\data\mastlog.ldf
     -e C:\var\opt\mssql\log\errorlog
2017-01-11 20:59:36.78 Server      Command Line Startup Parameters:
     --setup
     --sa-password
2017-01-11 20:59:36.79 Server      Error: 17113, Severity: 16, State: 1.
2017-01-11 20:59:36.79 Server      Error 87(The parameter is incorrect.) occurred while opening file 'C:\var\opt\mssql\data\master.mdf' to obtain configuration information at startup. An invalid startup option might have caused the error. Verify your startup options, and correct or remove them if necessary.
2017-01-11 20:59:36.80 Server      SQL Server shutdown has been initiated
$ docker version
Client:
 Version:      1.13.0-rc3
 API version:  1.25
 Go version:   go1.7.1
 Git commit:   4d92237
 Built:        
 OS/Arch:      linux/amd64

Server:
 Version:      1.13.0-rc3
 API version:  1.25 (minimum version 1.12)
 Go version:   go1.7.1
 Git commit:   4d92237
 Built:        
 OS/Arch:      linux/amd64
 Experimental: false
$ uname -a
Linux fibonacci 4.3.6-stable #61 SMP Sun Dec 18 14:11:45 AEDT 2016 x86_64 Intel(R) Core(TM) i7-4710HQ CPU @ 2.50GHz GenuineIntel GNU/Linux
twright-msft commented 7 years ago

We havent done any testing on Gentoo/zfs yet. It's definitely possible there is an issue there with the kernel/file system.

au-phiware commented 7 years ago

Is there any further information that I can provide? Or turn on verbose logging?

On Fri, 13 Jan 2017, 12:30 AM Travis Wright notifications@github.com wrote:

We havent done any testing on Gentoo/zfs yet. It's definitely possible there is an issue there with the kernel/file system.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Microsoft/mssql-docker/issues/13#issuecomment-272163319, or mute the thread https://github.com/notifications/unsubscribe-auth/AAx2LutMzCdMoMYWB_5SqhkOvDcjEVLOks5rRirqgaJpZM4LgDrC .

rouke-broersma commented 7 years ago

I am having this exact same issue on Docker for windows.

twright-msft commented 7 years ago

One thing to check, especially if you are using Docker for Windows or Docker for Mac is that you adjust the amount of RAM that Docker Engine has available to it.
https://docs.docker.com/docker-for-mac/#/advanced https://docs.docker.com/docker-for-windows/#/advanced

rouke-broersma commented 7 years ago

The mssql image shows a clear and specific error message when docker engine has less than 4gb ram available. This is not the error I was getting. After a full docker reset the issue seems to have gone away now though.

laurentleseigneur commented 7 years ago

having same issue and error message using mac and docker v1.13 with 4gb memory

when mounting volume with -v option to bind /var/opt/mssql, we can see logs but .mdf id failing with error 87.

workaround is to start without this -v option, it works, but .mdf file are inside container

twright-msft commented 7 years ago

@laurentleseigneur This is a known issue with macOS and -v. We have a separate issue tracking that. https://github.com/Microsoft/mssql-docker/issues/12
For now, you can use an attached Docker container volume for database file storage. https://docs.docker.com/engine/tutorials/dockervolumes/#/creating-and-mounting-a-data-volume-container

laurentleseigneur commented 7 years ago

thanks @twright-msft for pointing this known issue, i will give a try to a docker compose using a container volume to store .mdf files

sirio3mil commented 7 years ago

I had the same error and I solved it by erasing the images and containers and re-creating them.

Docker version 1.13.1, build 092cba3 Windows 10 build 15031.rs2_release.170204-1546

PatrickDePuydt commented 7 years ago

@twright-msft Thanks for this thread; came here because I'm experiencing the same issue. I wanted to see if my case is a part of the known issue because I'm using mounted data volume containers.

In my docker-compose.yml I have a service that is configured:

  mssql:
    build: ./mssql
    env_file:
    - .env
    ports:
    - "1433:1433"
    volumes:
    - ./mssql/volumes/data:/var/opt/mssql

But when I pull up the logs from docker logs <container iD> It spits out:

2017-03-08 20:01:34.80 Server      Error: 17113, Severity: 16, State: 1.
2017-03-08 20:01:34.80 Server      Error 87(The parameter is incorrect.) occurred while opening file 'C:\var\opt\mssql\data\master.mdf' to obtain configuration information at startup. An invalid startup option might have caused the error. Verify your startup options, and correct or remove them if necessary.
2017-03-08 20:01:34.80 Server      SQL Server shutdown has been initiated

Is this part of the the known issue?

twright-msft commented 7 years ago

@PatrickDePuydt Are you on MacOS or Linux? If Linux, what distribution/version are you running on?

PatrickDePuydt commented 7 years ago

@twright-msft I'm running OS X Sierra but the base Docker image is a Centos7

twright-msft commented 7 years ago

@PatrickDePuydt OK, then yes, you are running into the issue I described above with using Docker for Mac and -v (volumes: in the way you are using it in docker-compose.yml is effectively docker run -v). Please use a 'data volume container' instead. More info: https://docs.docker.com/engine/tutorials/dockervolumes/#creating-and-mounting-a-data-volume-container

http://stackoverflow.com/questions/32908621/how-can-i-create-a-data-container-only-using-docker-compose-yml

PatrickDePuydt commented 7 years ago

@twright-msft Ok, thanks! Could you clear up one last thing? You mentioned using an attached Docker container volume for database file storage as a workaround, but isn't that what I'm doing in the volume declaration of the service: - ./mssql/volumes/data:/var/opt/mssql ?

twright-msft commented 7 years ago

As I understand it what you are doing there is mounting a directory on the host to a directory in the container. That's what doesnt work on macOS. What you want to do is mount a volume container to a directory in the mssql container.

somashekarg commented 7 years ago

This is an evaluation version. There are [156] days left in the evaluation period. RegQueryValueEx HADR for key "Software\Microsoft\Microsoft SQL Server\MSSQL\MSSQLServer\HADR" failed. 2017-03-11 09:58:56.66 Server Microsoft SQL Server vNext (CTP1.3) - 14.0.304.138 (X64) Feb 13 2017 16:49:12 Copyright (C) 2016 Microsoft Corporation. All rights reserved. on Linux (Ubuntu 16.04.1 LTS) 2017-03-11 09:58:56.66 Server UTC adjustment: 0:00 2017-03-11 09:58:56.66 Server (c) Microsoft Corporation. 2017-03-11 09:58:56.66 Server All rights reserved. 2017-03-11 09:58:56.67 Server Server process ID is 4116. 2017-03-11 09:58:56.67 Server Logging SQL Server messages in file 'C:\var\opt\mssql\log\errorlog'. 2017-03-11 09:58:56.67 Server Registry startup parameters: -d C:\var\opt\mssql\data\master.mdf -l C:\var\opt\mssql\data\mastlog.ldf -e C:\var\opt\mssql\log\errorlog 2017-03-11 09:58:56.69 Server Error: 17113, Severity: 16, State: 1. 2017-03-11 09:58:56.69 Server Error 3(The system cannot find the path specified.) occurred while opening file 'C:\var\opt\mssql\data\master.mdf' to obtain configuration information at startup. An invalid startup option might have caused the error. Verify your startup options, and correct or remove them if necessary. 2017-03-11 09:58:56.70 Server SQL Server shutdown has been initiated

I have the same problem but not in docker. it running and working in docker after pushing the docker image to heroku app it is giving the above message message. anybody has any idea about this issue. Please help if anybody knows

Thanks

au-phiware commented 7 years ago

(I feels as if this thread has been hijacked by Mac users)

I decided to take ZFS out of the mix:

  1. First I confirmed I was still receiving the error with the latest image.
  2. I create a zvol and formatted it with ext4 and mounted it as my /var/lib/docker directory.
  3. Started the docker daemon and ensured it wasn't using zfs as the storage device.
  4. Tried again.

This time I didn't receive the error, the container stayed up and I have an operational mssql server on linux!

I guess the next question is: where does the fault lie? This project (seems unlikely)? Docker (more specifically the zfs storage driver)? Or ZFS?

I think I'll close this issue and go ask Docker.

For the record here's my working configuration:

$ docker info
Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 2
Server Version: 1.12.1
Storage Driver: devicemapper
 Pool Name: docker-230:1-131073-pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: ext4
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 2.485 GB
 Data Space Total: 107.4 GB
 Data Space Available: 7.923 GB
 Metadata Space Used: 2.257 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.145 GB
 Thin Pool Minimum Free Space: 10.74 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.93 (2015-01-30)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 4.6.5-stable
Operating System: Gentoo/Linux
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.59 GiB
Name: tecknack-corin
ID: JXJV:BP5E:T547:PZTF:4JSO:6SV5:B7EO:XA3U:EPV7:7Z5H:BS2Y:SAV7
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
 127.0.0.0/8
gigatexal commented 7 years ago

Have you run any traces while it was running and then capturing the crash?

On May 14, 2017 at 7:23 PM, <Corin Lawson (mailto:notifications@github.com)> wrote:

(I feels as if this thread has been hijacked by Mac users)

I decided to take ZFS out of the mix:

First I confirmed I was still receiving the error with the latest image.

I create a zvol and formatted it with ext4 and mounted it as my /var/lib/docker directory.

Started the docker daemon and ensured it wasn't using zfs as the storage device.

Tried again.

This time I didn't receive the error, the container stayed up and I have an operational mssql server on linux!

I guess the next question is: where does the fault lie? This project (seems unlikely)? Docker (more specifically the zfs storage driver)? Or ZFS?

I think I'll close this issue and go ask Docker.

For the record here's my working configuration:

$ docker info Containers: 1 Running: 1 Paused: 0 Stopped: 0 Images: 2 Server Version: 1.12.1 Storage Driver: devicemapper Pool Name: docker-230:1-131073-pool Pool Blocksize: 65.54 kB Base Device Size: 10.74 GB Backing Filesystem: ext4 Data file: /dev/loop0 Metadata file: /dev/loop1 Data Space Used: 2.485 GB Data Space Total: 107.4 GB Data Space Available: 7.923 GB Metadata Space Used: 2.257 MB Metadata Space Total: 2.147 GB Metadata Space Available: 2.145 GB Thin Pool Minimum Free Space: 10.74 GB Udev Sync Supported: true Deferred Removal Enabled: false Deferred Deletion Enabled: false Deferred Deleted Device Count: 0 Data loop file: /var/lib/docker/devicemapper/devicemapper/data WARNING: Usage of loopback devices is strongly discouraged for production use. Use --storage-opt dm.thinpooldev to specify a custom block storage device. Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata Library Version: 1.02.93 (2015-01-30) Logging Driver: json-file Cgroup Driver: cgroupfs Plugins: Volume: local Network: bridge host null overlay Swarm: inactive Runtimes: runc Default Runtime: runc Security Options: seccomp Kernel Version: 4.6.5-stable Operating System: Gentoo/Linux OSType: linux Architecture: x86_64 CPUs: 4 Total Memory: 15.59 GiB Name: tecknack-corin ID: JXJV:BP5E:T547:PZTF:4JSO:6SV5:B7EO:XA3U:EPV7:7Z5H:BS2Y:SAV7 Docker Root Dir: /var/lib/docker Debug Mode (client): false Debug Mode (server): false Registry: https://index.docker.io/v1/ Insecure Registries: 127.0.0.0/8

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub (https://github.com/Microsoft/mssql-docker/issues/13#issuecomment-301360919), or mute the thread (https://github.com/notifications/unsubscribe-auth/AAk8rXrpPZxLV7L8WbCyd5L3NKHVINDvks5r57cqgaJpZM4LgDrC).

Mic92 commented 7 years ago

Here is a full syscall trace from machine with zfsonlinux:

Steps to reproduce

  1. docker run -ti -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=<YourStrong!Passw0rd>' -p 1433:1433 microsoft/mssql-server-linux bash
  2. root@f5f756e8d68c:/# apt update && apt install strace
  3. strace -f -o /tmp/sqlservr.log /opt/mssql/bin/sqlservr

https://dl.thalheim.io/Z03MnoPGwbCXhJOXpBvgwg/sqlservr.log?dl=1

I tried to reason about the repeated sequence of syscalls (some strange enterprise retry handler), but found not failing syscall. Without access to the source code and the uninformative error message, I have no clue what's going on here.

twright-msft commented 7 years ago

FWIW we have done zero testing on zfs. @au-phiware Please let us know the issue ID if you do raise this with Docker in a way that is publicly trackable. Thanks!

au-phiware commented 7 years ago

The issue can be viewed at moby/moby#33191

Mic92 commented 7 years ago

@au-phiware I do not think docker is the best issue tracker for this, as we miss domain knowledge about mssql internals.

au-phiware commented 7 years ago

@Mic92 yes, well I'm really not sure where to go... I'll try recreate the issue with docker removed from the mix.

Is there a better issue tracker someplace or should I reopen this one?

Mic92 commented 7 years ago

this one is probably better for the moment.

au-phiware commented 7 years ago

Okay, I have installed mssql-server onto my system and encounter the same error. Mounting an ext4 filesystem at /var/opt/mssql resolves the error.

I realise that only xfs and ext4 is supported. So, I guess this is a feature request?

At least I have a work around now; I look forward to giving it a spin and adding it to my workflow.

crazybert commented 6 years ago

Same here. Ubuntu + ZFS + LXC.

twright-msft commented 6 years ago

@crazybert - Please use Ubuntu 16.04 + ext4 + Docker for now. Keeping this issue open only as an 'enhancement' tracking issue for us to consider in a future release.

crazybert commented 6 years ago

@twright-msft Thanks! Also appeared to work with ext4 on zvol. Not ideal, but good enough.

t-oster commented 6 years ago

Does anyone know what exactly is the missing feature of ZFS, what stops mssql from working? Maybe we just have to enable something like ACLs or similar. ZVOLs are an option but not ideal. I would prefer a normal ZFS filesystem much more.

t-oster commented 6 years ago

looks like the O_DIRECT support is missing in ZFS https://github.com/zfsonlinux/zfs/issues/224...

Mic92 commented 6 years ago

a quick hack would be to have a seccomp profile for the container, that masks O_DIRECT in open calls. This would only make things slightly less performant as data would be cached both by ZFS and mssql.

Mic92 commented 6 years ago

But to be honest, I would be better if mssql would use POSIX_FADV_DONTNEED instead of O_DIRECT.

t-oster commented 6 years ago

Yes, I found something similar https://code.uplex.de/liblongpath/liblongpath/commit/2e46a921ce2b6b1caa56d39cbd58be85c5988bd0 but I did not get it to work yet

t-oster commented 6 years ago

@Mic92 could you post how exactly one would set up such a seccomp profile?

Mic92 commented 6 years ago

Turns out seccomp alone is not enough without using ptrace. However this is quite complicated. LD_PRELOAD is much easier to use:

// nodirect_open.c 
#define _GNU_SOURCE
#include <dlfcn.h>
#include <fcntl.h>
typedef int (*orig_open_f_type)(const char *pathname, int flags);
int open(const char *pathname, int flags, ...) {
    static orig_open_f_type orig_open;
    if (!orig_open) {
                        orig_open = (orig_open_f_type)dlsym(RTLD_NEXT, "open");
    }
    return orig_open(pathname, flags & ~O_DIRECT);
}
$ apt update gcc && apt install gcc
$ gcc -shared -fpic -o /nodirect_open.so nodirect_open.c -ldl
$ echo /nodirect_open.so > /etc/ld.so.preload
$ /opt/mssql/bin/sqlservr --accept-eula

I had to delete the old database in /var/opt/mssql before

t-oster commented 6 years ago

Thank you! I am not into the filesystem stuff, so can you tell me what effect this has, or more specifically does using this hack (over ZFS) affect stability or performance of the SQL Server in any way?

Mic92 commented 6 years ago

@t-oster O_DIRECT is this dirty hack that database guys introduced to bypass any caches for read/writes in the operating system. Reason for that is that it would result in double caching: both the operating system and the database cache data. However databases have in general a better idea what data needs to be cached. In the worst case your caches would be only half as efficient. In practice however it is not that bad: If the system is under memory pressure it will drop the page cache, while the database continues to keep its own. That means that your operating system will waste some CPU cycles on maintaining unnecessary page and write back caches.

Mic92 commented 6 years ago

Also read this thread on the linux developer kernel mailing list: https://lkml.org/lkml/2007/1/10/231

t-oster commented 6 years ago

so I would use zfs set primaraycache=metadata and be happy without concerns about other drawbacks than maybe performance?

Mic92 commented 6 years ago

In regards to O_DIRECT I don't see other drawbacks. I have not tested mssql intensively to see what it does elsewhere.

t-oster commented 6 years ago

Thank you @Mic92 it seems to work very good, maybe I will switch our production system from ext4/zvol directly to zfs (after some tests). If anyone else needs this, here you are: https://github.com/t-oster/mssql-docker-zfs

Spongman commented 6 years ago

i just ran into this today. IMO this is more than an enhancement - zfs is the default storage driver for docker on systems that support it.

Smithx10 commented 6 years ago

Is mssql going to implement something similar to the innodb_flush_method?

It allows for the following types of methods to flush writes to disk: fsync O_DSYNC littlesync nosync O_DIRECT O_DIRECT_NO_FSYNC

Is there any news on this?

twright-msft commented 6 years ago

@Smithx10 - We don’t currently have any plans to support additional/configurable flush methods. Do you think we need to?

Smithx10 commented 6 years ago

@twright-msft , Is there a plan to support ZFS and other operating systems without forcing them to hack around the real problem using LD_PRELOAD?

To be honest, it would be awesome if there was native FreeBSD and Illumos support. My experience would tell me that... this probably will never happen, but hey... It's a New Microsoft right?

:(

DirectIO via the O_DIRECT flag was originally introduced in XFS by IRIX for database workloads. Its purpose was to allow the database to bypass the page and buffer caches to prevent unnecessary IO operations (e.g. readahead) while preventing contention for system memory between the database and kernel caches.

Unfortunately, the semantics were never defined in any standard. The semantics of O_DIRECT in XFS in Linux are as follows:

  1. O_DIRECT requires IOs be aligned to backing device's sector size.
  2. O_DIRECT performs unbuffered IO operations between user memory and block device (DMA when the block device is physical hardware).
  3. O_DIRECT implies O_DSYNC.
  4. O_DIRECT disables any locking that would serialize IO operations.

The first is not possible in ZFS beause there is no backing device in the general case.

The second is not possible in ZFS in the presence of compression because that prevents us from doing DMA from user pages. If we relax the requirement in the case of compression, we encunter another hurdle. In specific, avoiding the userland to kernel copy risks other userland threads modifying buffers during compression and checksum computations. For compressed data, this would cause undefined behavior while for checksums, this would imply we write incorrect checksums to disk. It would be possible to avoid those issues if we modify the page tables to make any changes by userland to memory trigger page faults and perform CoW operations. However, it is unclear if it is wise for a filesystem driver to do this.

The third is doable, but we would need to make ZIL perform indirect logging to avoid writing the data twice.

The fourth is already done for all IO in ZFS.

Other Linux filesystems such as ext4 do not follow #3. Mac OS X does not implement O_DIRECT, but it does implement F_NOCACHE, which is similiar to #2 in that it prevents new data from being cached. AIX relaxes #3 by only committing the file data to disk. Metadata updates required should the operations make the file larger are asynchronous unless O_DSYNC is specified.

On Solaris and Illumos, there is a library function called directio(3C) that allows userspace to provide a hint to the filesystem that DirectIO is useful, but the filesystem is free to ignore it. The semantics are also entirely a filesystem decision. Those that do not implement it return ENOTTY.

Given the lack of standardization and ZFS' heritage, one solution to provide compatibility with userland processes that expect DirectIO is to treat DirectIO as a hint that we ignore. This can be done trivially by implementing a shim that maps aops->direct_IO to AIO. There is also already code in ZoL for bypassing the page cache when O_DIRECT is specified, but it has been inert until now.

If it turns out that it is acceptable for a filesystem driver to interact with the page tables, the scatter-gather list work will need be finished and we would need to utilize the page tables to make operations on the userland pages safe.

References: http://xfs.org/docs/xfsdocs-xml-dev/XFS_User_Guide/tmp/en-US/html/ch02s09.html https://blogs.oracle.com/roch/entry/zfs_and_directio https://ext4.wiki.kernel.org/index.php/Clarifying_Direct_IO's_Semantics https://illumos.org/man/3c/directio https://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man2/fcntl.2.html https://lists.apple.com/archives/filesystem-dev/2007/Sep/msg00010.html

:(

Using O_DIRECT well requires taking on a surprising amount of additional resource management responsibility. Also, the implementation is not portable and, combined with the additional responsibility previously mentioned, adds a ton of conditional compilation. The alignment requirements are not an issue at all; anyone using O_DIRECT is also going to be used properly aligned I/O buffers in any case.O_DIRECT can have a lot of benefits. However, while O_DIRECT may be simple to use by itself, the other design and implementation requirements that are indirectly dragged along are not trivial by any means. Consequently, you should not be using O_DIRECT unless you can manage the substantial indirect overhead in implementation. It turns off a lot of OS features most developers take for granted. | Using O_DIRECT well requires taking on a surprising amount of additional resource management responsibility. Also, the implementation is not portable and, combined with the additional responsibility previously mentioned, adds a ton of conditional compilation. The alignment requirements are not an issue at all; anyone using O_DIRECT is also going to be used properly aligned I/O buffers in any case.O_DIRECT can have a lot of benefits. However, while O_DIRECT may be simple to use by itself, the other design and implementation requirements that are indirectly dragged along are not trivial by any means. Consequently, you should not be using O_DIRECT unless you can manage the substantial indirect overhead in implementation. It turns off a lot of OS features most developers take for granted. Using O_DIRECT well requires taking on a surprising amount of additional resource management responsibility. Also, the implementation is not portable and, combined with the additional responsibility previously mentioned, adds a ton of conditional compilation. The alignment requirements are not an issue at all; anyone using O_DIRECT is also going to be used properly aligned I/O buffers in any case.O_DIRECT can have a lot of benefits. However, while O_DIRECT may be simple to use by itself, the other design and implementation requirements that are indirectly dragged along are not trivial by any means. Consequently, you should not be using O_DIRECT unless you can manage the substantial indirect overhead in implementation. It turns off a lot of OS features most developers take for granted.  

:(

  1. Not that it's particularly bad for PostgreSQL, but as a point of possible future interest, I found out that the Linux kernel does not (and most likely will not in the future) recommend mixing O_DIRECT and non-O_DIRECT I/O on a single file. Doing this is expected to result in very poor performance, because the use of O_DIRECT causes page cache invalidations. I haven't been able to think of a scenario in which this will actually hurt PostgreSQL users today, because we only use O_DIRECT for WAL, and then only if wal_sync_method = open_sync or open_datasync, and then only neither archiving nor streaming replication is in use. However, it's certainly worth keeping in mind if we ever consider expanding the use of O_DIRECT.

:(

Richrd Yao says: July 7, 2013 at 8:56 am Vadim, O_DIRECT was designed for in-place filesystems to allow IO to bypass the filesystem layer and caching. A literal implementation of O_DIRECT in a copy-on-write filesystem like ZFS is not possible (because checksum and parity calculations must be done). It is possible to implement it by effectively ignoring the O_DIRECT flag, but I imagine that would defeat the purpose. I imagine is the main reason the solution where O_DIRECT is ignored has not been implemented is that Linux uses a different code path for O_DIRECT and time spent implementing the separate code path is time that could be spent on other bugs.

Most of the development of ZFSOnLinux over the past two years has focused on making it ready for the first stable release. Adding O_DIRECT support did not help contribute to that, so it was a low priority. I imagine that adding O_DIRECT support would occur rather quickly if someone were to write a patch to add it that works. However, adding it would be misleading unless O_DIRECT is implemented in a way that provided some kind of tangible benefit over not using it.

Lastly, ZFSOnLinux development is done by a few professional developers at LLNL and volunteers, such as myself, that happen to use it. LLNL uses ZFSOnLinux as an OSD for the Lustre filesystem on the Sequoia suprecomputer while volunteers tend to use it on either servers or desktops. O_DIRECT is currently scheduled for a release rather far in the future because none of us have any need for O_DIRECT. It should be possible to configure your software to not use O_DIRECT, so doing it sooner does not seem like it should be a priority.