Azure / ACS

Azure Container Service - Bug Tracker + Announcements
65 stars 27 forks source link

SQL Server Linux Container running as StatefulSet shows status as Recovery Pending #54

Open Mahesh-MSFT opened 7 years ago

Mahesh-MSFT commented 7 years ago

Is this a request for help?:

YES

Is this a BUG REPORT or FEATURE REQUEST? (choose one):

BUG REPORT

Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm)

Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.0", GitCommit:"d3ada0119e776222f11ec7945e6d860061339aad", GitTreeState:"clean", BuildDate:"2017-06-29T23:15:59Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"windows/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.6", GitCommit:"7fa1c1756d8bc963f1a389f4a6937dc71f08ada2", GitTreeState:"clean", BuildDate:"2017-06-16T18:21:54Z", GoVersion:"go1.7.6", Compiler:"gc", Platform:"linux/amd64"}

What happened: I created a custom SQL Server Linux container by following process as described below.

  1. Create a standard SQL Sever Container by running docker run -d -p 1433:1433 -e "SA_PASSWORD=<password>" -e "ACCEPT_EULA=Y" microsoft/mssql-server-linux
  2. Connect to this container on 127.0.0.1, 1433 from SQL Server Management Studio and create a custom database by running following script
    
    GO
    IF(db_id(N'snoopyshoppingcart') IS NULL)
    BEGIN
    CREATE DATABASE snoopyshoppingcart 
        ON
            (
                NAME = ssc_dat,
                FILENAME = N'/ssc/ssc.mdf'
            )
        LOG ON  
            ( 
                NAME = ssc_log,  
                FILENAME = N'/ssc/ssc.ldf'
            )
    END;

GO USE snoopyshoppingcart

GO IF NOT EXISTS (SELECT * FROM sysobjects WHERE NAME='shopping' AND XTYPE='U') BEGIN CREATE TABLE shopping ( AddedOn datetime, ConnectionID nvarchar(100), IP nvarchar(20), CartItem nvarchar(100) ) END

Note that database files are created on path `/ssc`. This is a deviation from standard SQL Server files path `/var/opt/mssql/data`. I want to use `/ssc` - custom path to use as mountpoint and mount it on Azure Disk when creating a statefulset for this container.
3. Run `SHUTDOWN WITH NOWAIT`.
4. Stop SQL Server Container.
5. Commit changes by running` docker commit 8f maksh/snoopyshoppingcartdb` and creating a new image.
6. Push this custom image by running `docker push maksh/snoopyshoppingcartdb`

When I create a container from this custom image and run _locally_ (127.0.0.1, 1433), I can see that my custom database is available.

However, when I run this container as a statefulset on Kubernetes in Azure. Database shows status as **Recovery Pending**.
My k8s manifests below.

Secret (For sa password):

apiVersion: v1 kind: Secret metadata: name: sqlsecret type: Opaque data: sapassword: UGFzc3dvcmQxMjM0

Storage Class:

kind: StorageClass apiVersion: storage.k8s.io/v1 metadata: name: azurestorageclass provisioner: kubernetes.io/azure-disk parameters: skuName: Standard_LRS location: southeastasia storageAccount:

Service:

apiVersion: v1 kind: Service metadata: name: sqlservice labels: app: sqlservice spec: type: LoadBalancer ports:

2017-08-14 07:20:48.76 spid24s     Starting up database 'snoopyshoppingcart'.

2017-08-14 07:20:48.77 spid24s     Error: 17204, Severity: 16, State: 1.

2017-08-14 07:20:48.77 spid24s     FCB::Open failed: Could not open file /ssc/ssc.mdf for file number 1.  OS error: 2(The system cannot find the file specified.).

2017-08-14 07:20:48.78 spid9s      The resource database build version is 14.00.800. This is an informational message only. No user action is required.

2017-08-14 07:20:48.79 spid24s     Error: 5120, Severity: 16, State: 101.

2017-08-14 07:20:48.79 spid24s     Unable to open the physical file "/ssc/ssc.mdf". Operating system error 2: "2(The system cannot find the file specified.)".

2017-08-14 07:20:48.81 spid24s     Error: 17207, Severity: 16, State: 1.

2017-08-14 07:20:48.81 spid24s     FileMgr::StartLogFiles: Operating system error 2(The system cannot find the file specified.) occurred while creating or opening file '/ssc/ssc.ldf'. Diagnose and correct the operating system error, and retry the operation.

2017-08-14 07:20:48.81 spid9s      Starting up database 'model'.

2017-08-14 07:20:48.83 spid24s     File activation failure. The physical file name "/ssc/ssc.ldf" may be incorrect.

When I connect with SQL Server using external IP of service, I see my custom database is shown with status Recovery Pending

What you expected to happen: I expect custom SQL Server database should be mounted on Azure Disk and in operational state.

How to reproduce it (as minimally and precisely as possible): Follow steps as mentioned above.

Anything else we need to know: A VHD disk gets created in storage account. This matches persistentvolumeclaim.

twright-msft commented 7 years ago

This is probably the NFS/remote storage problem that is present in RC1 and RC2 images. To confirm can you please pull/run the image with the :ctp2-1 tag? The bug was not present in that release. This issue is fixed in development now and will be part of the next release after RC2.

Mahesh-MSFT commented 7 years ago

Created new image from microsoft/mssql-server-linux:ctp2-1 base image. Same results :-( Database still shows "Recovery Pending" status.

Also tried with following another approach.

Dockerfile:

FROM microsoft/mssql-server-linux:ctp2-1

# Create app directory
RUN mkdir -p /ssc
WORKDIR /ssc

# Bundle app source
COPY . /ssc

# Grant permissions for the import-data script to be executable
RUN chmod +x /ssc/setup.sh

CMD /bin/bash /ssc/entrypoint.sh

Setup.sh:

#wait for the SQL Server to come up
sleep 90s

#run the setup script to create the DB and the schema in the DB
/opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P <my-password> -d master -i setup.sql

Setup.sql:

GO
IF(db_id(N'snoopyshoppingcart') IS NULL)
BEGIN
    CREATE DATABASE snoopyshoppingcart 
        ON
            (
                NAME = ssc_dat,
                FILENAME = N'/ssc/ssc.mdf'
            )
        LOG ON  
            ( 
                NAME = ssc_log,  
                FILENAME = N'/ssc/ssc.ldf'
            )
END;

GO
USE snoopyshoppingcart

GO
IF NOT EXISTS (SELECT * FROM sysobjects WHERE NAME='shopping' AND XTYPE='U')
BEGIN
    CREATE TABLE shopping
    (
        AddedOn datetime,
        ConnectionID nvarchar(100),
        IP nvarchar(20),
        CartItem nvarchar(100)
    )
END

entrypoint.ssh:

#start SQL Server, start the script to create the DB
/opt/mssql/bin/sqlservr & /ssc/setup.sh

Docker logs shows following error.

2017-08-15 05:05:20.46 spid52      Starting up database 'snoopyshoppingcart'.
2017-08-15 05:05:20.55 spid52      Parallel redo is started for database 'snoopyshoppingcart' with worker pool size [1].
2017-08-15 05:05:20.57 spid52      Parallel redo is shutdown for database 'snoopyshoppingcart' with worker pool size [1].
Changed database context to 'snoopyshoppingcart'.

After this error, container becomes unresponsive.

windswordsgd commented 7 years ago

It looks you placed ssc.mdf, ssc.ldf to directory /ssc in the container image. It is working fine in local machine as there is no data disk attached to it. Then when you create a POD with a persistent volume, the volume /dev/sdc is mounted to directory (mountpoint) /ssc and it does not have the file ssc.mdf or ssc.ldf.

For instance, a new mountpoint /mntdavid, when you write files into it, the file “davidtestabcd20170815” is saved on OS disk /dev/sda1.

root@davidubuntu1604lts:/mntdavid# ls -la total 815672 drwxr-xr-x 2 root root 4096 Aug 15 05:03 . drwxr-xr-x 24 root root 4096 May 18 05:11 .. -rw-r--r-- 1 root root 835232768 Aug 15 05:04 davidtestabcd20170815 root@davidubuntu1604lts:/mntdavid# df -k Filesystem 1K-blocks Used Available Use% Mounted on udev 1740984 0 1740984 0% /dev tmpfs 352176 36220 315956 11% /run /dev/sda1 30428648 8283944 22128320 28% / tmpfs 1760860 0 1760860 0% /dev/shm tmpfs 5120 0 5120 0% /run/lock tmpfs 1760860 0 1760860 0% /sys/fs/cgroup tmpfs 352176 0 352176 0% /run/user/0 tmpfs 352176 0 352176 0% /run/user/999 /dev/sdb1 7092664 16120 6693216 1% /mnt tmpfs 352176 0 352176 0% /run/user/1000

If the mountpoint is mounted with a data disk (for instance /dev/sdc1), it will only show the contents of /dev/sdc1 and you will not be able to see the files from /dev/sda1.

root@davidubuntu1604lts:/mntdavid# mount /dev/sdc1 /mntdavid root@davidubuntu1604lts:/mntdavid# ls -la total 116 drwxr-xr-x 23 root root 4096 May 17 23:58 . drwxr-xr-x 24 root root 4096 May 18 05:11 .. drwxr-xr-x 2 root root 4096 May 17 23:54 bin drwxr-xr-x 3 root root 4096 May 17 23:58 boot drwxr-xr-x 5 root root 4096 Jun 27 2016 dev drwxr-xr-x 99 root root 4096 May 18 06:55 etc drwxr-xr-x 3 root root 4096 May 17 11:05 home lrwxrwxrwx 1 root root 32 May 17 23:58 initrd.img -> boot/initrd.img-4.4.0-78-generic lrwxrwxrwx 1 root root 32 Jun 27 2016 initrd.img.old -> boot/initrd.img-4.4.0-28-generic root@davidubuntu1604lts:/mntdavid# ls -la | grep -i davidtestabcd20170815

Is there a way to set a default directory as an environmental value for SQLonLinux instead of placing any data to it?

chzbrgr71 commented 7 years ago

FWIW, I have a SQL Linux container running on ACS and I used the default directory as below.

    volumeMounts:
    - mountPath: /var/opt/mssql
      name: sqldb-home
Mahesh-MSFT commented 7 years ago

@chzbrgr71 can you share your k8s manifests/yaml?

chzbrgr71 commented 7 years ago

I have it posted here: https://github.com/chzbrgr71/sql-guestbook/blob/master/kube-db.yaml

Mahesh-MSFT commented 7 years ago

Thanks @chzbrgr71. I see that you are using base image microsoft/mssql-server-linux to create container and then run a script to create custom database.

In my case, I have a custom database in custom database image.

@twright-msft Do you think Dockerfilefor my custom database is correct?

twright-msft commented 7 years ago

The only thing about putting your database in the image is that it is ephemeral. Any changes you make to the data will be lost if the container is deleted. For example, when you want to upgrade the SQL Server binaries you will need to deploy a new container. That may be fine in some scenarios like for dev/test purposes if that is what you are using it for. Otherwise, you'll want to do something more like what @chzbrgr71 has done.

Mahesh-MSFT commented 7 years ago

Thats why I am using statefulset with persistent volume.

Even if I do not put my database as part of image and try to create it using db creation script at container startup, it is failing.

I am simply trying to run db creation script at container start up - by running entrypoint.ssh -(@chzbrgr71 is running his script manually!).

Is this not valid approach?

chzbrgr71 commented 7 years ago

My planned approach for this is to use a Kubernetes initContainer. This would allow for a .sql script to be run to pre-stage the database and any seed data. Working on a sample here: https://github.com/chzbrgr71/guestbook-web/blob/master/sql-initcontainer/db.yaml

Still just a draft, but I see this working well. Need another initContainer to wait for SQL Server to start as well.

twright-msft commented 7 years ago

If you are trying to use remote storage for this you are hitting the RC1/RC2 bug that prevents DBs from being created on NFS/Samba shares. Only local storage will work for now. This is fixed already and will be in the next release. Until then you can go back to using the :CTP1-4 tagged image which doesnt have this bug.