Open xrow opened 1 year ago
I ran an strace with mcr.microsoft.com/mssql/server:2019-CU4-ubuntu-18.04.
I currently believe it is an SELinux issue. Does mssql only support SELinux being off? Is there a way to fix this?
uname({sysname="Linux", nodename="mssqlinst", ...}) = 0
openat(AT_FDCWD, "/var/opt/mssql/.system//instance_id", O_RDWR|O_CREAT|O_APPEND, 0666) = 14
fcntl(14, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = 0
lstat("/opt", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat("/opt/mssql", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat("/opt/mssql/lib", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
lstat("/opt/mssql/lib/system", 0x7ffeb4258430) = -1 ENOENT (No such file or directory)
gettid() = 640
gettid() = 640
getcpu([3], [0], NULL) = 0
openat(AT_FDCWD, "/dev/urandom", O_RDONLY) = 15
read(15, "sT\36Y\224\240Q\332", 8) = 8
pread64(7, "MZ\220\0\3\0\0\0\4\0\0\0\377\377\0\0\270\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 64, 3043328) = 64
pread64(7, "PE\0\0d\206\t\0\2435}^\0\0\0\0\0\0\0\0\360\0\" ", 24, 3043568) = 24
pread64(7, "\v\2\16\0\0\20\30\0\0\320\24\0\0\0\0\0\200c7\0\0\0 \0\0\0@j\0\0\0\0"..., 240, 3043592) = 240
mmap(0x6aa00000, 20971520, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x6aa00000
mmap(0x6aa00000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0x2e7000) = 0x6aa00000
mmap(0x6ac00000, 1576960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0x2e8000) = 0x6ac00000
mmap(0x6ae00000, 557056, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0x469000) = 0x6ae00000
mmap(0x6b000000, 241664, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0x4f1000) = 0x6b000000
mmap(0x6b200000, 57344, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0x52c000) = 0x6b200000
mmap(0x6b400000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0x53a000) = 0x6b400000
mmap(0x6b600000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0x53b000) = 0x6b600000
mmap(0x6b800000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0x53c000) = 0x6b800000
mmap(0x6ba00000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0x53d000) = 0x6ba00000
mmap(0x6bc00000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0x53e000) = 0x6bc00000
mprotect(0x6aa00000, 4096, PROT_READ) = 0
mprotect(0x6ac00000, 2097152, PROT_READ|PROT_EXEC) = -1 EACCES (Permission denied)
gettid() = 640
gettid() = 640
getcpu([7], [0], NULL) = 0
futex(0x7ff504602008, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0x7ff5024e51a0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
write(2, "/opt/mssql/bin/sqlservr: Unable "..., 71/opt/mssql/bin/sqlservr: Unable to start the process, with error 101.
) = 71
futex(0x7ff4fc3b7948, FUTEX_WAKE_PRIVATE, 2147483647) = 0
fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
exit_group(1) = ?
+++ exited with 1 +++
After turning SELinux off, I was able to start the container. Is there a better way to solve this? I had hoped SELinux would not matter for mssql.
Found what appears to be the same issue.
When SELinux is enforcing, if I run the container with the --privileged flag (but as a non-root user), it fails with that permission error.
podman run --privileged -e ACCEPT_EULA="Y" -e MSSQL_PID="Developer" mcr.microsoft.com/mssql/server:2022-latest
If I run it without the --privileged flag, it runs fine. If I set SELinux to permissive, it runs fine as well, regardless of the --privileged flag.
The SELinux error is as follows (some parts removed for brevity):
SELinux is preventing /opt/mssql/bin/sqlservr from execmod access on the file /opt/mssql/lib/system.sfp. For complete SELinux messages run: sealert -l...
SELinux is preventing /opt/mssql/bin/sqlservr from execmod access on the file /opt/mssql/lib/system.sfp.
***** Plugin restorecon (84.5 confidence) suggests ************************
If you want to fix the label.
/opt/mssql/lib/system.sfp default label should be lib_t.
Then you can run restorecon. The access attempt may have been stopped due to insufficient permissions to access a par...
Do
# /sbin/restorecon -v /opt/mssql/lib/system.sfp
***** Plugin allow_execmod (8.90 confidence) suggests *********************
If this issue occurred during normal system operation.
Then this alert could be a serious issue and your system could be compromised. Setroubleshoot examined '/opt/mssql/li...
Do
If you want to allow selinuxuser to execmod
Then you must tell SELinux about this by enabling the 'selinuxuser_execmod' boolean.
Do
setsebool -P selinuxuser_execmod 1
***** Plugin catchall (1.34 confidence) suggests **************************
If you believe that sqlservr should be allowed execmod access on the system.sfp file by default.
Then you should report this as a bug.
You can generate a local policy module to allow this access.
Do
allow this access for now by executing:
# ausearch -c 'sqlservr' --raw | audit2allow -M my-sqlservr
# semodule -X 300 -i my-sqlservr.pp
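A report like the one above is built from raw AVC records in the audit log. As a rough sketch, and assuming the usual record layout (a `denied  { ... }` permission list followed later by a `path="..."` field), the denied permission and target file can be pulled out with a small filter. The function name `summarize_avc_denials` is made up for this illustration.

```shell
#!/usr/bin/env bash
# Read raw AVC audit records on stdin and print one "<permission> <path>" line
# per distinct denial. Assumes the record contains "denied { perm }" and path="...".
summarize_avc_denials() {
    sed -n 's/.*denied[[:space:]]*{[[:space:]]*\([^}]*[^}[:space:]]\)[[:space:]]*}.*path="\([^"]*\)".*/\1 \2/p' \
        | sort -u
}
```

On a host with the audit tools installed, it could be fed with something like `ausearch -m avc -c sqlservr --raw | summarize_avc_denials`. Records without a `path=` field are skipped by this sketch.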
I enabled the recommended boolean and the problem was fixed: setsebool -P selinuxuser_execmod 1.
The best solution may be relabeling /opt/mssql/lib/system.sfp to lib_t.
@carofe82 Awesome... I am not so good with chcon... Do you have a command I could add to the Containerfile of my mssql image to fix it?
Is it as simple as chcon -R -t lib_t /opt/mssql/lib/system.sfp?
I think selinuxuser_execmod can't be enabled via a Kubernetes pod.
What I suggested won't work. Containers are supposed to get a label at runtime when they run. The privileged ones get a specific unconfined_u type instead at runtime. The files all get the single container_file_t label, with additional MCS suffixes. My experience with SELinux has always been outside containers. I tried to reproduce it at home on RHEL 9, but everything worked fine. I'll keep digging to see what I can find.
Have you tried running the container with privileged: false?
The privileged switch had no effect....
I think this is revealing a security bug in MSSQL.
This is what you found:
mmap(0x6bc00000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 7, 0x53e000) = 0x6bc00000
mprotect(0x6aa00000, 4096, PROT_READ) = 0
mprotect(0x6ac00000, 2097152, PROT_READ|PROT_EXEC) = -1 EACCES (Permission denied)
Notice how the mapped memory is opened for writing, and then it is changed to allow "EXEC". That's exactly what SELinux is preventing because it is insecure:
https://akkadia.org/drepper/selinux-mem.html (see the execmod section)
https://akkadia.org/drepper/textrelocs.html (the text relocation problem)
So it looks like it is something that mssql needs to fix. The safe workaround seems to be to generate a specific policy for this container to allow execmod only on this one. I'll look into it.
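To spot this pattern in a longer trace, a filter along these lines may help. It is only a sketch: `find_denied_exec_mprotect` is a hypothetical helper name, and it assumes plain strace output, optionally with the `[pid N]` prefix that `strace -f` adds.

```shell
#!/usr/bin/env bash
# Read an strace log on stdin and print the mprotect() calls that asked for
# PROT_EXEC and were refused with EACCES.
find_denied_exec_mprotect() {
    grep -E '^(\[pid +[0-9]+\] )?mprotect\(.*PROT_EXEC.*\) = -1 EACCES'
}
```

Run against the trace at the top of this thread, it should leave only the `mprotect(0x6ac00000, 2097152, PROT_READ|PROT_EXEC)` line. The function returns a non-zero status when nothing matches, which is just grep's normal behaviour.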
Related change in Enterprise Linux 9:
https://bugzilla.redhat.com/show_bug.cgi?id=2055822
Default SELinux policy disallows commands with text relocation libraries
The selinuxuser_execmod boolean is now off by default to improve the security footprint of installed systems. As a result, SELinux users cannot enter commands using libraries that require text relocation, unless the library files have the textrel_shlib_t label.
They also changed selinuxuser_execstack to be off by default in the same issue. However, Red Hat reverted selinuxuser_execstack back to on because of issues with VMware: https://bugzilla.redhat.com/show_bug.cgi?id=2064274
But selinuxuser_execmod remained off by default.
Microsoft will need to patch its product to make it work with EL9. Meanwhile, it seems that the fix is to enable the boolean on the host OS where your container runs: setsebool -P selinuxuser_execmod 1
@xrow.
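Before changing anything on the host, it may be worth checking what the boolean is currently set to. The sketch below relies on `getsebool` printing `name --> state`. The helper names `sebool_state` and `check_execmod_boolean` are invented for this example, and the script only reports the state rather than changing it.

```shell
#!/usr/bin/env bash
# Print the current state ("on" or "off") of the SELinux boolean named in $1.
sebool_state() {
    getsebool "$1" 2>/dev/null | awk '{ print $NF }'
}

# Report whether selinuxuser_execmod still needs to be enabled on this host.
check_execmod_boolean() {
    if ! command -v getsebool >/dev/null 2>&1; then
        echo "getsebool not found; SELinux tools are not installed here"
        return 0
    fi
    case "$(sebool_state selinuxuser_execmod)" in
        on)  echo "selinuxuser_execmod is already on" ;;
        off) echo "selinuxuser_execmod is off; as root, run: setsebool -P selinuxuser_execmod 1" ;;
        *)   echo "could not read selinuxuser_execmod (is SELinux enabled?)" ;;
    esac
}

check_execmod_boolean
```

Note that this has to run on the host, not inside the container, since the boolean belongs to the host's SELinux policy.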
Wow, nice find. Do you know how to report this to Microsoft?
Thank you @xrow for bringing this issue to our attention and @carofe82 for the great analysis. We are currently working on support for SQL Server on Linux for RHEL 9. Our packages for SQL Server 2019 and 2022 on RedHat have currently only been certified for RHEL 8.x - https://learn.microsoft.com/sql/linux/sql-server-linux-setup?view=sql-server-ver16#supportedplatforms and not RHEL 9.
We do not have any timelines that we can share at the moment regarding when support for RHEL 9 will be available, however, we are actively working on supporting that platform.
If you believe you have found a security vulnerability that meets Microsoft's definition of a security vulnerability (https://www.microsoft.com/msrc/definition-of-a-security-vulnerability), please report the issue following the FAQ on our Microsoft Security Response Center https://www.microsoft.com/en-us/msrc/faqs-report-an-issue site.
One of the original issues in this thread referred to a container running Ubuntu 18.04 and 20.04. For those, we would encourage you to create an Azure Ideas item at https://aka.ms/sqlfeedback. If this requires technical support for Ubuntu 18.04/20.04 or RHEL 8 based deployments (enabling SELinux, for example), please refer to our https://learn.microsoft.com/sql/sql-server/sql-server-get-help documentation for assistance.
Thanks @thesqlsith. Just to be correct: the host is CentOS 9 and mssql runs in Docker. I think we shall wait until RHEL 9 support is out for SQL Server. Also, it seems we have a workaround. Is it enough for this issue to stay here and be processed by the SQL Server team? It doesn't look like it belongs to Azure or to SQL Server security.
Sure, no problem. However, I would still encourage you to open a feedback item on https://aka.ms/sqlfeedback. We have a category specifically for SQL Server on Linux at that site. The site used to be called Connect several years ago, followed by User Voice. The latest incarnation of it is called Azure Ideas, but that is just the name of the overall feedback system. There are quite a few non-Azure related topics there. Suggestions, bug reports, product requests, etc. can be filed there and the appropriate teams on our side can engage.
Think of it as another way to reach out to Microsoft in conjunction with logging a GitHub issue.
Thank you @xrow for providing this feedback. We have released a RHEL based container image which has SELinux support by way of a new mssql-server-selinux package. Let us know your experience with the preview package - https://techcommunity.microsoft.com/t5/sql-server-blog/sql-server-2022-now-available-for-both-rhel-9-and-ubuntu-22-04/ba-p/3896410
@thesqlsith
Thanks, I am testing right now. I have noticed the documentation needs more love in its install instructions for RHEL 9.
@thesqlsith sorry... invalid feedback... deleted
@thesqlsith I have a wish... https://learn.microsoft.com/en-us/sql/linux/sql-server-linux-configure-environment-variables?view=sql-server-linux-ver16 documents env vars, though they are not available to a custom entrypoint script. Can we introduce ENV vars in the Containerfile definition, like ENV MSSQL_LOG_DIR=/var/opt/mssql/data, so we know where we can write logs?
Also, a WORKDIR is not set. I do not know what a good one would be, but I think anything is better than /, maybe /var/opt/mssql.
I also did set:
ENV PATH=${PATH}:/opt/mssql/bin:/opt/mssql-tools/bin
I was also not able to find the original Containerfile in git.
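Until the image exports such variables itself, a custom entrypoint could fill them in. This is a minimal sketch, not something the image provides: `set_mssql_env_defaults` is a hypothetical helper, and the MSSQL_LOG_DIR default is simply the value proposed above.

```shell
#!/usr/bin/env bash
# Give a custom entrypoint sensible defaults for variables the image does not set.
# Values already present in the environment are left untouched.
set_mssql_env_defaults() {
    # Assumed default, taken from the proposal in this comment.
    : "${MSSQL_LOG_DIR:=/var/opt/mssql/data}"
    export MSSQL_LOG_DIR

    # Append the SQL Server tool directories to PATH only once.
    case ":$PATH:" in
        *:/opt/mssql/bin:*) ;;
        *) PATH="$PATH:/opt/mssql/bin:/opt/mssql-tools/bin" ;;
    esac
    export PATH
}
```

An entrypoint script would call `set_mssql_env_defaults` first and then hand over to the server, for example with `exec /opt/mssql/bin/sqlservr`.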
@thesqlsith
To let you know, I put the image in here as the default: https://artifacthub.io/packages/helm/mssql/mssql
I'm testing the preview image in our environment. It seems to be working fine. I tested it with the selinuxuser_execmod boolean off.
Hello,
I have been testing with the image mssql/server:2022-latest. I was not able to start the image on my rke2 cluster with a CentOS 9 OS. Since my tests with podman had been successful, I thought this was clearly a permission issue. I tried to grant the pod as many permissions as possible, but the error persists.
The log highlights an error "Found less than 2 threads", but this seems unrelated.
I searched the internet, but no one seems to have the same error, and the error "RETAIL ASSERT: Expression=(NT_SUCCESS(status)) File=drtl.cpp Line=1550" doesn't help me. I am stuck.
My pod spec looks like this: