TritonDataCenter / smartos-live

For more information, please see http://smartos.org/ For any questions that aren't answered there, please join the SmartOS discussion list: https://smartos.topicbox.com/groups/smartos-discuss

Samba3/AD in a zone - PKCS 11 problems #383

Open stateless opened 9 years ago

stateless commented 9 years ago

See previous email at https://www.mail-archive.com/smartos-discuss@lists.smartos.org/msg00711.html.

Tried with the latest 14.3.0 base64 image:

[root@joyenttest /etc/krb5]# cat /etc/release | grep joy
                   See joyent_20141002T182809Z for assembly date and time.
[Connection to zone 'edc472ac-ab64-4a21-adac-9080b3f0ce25' pts/2 closed]
[root@node7 /usbkey/vmcfg]# imgadm list
UUID                                  NAME         VERSION    OS       PUBLISHED
b7493690-f019-4612-958b-bab5f844283e  lx-ubuntu    14.04.002  other    2014-07-23T12:00:59Z
14a960b0-614e-11e4-a095-eb789315ae39  lx-ubuntu64  14.04.003  other    2014-10-31T12:00:00Z
62f148f8-6e84-11e4-82c5-efca60348b9f  base64       14.3.0     smartos  2014-11-17T18:06:00Z

Same configuration as shown in the email, and same error:

[root@joyenttest /etc/krb5]# kinit Administrator@CORP.KPAC.CO.NZ
Password for Administrator@CORP.KPAC.CO.NZ:
kinit(v5):  no ktkt_warnd warning possible
[root@joyenttest /etc/krb5]# net ads join -U Administrator
Enter Administrator's password:
kinit succeeded but ads_sasl_spnego_krb5_bind failed: Error in the PKCS 11 library calls
Failed to join domain: failed to connect to AD: Error in the PKCS 11 library calls
[root@joyenttest /etc/krb5]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: Administrator@CORP.KPAC.CO.NZ

Valid starting                Expires                Service principal
01/07/15 20:35:38  01/08/15 06:35:56  krbtgt/CORP.KPAC.CO.NZ@CORP.KPAC.CO.NZ
        renew until 01/14/15 20:35:38
[root@joyenttest ~]# winbindd -i -d 1
winbindd version 3.6.24 started.
Copyright Andrew Tridgell and the Samba Team 1992-2011
WARNING: The "idmap uid" option is deprecated
WARNING: The "idmap gid" option is deprecated
WARNING: The "idmap uid" option is deprecated
WARNING: The "idmap gid" option is deprecated
initialize_winbindd_cache: clearing cache and re-creating with version number 2
tdbsam_open: Converting version 0.0 database to version 4.0.
tdbsam_convert_backup: updated /opt/local/etc/samba/private/passdb.tdb file.
account_policy_get: tdb_fetch_uint32 failed for type 1 (min password length), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 2 (password history), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 3 (user must logon to change password), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 4 (maximum password age), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 5 (minimum password age), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 6 (lockout duration), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 7 (reset count minutes), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 8 (bad lockout attempt), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 9 (disconnect time), returning 0
account_policy_get: tdb_fetch_uint32 failed for type 10 (refuse machine password change), returning 0
ads_krb5_mk_req: krb5_mk_req_extended failed (Error in the PKCS 11 library calls)
cli_session_setup_kerberos: spnego_gen_krb5_negTokenInit failed: Error in the PKCS 11 library calls

^CGot sig[2] terminate (is_parent=1)
Got sig[2] terminate (is_parent=0)

Note: 3.6 is EOL. See https://download.samba.org/pub/samba/rc/WHATSNEW-4.2.0rc3.txt. It may be more useful to get Samba 4 working instead.

nwns commented 9 years ago

In the direction of getting Samba4 working on smartos, I tried the 4.2 rc4 that was announced last week.

[root@08-00-27-ad-44-ed /scratch]# cat samba.js
{
 "brand": "joyent",
 "ram": 512,
 "image_uuid": "62f148f8-6e84-11e4-82c5-efca60348b9f",
 "delegate_dataset": true,
 "quota": 100,
 "kernel_version": "2.6.31",
 "autoboot": false,
 "nics": [
   {
     "ip": "192.168.1.22",
     "netmask": "255.255.255.0",
     "gateway": "192.168.1.254",
     "nic_tag": "admin",
     "primary": true
   }
 ],
 "resolvers": [ "8.8.8.8", "8.8.4.4" ]

}
[root@08-00-27-ad-44-ed /scratch]# vmadm create -f samba.js
Successfully created VM d65e31f6-00a4-468b-9ad4-2463ed7dda31
[root@08-00-27-ad-44-ed /scratch]# vmadm start d65e31f6-00a4-468b-9ad4-2463ed7dda31
Successfully started VM d65e31f6-00a4-468b-9ad4-2463ed7dda31
[root@d65e31f6-00a4-468b-9ad4-2463ed7dda31 ~]# pkgin install -y build-essential gnutls pkg-config gdb popt mit-krb5 cups docbook-xsl fam py27-expat
...
pkg_install warnings: 0, errors: 0
reading local summary...
processing local summary...
updating database: 100%
marking build-essential-1.1 as non auto-removable
marking gnutls-3.2.17 as non auto-removable
marking pkg-config-0.28 as non auto-removable
marking gdb-7.6.1 as non auto-removable
marking popt-1.16nb1 as non auto-removable
marking mit-krb5-1.10.7nb3 as non auto-removable
marking cups-1.7.5 as non auto-removable
marking docbook-xsl-1.77.1nb2 as non auto-removable
marking fam-2.7.0nb9 as non auto-removable
marking py27-expat-2.7.8 as non auto-removable
[root@d65e31f6-00a4-468b-9ad4-2463ed7dda31 ~/samba-4.2.0rc4]# zfs set mountpoint=/data zones/$(zonename)/data
[root@d65e31f6-00a4-468b-9ad4-2463ed7dda31 ~]# wget https://download.samba.org/pub/samba/rc/samba-4.2.0rc4.tar.gz
converted 'https://download.samba.org/pub/samba/rc/samba-4.2.0rc4.tar.gz' (646) -> 'https://download.samba.org/pub/samba/rc/samba-4.2.0rc4.tar.gz' (UTF-8)
--2015-01-25 02:32:55--  https://download.samba.org/pub/samba/rc/samba-4.2.0rc4.tar.gz
Resolving download.samba.org (download.samba.org)... 216.83.154.106, 2001:470:1f05:1a07::1
Connecting to download.samba.org (download.samba.org)|216.83.154.106|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 20655269 (20M) [application/x-gzip]
Saving to: 'samba-4.2.0rc4.tar.gz'

samba-4.2.0rc4.tar. 100%[=====================>]  19.70M  1.95MB/s   in 9.2s

2015-01-25 02:33:05 (2.15 MB/s) - 'samba-4.2.0rc4.tar.gz' saved [20655269/20655269]

[root@d65e31f6-00a4-468b-9ad4-2463ed7dda31 ~]# tar -xf samba-4.2.0rc4.tar.gz
[root@d65e31f6-00a4-468b-9ad4-2463ed7dda31 ~]# pushd samba-4.2.0rc4
~/samba-4.2.0rc4 ~
[root@d65e31f6-00a4-468b-9ad4-2463ed7dda31 ~/samba-4.2.0rc4]# ./configure --with-ads --with-acl-support --enable-fhs --prefix=/usr --sysconfdir=/etc --localstatedir=/data/state >configure.log 2>&1
[root@d65e31f6-00a4-468b-9ad4-2463ed7dda31 ~/samba-4.2.0rc4]# echo $?
0

Two files need to be patched to get to the point where I am stuck:

--- ../orig/samba-4.2.0rc4/lib/texpect/texpect.c        2014-10-01 09:17:32.000000000 +0000
+++ lib/texpect/texpect.c       2015-01-25 02:47:54.666545647 +0000
@@ -63,6 +63,8 @@
 #include <errno.h>
 #include <err.h>

+#include <signal.h>
+
 struct command {
        enum { CMD_EXPECT = 0, CMD_SEND, CMD_PASSWORD } type;
        unsigned int lineno;
--- ../orig/samba-4.2.0rc4/source3/lib/unix_msg/unix_msg.c      2014-10-01 09:17:32.000000000 +0000
+++ source3/lib/unix_msg/unix_msg.c     2015-01-25 02:59:29.958547508 +0000
@@ -505,7 +505,9 @@
         * Note: No need to check for overflow here,
         * since cmsg will store <= INT8_MAX fds.
         */
+#ifdef HAVE_STRUCT_MSGHDR_MSG_CONTROL
        msglen += cmsg_space;
+#endif

        data_len = iov_buflen(iov, iovlen);
        if (data_len == -1) {
@@ -593,7 +595,9 @@
        return 0;

 fail:
+#ifdef HAVE_STRUCT_MSGHDR_MSG_CONTROL
        close_fd_array(fds_copy, num_fds);
+#endif
        return ret;
 }
[root@d65e31f6-00a4-468b-9ad4-2463ed7dda31 ~/samba-4.2.0rc4]# ./buildtools/bin/waf build >build.log 2>&1
[root@d65e31f6-00a4-468b-9ad4-2463ed7dda31 ~/samba-4.2.0rc4]# echo $?
1

The build fails with:

[2006/3812] Compiling lib/nss_wrapper/nss_wrapper.c
../lib/nss_wrapper/nss_wrapper.c:2405:5: error: conflicting types for 'gethostbyname_r'
In file included from ../lib/nss_wrapper/nss_wrapper.c:65:0:
/usr/include/netdb.h:238:17: note: previous declaration of 'gethostbyname_r' was here
../lib/nss_wrapper/nss_wrapper.c: In function 'nwrap_module_getpwnam':

Since this seems to be a known issue with Samba4, I will be taking this up with the Samba team on their bug tracker. See: http://lists.samba.org/archive/samba-technical/2014-October/102829.html

nwns commented 9 years ago

I forgot to mention, I have tried the 4.2 RC4 in the lx-ubuntu64 zone and it seems to build fine, but lx doesn't seem to have delegated datasets working, in addition to lx not being stable yet.

davefinster commented 9 years ago

FYI - there appears to be some activity regarding the nss_wrapper bug here:

https://bugzilla.samba.org/show_bug.cgi?id=10850

I attempted to use the patch, but it only made the error more problematic:

[2005/3811] Compiling lib/nss_wrapper/nss_wrapper.c
../lib/nss_wrapper/nss_wrapper.c: In function 'gethostbyname_r':
../lib/nss_wrapper/nss_wrapper.c:2439:3: warning: return makes pointer from integer without a cast [enabled by default]
../lib/nss_wrapper/nss_wrapper.c:2447:2: warning: return makes pointer from integer without a cast [enabled by default]
../lib/nss_wrapper/nss_wrapper.c: At top level:
../lib/nss_wrapper/nss_wrapper.c:3084:5: error: conflicting types for 'getpwnam_r'
In file included from ../lib/nss_wrapper/nss_wrapper.c:62:0:
/usr/include/pwd.h:131:12: note: previous declaration of 'getpwnam_r' was here
../lib/nss_wrapper/nss_wrapper.c:3150:5: error: conflicting types for 'getpwuid_r'
In file included from ../lib/nss_wrapper/nss_wrapper.c:62:0:
/usr/include/pwd.h:130:12: note: previous declaration of 'getpwuid_r' was here
../lib/nss_wrapper/nss_wrapper.c:3373:5: error: conflicting types for 'getgrnam_r'
In file included from ../lib/nss_wrapper/nss_wrapper.c:63:0:
/usr/include/grp.h:118:12: note: previous declaration of 'getgrnam_r' was here
../lib/nss_wrapper/nss_wrapper.c:3444:5: error: conflicting types for 'getgrgid_r'
In file included from ../lib/nss_wrapper/nss_wrapper.c:63:0:
/usr/include/grp.h:117:12: note: previous declaration of 'getgrgid_r' was here
../lib/nss_wrapper/nss_wrapper.c:3559:5: error: conflicting types for 'getgrent_r'
In file included from ../lib/nss_wrapper/nss_wrapper.c:63:0:
/usr/include/grp.h:59:22: note: previous declaration of 'getgrent_r' was here
davefinster commented 9 years ago

I got Samba 4.2rc4 to compile.

#define HAVE_SOLARIS_GETPWNAM_R
#define HAVE_SOLARIS_GETPWUID_R
#define HAVE_SOLARIS_GETGRENT_R
#define HAVE_SOLARIS_GETGRNAM_R
#define HAVE_SOLARIS_GETGRGID_R
#define HAVE_SOLARIS_GETPWENT_R
        msg.msg_namelen = 0;
#ifdef  HAVE_STRUCT_MSGHDR_MSG_CONTROL
        msg.msg_flags = 0;
#endif
        iov[0].iov_base = (void *)ptr;

After that, compilation finished.

rmustacc commented 9 years ago

Regarding the MSGHDR_MSG_CONTROL bit, it seems the problem here is that the right standards settings are not visible. So rather than putting the code under the #ifdef, we should probably make sure we're compiling in the right POSIX environment. Maybe @jperkin or @mamash have more insight on that part.

davefinster commented 9 years ago

As an addition, after discussions on the mailing list it appears that running Samba 3.6.18 in a zone of version 13.3.0 (2013Q3 packages, UUID: 87b9f4ac-5385-11e3-a304-fb868b82fe10) works for joining an AD domain.

Attempting to build Samba 3.6.18 on a version 14.3.0 zone (2014Q3 packages, UUID: 62f148f8-6e84-11e4-82c5-efca60348b9f) resulted in the same error.

There have been mentions on other platforms that this may be the result of a change in the Kerberos library, but considering the difference between the two platforms is 1.10.6 versus 1.10.7, that probably isn't the case here. I'm thinking either the build scripts in pkgsrc have changed, a different version of another dependency is involved, or something else in the image that got updated is having an effect.

davefinster commented 9 years ago

I think I've narrowed the problem down somewhat. It appears related to the 14.3 version of Kerberos installed in /usr/lib. According to truss, 'net ads join ....' attempts to load a file called libkrb5.so.1

This file doesn't exist in /opt/local/lib in the image template, which is the first place searched, so it will typically be pulled from /usr/lib instead. I compiled mit-krb5-1.10.6 manually and installed it into /opt/local and symlink'd /opt/local/lib/libkrb5.so -> /opt/local/lib/libkrb5.so.1. Once I did that, I was able to successfully join an AD domain without any errors.

There is still some experimenting to do with the version of Kerberos that ships with the image (1.10.7) and to test other functions (as I noticed it loads libgss in the same manner and probably others), but at least it's something to go on.
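The symlink workaround described above could be scripted roughly as follows. This is only a sketch: the function name link_krb5_soname is made up for illustration, and it assumes a pkgsrc-built libkrb5.so already exists in the directory passed as the argument (/opt/local/lib in the comment above).

```shell
#!/bin/sh
# Hypothetical helper (not part of pkgsrc or SmartOS): make the name
# libkrb5.so.1 resolve inside LIBDIR, so that a lookup in that directory
# succeeds before the search falls back to /usr/lib.
link_krb5_soname() {
    libdir=$1
    if [ ! -e "$libdir/libkrb5.so" ]; then
        echo "no libkrb5.so in $libdir" >&2
        return 1
    fi
    # Leave an existing file or link alone rather than overwrite it.
    if [ -e "$libdir/libkrb5.so.1" ] || [ -L "$libdir/libkrb5.so.1" ]; then
        return 0
    fi
    # Relative link: libkrb5.so.1 points at libkrb5.so in the same directory.
    ln -s libkrb5.so "$libdir/libkrb5.so.1"
}
```

On the zone this would be called as link_krb5_soname /opt/local/lib. Re-running 'net ads join' under truss afterwards should show which libkrb5.so.1 actually gets opened.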

davefinster commented 9 years ago

Created a completely new 14.3.0 SmartMachine zone, installed

mit-krb5-1.10.7nb3 samba-3.6.24

unaltered from pkgsrc. Set up the config files/host/resolv.conf entries as per normal and then performed the symlink as mentioned above and everything works.

In the mailing list entries, there is also mention of an error in copying libnss_winbind.so into /usr/lib, which is understandable given it's a read-only file system. Anything that would otherwise be using nsswitch.conf to determine usernames/groups is going to have problems as a result. I got around this by copying the 32- and 64-bit .so files into /usr/local/lib and /usr/local/lib/64 (which doesn't exist normally). I then updated the dynamic loader search paths with:

crle -c /var/ld/ld.config -l /lib:/usr/lib:/usr/local/lib -s /lib/secure:/usr/lib/secure
crle -64 -c /var/ld/64/ld.config -l /lib/64:/usr/lib/64:/usr/local/lib/64 -s /lib/secure/64:/usr/lib/secure/64

This allows programs like getent (which is 32-bit) and id (64-bit) to locate the required library. The only other thing I had to do was build an SMF manifest for winbindd (as it doesn't come with one from pkgsrc).
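One detail worth spelling out: crle's -l option replaces the default search path rather than appending to it, which is why the commands above repeat /lib:/usr/lib. A minimal sketch of building that path string safely is below. The helper name add_libdir is hypothetical and exists only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: print PATHLIST with DIR appended, unless DIR is
# already one of its colon-separated components.
add_libdir() {
    path=$1
    dir=$2
    case ":$path:" in
        *":$dir:"*) printf '%s\n' "$path" ;;
        *)          printf '%s\n' "$path:$dir" ;;
    esac
}

# Possible use on the zone (not executed here):
#   crle -c /var/ld/ld.config \
#        -l "$(add_libdir /lib:/usr/lib /usr/local/lib)" \
#        -s /lib/secure:/usr/lib/secure
```

Running crle (or crle -64) with no options afterwards prints the configuration that is in effect, which is a quick way to confirm the new directory was recorded.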

Seems there is something off about the kerberos libraries in /usr/lib. Not entirely sure what though...

jperkin commented 9 years ago

Yeh, we changed the default mit-krb5 to use the platform version between those releases. As that appears to be causing problems I'll change it back for 2014Q4 to use mit-krb5 from pkgsrc, and will have some packages for you to test shortly.

Could you submit your SMF manifests so we can include them too? Thanks.

davefinster commented 9 years ago

@jperkin I've opened pull request 243 on the joyent/pkgsrc repo with the changes I've made + updated SMF. Happy to test out newly built packages.

stateless commented 9 years ago

Dave, Thanks for doing all this debug work. :D

xmerlin commented 9 years ago

@davefinster could you give me more details about the crle "hack"? I've done the same, but getent and id don't work as expected.

I've also tried

crle -c /var/ld/ld.config -l /lib:/usr/lib:/usr/local/lib:/opt/local/lib/ -s /lib/secure:/usr/lib/secure:/opt/local/lib/security

and

crle -64 -c /var/ld/64/ld.config -l /lib/64:/usr/lib/64:/usr/local/lib/64:/opt/local/lib -s /lib/secure/64:/usr/lib/secure/64:/opt/local/lib/security

without success.

FYI: wbinfo is OK.

davefinster commented 9 years ago

@xmerlin This can also depend on your winbind settings - I've got UNIX extensions installed in my AD and have manually set a UID/GID for certain users within AD. That results in a 1:1 mapping once winbind gets involved. wbinfo for me, despite only having a few users set up, shows everyone in AD, whereas getent and id only work for those with a UID/GID.

I might not have made it very clear in my original description, but I had to unpack the 32-bit version of libnss_winbind.so from the 32-bit package in pkgsrc manually and place that in the search path for the 32-bit dynamic loader (that I altered with crle). I also used the 64-bit libnss_winbind.so where required.

The problem is that getent is 32-bit whereas id is 64-bit. I also notice that, on occasion, calling getent passwd results in the command waiting (seemingly for data from winbind) and sometimes timing out. But if you run it again immediately after, the users show up. id, on the other hand, appears to work all the time.

Maybe it's worth running getent passwd through truss to see if the right libnss_winbind.so is getting used? That's how I discovered what's going on.
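A rough sketch of that truss check is below. The filter name nss_loaded is made up for this example, and it assumes truss's usual output, where failed system calls are marked with "Err#".

```shell
#!/bin/sh
# Hypothetical filter: from truss output on stdin, keep only lines that
# mention an nss_*.so module and that did not fail (no "Err#" marker).
nss_loaded() {
    grep 'nss_[a-z_]*\.so' | grep -v 'Err#'
}

# Possible use on the zone (not executed here):
#   truss -f getent passwd 2>&1 | nss_loaded
```

If no line mentioning the winbind module survives the filter, the loader either never looked for it or could not open it. Dropping the second grep shows the failed attempts and the paths that were tried.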

The other thing I noticed was that SmartOS did not like it when I didn't manually set a different separator for winbind and it defaulted to the AD-style backslash (\). No users came through.

My smb.conf

[global]
        workgroup = DOMAIN
        realm = DOMAIN.LOCAL
        server string = Samba %v (%h)
        interfaces = net*, lo
        bind interfaces only = Yes
        security = ADS
        password server = the-pdc.domain.local
        map untrusted to domain = yes
        log file = /var/log/log.%m
        load printers = no
        domain master = no
        winbind enum users = yes
        winbind enum groups = yes
        winbind separator = +
        winbind nss info = rfc2307
        idmap config * : backend = tdb
        idmap config * : range = 1001-2000
        idmap config DOMAIN : backend = ad
        idmap config DOMAIN : range = 10000-20000
        idmap config DOMAIN : schema_mode = rfc2307
        map acl inherit = yes
        winbind nested groups = yes
        inherit acls = yes
        acl group control = yes
gzartman commented 9 years ago

Jperkins,

Where might we pull updated packages? I'm keen on testing as well.

Thanks,

Greg

xmerlin commented 9 years ago

@davefinster I've unpacked the 32-bit version of libnss_winbind.so and put it in /opt/local/lib/, but getent doesn't give me the correct result.

I've used truss to print the libraries loaded by getent, and there is no trace of libnss_winbind.

In nsswitch.conf I have:

passwd: files winbind
group: files winbind

I've also tried ad instead of winbind, with the same result.

thanks

davefinster commented 9 years ago

@xmerlin Ah, I forgot to mention that the symlink/copied file had to be renamed to nss_winbind.so.1 for it to be picked up. That is how the other databases are named (such as nss_files.so.1).

My bad!
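Putting the naming rule together with the earlier copy step, the install could be sketched like this. The function name install_nss_winbind and the source paths in the usage comment are hypothetical. The destination directories are the ones mentioned earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: copy a libnss_winbind.so into DESTDIR under the
# name the name service switch looks for, nss_winbind.so.1.
install_nss_winbind() {
    src=$1
    destdir=$2
    if [ ! -f "$src" ]; then
        echo "missing $src" >&2
        return 1
    fi
    mkdir -p "$destdir" && cp "$src" "$destdir/nss_winbind.so.1"
}

# Possible use on the zone (source paths are placeholders):
#   install_nss_winbind ./32bit/libnss_winbind.so /usr/local/lib
#   install_nss_winbind ./64bit/libnss_winbind.so /usr/local/lib/64
```

Both the 32-bit and 64-bit copies are needed because getent and id differ in word size, as noted above.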

davefinster commented 9 years ago

I think we can safely close this issue - I've just tested it with base-64-lts 14.4.0 and everything works as expected.

As an FYI, I still haven't been able to get 'getent group' to show anything from AD, but 'getent group 10001' (where 10001 is an AD group) does work.

nwns commented 9 years ago

That is good to hear. I need to come back to this.

getent group probably doesn't show the AD groups because winbind does not enumerate groups by default, for performance reasons. I believe the smb.conf option is winbind enum groups.

davefinster commented 9 years ago

@nwns I do actually have that entry in my smb.conf. Full scrubbed file is below:

PS: getent passwd works fine, just getent group doesn't.

[global]
        workgroup = DOMAIN
        realm = DOMAIN.LOCAL
        server string = Samba %v (%h)
        interfaces = net*, lo
        bind interfaces only = Yes
        security = ADS
        password server = some-server.domain.local
        map untrusted to domain = yes
        log file = /var/log/log.%m
        log level = 2
        load printers = no
        domain master = no
        winbind enum users = yes
        winbind enum groups = yes
        winbind nss info = rfc2307
        idmap config * : backend = tdb
        idmap config * : range = 1001-2000
        idmap config SEYMOURWHYTE : backend = ad
        idmap config SEYMOURWHYTE : range = 10000-20000
        idmap config SEYMOURWHYTE : schema_mode = rfc2307
        map acl inherit = yes
        winbind nested groups = yes
        inherit acls = yes
        acl group control = yes

[Media]
        path=/zones/3352fae6-075a-4723-9a09-6f45d78f7e30/data/Media
        read only = no
        writable = yes
        browseable = yes
        valid users = "DOMAIN\domain users"
        write list = "DOMAIN\domain users"
        create mask = 0775
        directory mask = 0775
        file acls = yes
        nt acl support = yes
        inherit acls = yes
        map acl inherit = yes
        store dos attributes = yes
        map archive = no
        map readonly = no
        vfs objects = zfsacl
        posix locking = yes
        strict locking = no
        inherit owner = Yes
        nfs4:mode = special
        nfs4:acedup = merge
        nfs4:chown = yes
nwns commented 9 years ago

I assume you have (re)started the services after it was changed (make sure that winbind actually exits; when I was doing this on Linux it would sometimes take a while).

I'm also assuming the users and at least one group had the rfc2307 attributes added, with IDs within the range 10000-20000, when the service started (which they should, if the lookup by number works).

You will need to try the Samba mailing list if that doesn't work.

davefinster commented 9 years ago

@nwns Certainly have started them. I've done my own SMF for winbindd so it is certainly running.

getent group 10001 does work, so it is resolving. It just seems to be around getting the full list. Permissions/group name resolution work fine as I can use chgrp 10001 and it gets the correct name.

Good idea regarding the Samba mailing list though - I'll drop a line there.

davefinster commented 9 years ago

@nwns My bad - not all our AD groups had UNIX group numbers. 'getent group' works, but explodes off the screen.

nwns commented 9 years ago

Interesting, I would not have expected that all groups need an ID number for the enumeration to work at all.

davefinster commented 9 years ago

@nwns It seemed that whenever I called 'getent group', winbind logs would always contain a single entry about failing to resolve a particular SID. That SID would always be a group and the SID/group order matched the output of 'wbinfo -g'. Sure enough as I added the attributes to the groups the error moved onto the next one.

Once all the groups had numbers, they all came through at once.

Worth noting that getent passwd doesn't exhibit this behaviour despite the occasional similar SID resolution error in winbind.
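For anyone hitting the same enumeration problem, a loop along these lines might help find the AD groups that still lack a UNIX GID. It is a sketch under assumptions: unmapped_groups and gid_lookup are made-up names, and it assumes 'wbinfo --group-info' exits non-zero for a group that winbind cannot map, which has not been verified in this thread.

```shell
#!/bin/sh
# Hypothetical lookup hook: succeeds if winbind can return group info
# (including a GID) for the named group. Assumption: wbinfo fails for
# groups without a usable GID mapping.
gid_lookup() {
    wbinfo --group-info="$1" >/dev/null 2>&1
}

# Read group names on stdin, one per line, and print those for which
# the lookup fails.
unmapped_groups() {
    while IFS= read -r grp; do
        gid_lookup "$grp" || printf '%s\n' "$grp"
    done
}

# Possible use on the zone (not executed here):
#   wbinfo -g | unmapped_groups
```

Keeping the lookup in its own function makes it easy to swap in a different check, for example 'getent group' by name, if wbinfo turns out to behave differently.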