twosigma / Cook

Fair job scheduler on Kubernetes and Mesos for batch workloads and Spark
Apache License 2.0
Kerberos documentation #172

Closed m4ce closed 6 years ago

m4ce commented 8 years ago

Hi,

I'm trying to set up Cook with Kerberos authentication. Is there any way to specify the Kerberos principal and keytab for the Cook scheduler?

Cheers, Matteo

wyegelwel commented 8 years ago

I believe you will need to set the environment variables KRB5_KTNAME (path to the keytab) and KRB5CCNAME (path to the credential cache that holds the principal's ticket) before launching Cook. You will also need to set the key :authorization to {:kerberos true} in the config file.

Please let me know if this works, I want to make sure you are able to get Cook running.

m4ce commented 8 years ago

Hi @wyegelwel,

Sorry for the late reply, but I could only look at this today.

After some fiddling, I still cannot get this to work :(

I generated a headless keytab for the Cook scheduler and exported the environment variables before running Cook. I also set kerberos to true in config.edn.

[cook@test log]# export KRB5_PRINCIPAL="HTTP/test.example.org@EXAMPLE.ORG"
[cook@test log]# export KRB5_KTNAME=/home/cook/http.headless.keytab
[cook@test log]# export KRB5CCNAME=/tmp/cook.krb5_ccache
[cook@test log]# /usr/bin/kinit -k -t $KRB5_KTNAME $KRB5_PRINCIPAL
[cook@test log]# klist
Ticket cache: FILE:/tmp/cook.krb5_ccache
Default principal: HTTP/test.example.org@EXAMPLE.ORG

Valid starting     Expires            Service principal
09/04/16 09:37:50  09/05/16 09:37:50  krbtgt/EXAMPLE.ORG@EXAMPLE.ORG
[cook@test log]# klist -kte /home/cook/http.headless.keytab
Keytab name: FILE:/home/cook/http.headless.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
   3 09/04/16 09:36:42 HTTP/test.example.org@EXAMPLE.ORG (aes256-cts-hmac-sha1-96)
   3 09/04/16 09:36:42 HTTP/test.example.org@EXAMPLE.ORG (aes128-cts-hmac-sha1-96)
   3 09/04/16 09:36:42 HTTP/test.example.org@EXAMPLE.ORG (des3-cbc-sha1)
   3 09/04/16 09:36:42 HTTP/test.example.org@EXAMPLE.ORG (arcfour-hmac)

on the client:

[mace@test]~% klist
Ticket cache: KEYRING:persistent:1088600001:krb_ccache_5m5m6V6
Default principal: mace@EXAMPLE.ORG

Valid starting       Expires              Service principal
09/04/2016 09:47:16  09/05/2016 09:47:16  krbtgt/EXAMPLE.ORG@EXAMPLE.ORG

I submit the job like this:

curl -v --negotiate -u : -H "content-type: application/json" -XPOST http://test.example.org:12321/rawscheduler -d '{"jobs": [{"max_retries": 1, "max_runtime": 86400000, "mem": 1000, "cpus": 1.5, "uuid": "26719da8-394f-44f9-9e6d-8a17500f5112", "command": "id", "container": {"type": "DOCKER", "docker": {"image": "centos:latest", "network": "HOST", "parameters": [{"key": "user", "value": "1013"}]}}}]}'

I get error 500:

* About to connect() to test.example.org port 12321 (#0)
*   Trying 10.0.0.1...
* Connected to test.example.org (10.0.0.1) port 12321 (#0)
> POST /rawscheduler HTTP/1.1
> User-Agent: curl/7.29.0
> Host: test.example.org:12321
> Accept: */*
> content-type: application/json
> Content-Length: 290
>
* upload completely sent off: 290 out of 290 bytes
< HTTP/1.1 401 Unauthorized
< Content-Type: text/html
< WWW-Authenticate: Negotiate
< Cache-Control: max-age=0
< Transfer-Encoding: chunked
< Server: Jetty(9.2.z-SNAPSHOT)
<
* Ignoring the response-body
* Connection #0 to host test.example.org left intact
* Issue another request to this URL: 'http://test.example.org:12321/rawscheduler'
* Found bundle for host test.example.org: 0xe4e010
* Re-using existing connection! (#0) with host test.example.org
* Connected to test.example.org (10.183.18.51) port 12321 (#0)
* Server auth using GSS-Negotiate with user ''
> POST /rawscheduler HTTP/1.1
> Authorization: Negotiate YIICVwYJKoZIhvcSAQICAQBuggJGMIICQqADAgEFoQMCAQ6iBwMFACAAAACjggFbYYIBVzCCAVOgAwIBBaENGwtTUlYuUktSLkhLR6InMCWgAwIBA6EeMBwbBEhUVFAbFGhrZ2RldjAxLnJuZC5ya3IuaGtno4IBEjCCAQ6gAwIBEqEDAgEDooIBAASB/QtX/fSdfClfSfZIEf1BmaRWn+vFIyvFg4NmkQ9vdMn1gN29FGxu6zarzyRTz3+qOP6eKTrgfqH+fz+OYm4GeZ1VAxIjTg4HuPcFm2ykJoLGAsiWQGfu6MCZO2ndppMbndcPKNJuDcqVw7eaLeBOgalNj4Em/juPtTHhHT31b0oOl5TQeSJAhT8ZZJU6jHdDJIE0uaYkcaM12kKuJeF0dc4C+YRELiQnwfl52UfqpjFHqpjYloPdruQbhrxs08pUq1gA69BmUontsEzrTuvkCh3V37SgSSXI29QYdGhjWhlZlvFKbqVxQmnyZ2YC/XIieWZQZ2w+5gXKcZSzLH+kgc0wgcqgAwIBEqKBwgSBvy4Aw/Pk2EvBdHJncY1qnI0vUNEv3u2LoTDjvzrUPibjJUgIh0I5GMxkYNt9ofxksYr5VYB6WpP8u5ZmCjYAtVrvE7tWCtBuJL4UloX8T6rsxbvomi9AZhP1sogYs8TRvl5OsrKSRvumYfbcvCcUzmW6lgG4n0JSo3LKbLBehaLbLdRoTiOEE6tuR/SvWWDerTzHbiDwY05LGjHd0vCkkOtPfjh8ycYFDmUTWUt4KdOZ4B27HT8nW+Xue9WMCcNe
> User-Agent: curl/7.29.0
> Host: test.example.org:12321
> Accept: */*
> content-type: application/json
> Content-Length: 290
>
* upload completely sent off: 290 out of 290 bytes
< HTTP/1.1 500 Server Error
< Content-Type: text/html
< Cache-Control: max-age=0
< Transfer-Encoding: chunked
< Server: Jetty(9.2.z-SNAPSHOT)
<
[Response body: Ring stacktrace HTML page; inline CSS and markup omitted]
org.ietf.jgss.GSSException: No valid credentials provided
[The HTML stack trace lists the same frames as cook's stderr below, from sun.security.jgss.GSSCredentialImpl.<init> (GSSCredentialImpl.java:86) down to java.lang.Thread.run (Thread.java:745)]
* Closing connection 0

cook's access_log:

10.183.18.51 - - [04/Sep/2016:09:48:54 +0000] "POST /rawscheduler HTTP/1.1" 500 - "-" "curl/7.29.0" 7

cook's stderr:

org.ietf.jgss.GSSException: No valid credentials provided
                GSSCredentialImpl.java:86 sun.security.jgss.GSSCredentialImpl.<init>
                GSSCredentialImpl.java:50 sun.security.jgss.GSSCredentialImpl.<init>
                  GSSManagerImpl.java:147 sun.security.jgss.GSSManagerImpl.createCredential
                            spnego.clj:62 cook.spnego/gss-context-init
                            spnego.clj:58 cook.spnego/gss-context-init
                            spnego.clj:81 cook.spnego/require-gss[fn]
                        stacktrace.clj:23 ring.middleware.stacktrace/wrap-stacktrace-log[fn]
                        stacktrace.clj:86 ring.middleware.stacktrace/wrap-stacktrace-web[fn]
                        components.clj:48 cook.components/wrap-no-cache[fn]
                            params.clj:64 ring.middleware.params/wrap-params[fn]
                       components.clj:119 cook.components/health-check-middleware[fn]
                        instrument.clj:54 metrics.ring.instrument/instrument[fn]
                         (Unknown Source) metrics.ring.instrument.proxy$java.lang.Object$Callable$7da976d4.call
                            Timer.java:99 com.codahale.metrics.Timer.time
                        instrument.clj:53 metrics.ring.instrument/instrument[fn]
                            server.clj:70 qbits.jet.server/make-handler[fn]
                         (Unknown Source) qbits.jet.server.proxy$org.eclipse.jetty.server.handler.AbstractHandler$ff19274a.handle
                      HandlerList.java:52 org.eclipse.jetty.server.handler.HandlerList.handle
               HandlerCollection.java:110 org.eclipse.jetty.server.handler.HandlerCollection.handle
                   HandlerWrapper.java:97 org.eclipse.jetty.server.handler.HandlerWrapper.handle
                          Server.java:497 org.eclipse.jetty.server.Server.handle
                     HttpChannel.java:313 org.eclipse.jetty.server.HttpChannel.handle
                  HttpConnection.java:248 org.eclipse.jetty.server.HttpConnection.onFillable
              AbstractConnection.java:540 org.eclipse.jetty.io.AbstractConnection$2.run
                QueuedThreadPool.java:635 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob
                QueuedThreadPool.java:555 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run
                          Thread.java:745 java.lang.Thread.run

Any hints on how to troubleshoot this further on Cook's side?

Thanks! Matteo

m4ce commented 8 years ago

Hi @wyegelwel,

I eventually managed to get it working.

This is what I needed:

1) Move from OpenJDK to Oracle Java

2) Set up JAAS configuration:

com.sun.security.jgss.accept {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
useTicketCache=false
keyTab="/opt/cook/etc/cook.headless.keytab"
principal="HTTP/test.example.org@EXAMPLE.ORG"
debug=true
isInitiator=false;
};

3) Cook's scheduler JVM options:

-Djava.security.krb5.conf=/opt/cook/etc/krb5.conf -Djava.security.auth.login.config=/opt/cook/etc/jaas.conf -Djavax.security.auth.useSubjectCredsOnly=false

4) Set up /opt/cook/etc/krb5.conf:

[libdefaults]
  default_realm = EXAMPLE.ORG
  dns_lookup_realm = false
  dns_lookup_kdc = false
  rnds = false
  ticket_lifetime = 24h
  udp_preference_limit = 0
  forwardable = yes

[realms]
  EXAMPLE.ORG = {
    kdc = kdc.example.org:88
    master_kdc = kdc.example.org:88
    default_domain = example.org

  }

[domain_realm]
  .example.org = EXAMPLE.ORG
  example.org = EXAMPLE.ORG

5) Download the JCE unlimited-strength policy files and drop them under /usr/java/default/lib/security (this resolved an error mentioning encryption type AES256 CTS mode with HMAC SHA1-96)
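
For anyone repeating steps 2 and 3, here is a rough pre-flight sketch rather than anything Cook ships. The function names `keytab_has_principal` and `cook_jvm_opts` are made up for illustration, the keytab check assumes MIT Kerberos `klist -kt`, and the optional `-Dsun.security.krb5.debug=true` flag is an extra JVM debugging switch that is not part of the setup described in this thread.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; paths are the ones from this thread.

# Step 2 check: succeed if the keytab lists an entry for the principal.
# Assumes MIT Kerberos `klist -kt` (the same listing used earlier in the thread).
keytab_has_principal() {
  # -F: fixed-string match, -q: quiet; the exit status carries the answer.
  klist -kt "$1" 2>/dev/null | grep -Fq -- "$2"
}

# Step 3 helper: print the JVM flags for a given config directory.
# Passing "debug" as the second argument appends the JDK Kerberos debug flag.
cook_jvm_opts() {
  etc_dir="${1:-/opt/cook/etc}"
  opts="-Djava.security.krb5.conf=${etc_dir}/krb5.conf"
  opts="$opts -Djava.security.auth.login.config=${etc_dir}/jaas.conf"
  opts="$opts -Djavax.security.auth.useSubjectCredsOnly=false"
  if [ "${2:-}" = "debug" ]; then
    opts="$opts -Dsun.security.krb5.debug=true"
  fi
  printf '%s\n' "$opts"
}

# Run the keytab check only when klist is available on this machine.
if command -v klist >/dev/null 2>&1; then
  if keytab_has_principal /opt/cook/etc/cook.headless.keytab "HTTP/test.example.org@EXAMPLE.ORG"; then
    echo "keytab entry found"
  else
    echo "keytab entry missing or keytab unreadable"
  fi
else
  echo "klist not found; skipping keytab check"
fi

# Print the flags so they can be reviewed before being passed to the JVM.
cook_jvm_opts /opt/cook/etc debug
```

The principal passed to the check has to match the `principal=` line of the JAAS stanza exactly, including the realm. How the printed flags reach the scheduler's JVM depends on how Cook is launched in your environment, so that part is left out.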

Hope this helps other people.
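
To confirm the fix from the client side, something like the sketch below may help. It is only an illustration: `explain_status` and `negotiate_status` are hypothetical helper names, and the hints are my reading of the symptoms in this thread (a final 401 points at the client ticket, a 500 with GSSException at the scheduler's credentials), not documented Cook behaviour.

```shell
#!/bin/sh
# Sketch only: maps the final HTTP status of a SPNEGO request to a next step.
explain_status() {
  case "$1" in
    2??) echo "authenticated" ;;
    401) echo "negotiation failed: check the client ticket with klist" ;;
    500) echo "server error: check the scheduler's keytab, JAAS and JVM options" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Hypothetical helper: issue a negotiated GET and print only the final status code.
negotiate_status() {
  curl -s -o /dev/null -w '%{http_code}' --negotiate -u : "$1"
}

# Left commented out because it needs a valid ticket and a running scheduler:
# explain_status "$(negotiate_status http://test.example.org:12321/rawscheduler)"

# Offline demonstration with the status seen earlier in this thread.
explain_status 500
```

The commented example sends a plain GET, so a status outside the listed cases may simply reflect the request itself rather than authentication.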