Closed ajay25 closed 4 years ago
I'm not sure what the problem is. The stack trace you shared is not related to listing databases. Are there any other errors in the logs?
You mention that all apps are kerberized, but I'm curious: what is the authentication between Knox & Hue? I see from the stack trace that doAs=myuser is passed. However, when Kerberos between Knox & Hue is enabled, the user is passed via Kerberos and not via the doAs URL parameter: https://github.com/cloudera/hue/blob/d7ef03a47850c6626a89a14973c12c760813162d/desktop/core/src/desktop/middleware.py#L666
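To illustrate the distinction, here is a simplified sketch of the two ways the effective user can reach Hue behind Knox. This is not Hue's actual middleware; `FakeRequest` and `effective_user` are hypothetical names used only for the illustration:

```python
# Simplified sketch (NOT Hue's real middleware) of the two ways the
# effective user can arrive when Hue sits behind Knox.

class FakeRequest(object):
    """Stand-in for a Django request carrying META and GET dicts."""
    def __init__(self, meta=None, get=None):
        self.META = meta or {}
        self.GET = get or {}

def effective_user(request):
    # With SPNEGO/Kerberos enabled, the negotiated principal is exposed
    # to the app (e.g. as REMOTE_USER) and the realm is stripped.
    principal = request.META.get('REMOTE_USER')
    if principal:
        return principal.split('@')[0]
    # Without Kerberos, Knox instead appends ?doAs=<user> to the URL.
    return request.GET.get('doAs')

print(effective_user(FakeRequest(meta={'REMOTE_USER': 'myuser@EC2.INTERNAL'})))  # myuser
print(effective_user(FakeRequest(get={'doAs': 'myuser'})))                       # myuser
```

The point of the question above is which of these two paths is actually active between Knox and Hue.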
@jdesjean thanks for looking into this. I have pasted the logs here- https://gist.github.com/ajay25/484f156f984cd29027c13ecbf7f7c369
I'm using "desktop.auth.backend.KnoxSpnegoDjangoBackend" as the auth mechanism between Knox and Hue. As you can see in the gist, "myuser" is able to log in successfully and the HDFS directory is created using the SPNEGO protocol. All subsequent calls are made as "myuser".
A couple of things: 1) From the log you pasted, I see a statement about Impala being disabled; that would be worth looking into. 2) While KnoxSpnegoDjangoBackend is set on the Hue side, I don't believe Kerberos authentication is set on the Knox side. In the log I see:
[18/Dec/2019 12:16:29 -0800] middleware INFO Redirecting to login page: /?doAs=myuser
This indicates that Kerberos authentication failed. Furthermore, doAs is only passed as a parameter when Kerberos is not enabled. Can you confirm this is set:
knox/conf/gateway-site.xml
<configuration>
  ...
  <property>
    <name>gateway.hadoop.kerberos.secured</name>
    <value>true</value>
  </property>
  ...
</configuration>
For Impala, I looked through the code and it seems the warning is just a notification to the user. I even disabled the Impala app and commented out the line where it tries to import the Impala module; still nothing.
I didn't see any errors, but I'll investigate further why the Kerberos auth failed. I'll also do some background reading on doAs and on Kerberos auth in Knox in general, as I don't see anything missing. The following are set in gateway-site.xml:
<property>
  <name>gateway.hadoop.kerberos.secured</name>
  <value>true</value>
  <description>Boolean flag indicating whether the Hadoop cluster protected by Gateway is secured with Kerberos</description>
</property>
<property>
  <name>java.security.krb5.conf</name>
  <value>/etc/krb5.conf</value>
  <description>Absolute path to krb5.conf file</description>
</property>
<property>
  <name>java.security.auth.login.config</name>
  <value>/etc/knox/conf/krb5JAASLogin.conf</value>
  <description>Absolute path to JAAS login config file</description>
</property>
For KnoxSpnegoDjangoBackend, Hue expects Knox to pass the doAs param: https://github.com/cloudera/hue/blob/master/desktop/core/src/desktop/auth/backend.py#L624 I'm not sure what you mean by "doAs is only passed as parameter when kerberos is not enabled".
IMO the authentication is working fine, because Hue prints these:
[18/Dec/2019 12:16:29 -0800] backend INFO Materializing user myuser in the database
[18/Dec/2019 12:16:29 -0800] backend INFO Augmenting users with class: <class 'desktop.auth.backend.DefaultUserAugmentor'>
[18/Dec/2019 12:16:29 -0800] access WARNING 10.0.138.69 myuser - "GET / HTTP/1.1"-- Successful login for user: myuser
What I'm trying to understand is why it fails when doing a POST against jobbrowser.
Did you see the Hue login page?
Nope. From Knox, it redirects to the IdP page, and from there it directly logs me in as "myuser" in Hue.
Almost all of the POST requests to the notebook and jobbrowser APIs are missing the request body. For example, in the above case the jobbrowser API expects format to be set to "json", but since that is missing, it follows a different code path and ultimately throws this error.
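That different code path can be sketched roughly like this (a hypothetical simplification, not the real jobbrowser view; `jobs_view` and the stand-in `WSGIRequest` class are illustration-only names): when the body survives, format=json selects the JSON response, but with the body dropped the view falls through to a path that ends up JSON-encoding a context containing the request object itself, which is the TypeError in the original stack trace.

```python
import json

class WSGIRequest(object):
    """Stand-in for Django's request object; not JSON-serializable."""

def jobs_view(post_params, request):
    # Hypothetical simplification of the jobbrowser 'jobs' view.
    if post_params.get('format') == 'json':
        # Normal API path: the client asked for a JSON response.
        return json.dumps({'jobs': []})
    # Fallback path taken when the POST body (and thus 'format') is missing:
    # the serialized context contains the request object itself, so
    # json.dumps raises "... is not JSON serializable".
    return json.dumps({'request': request})

print(jobs_view({'format': 'json'}, WSGIRequest()))  # {"jobs": []}
try:
    jobs_view({}, WSGIRequest())
except TypeError as e:
    print('TypeError:', e)
```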
This seems similar to the issue mentioned here- https://community.cloudera.com/t5/Support-Questions/Knox-POST-request-dropping-Request-Body/td-p/239394
Can you suggest any workarounds? @jdesjean @romainr
@ajay25, we have not run into this issue.
Could you try setting knox/gateway/conf/gateway-log4j.properties
log4j.logger.org.apache.knox.gateway=DEBUG
and attach the log?
I did the same on my side and it looks like this
2019-12-28 21:38:33,677 DEBUG knox.gateway (GatewayFilter.java:doFilter(117)) - Received request: POST /hue/jobbrowser/jobs/
2019-12-28 21:38:33,678 DEBUG federation.jwt (SSOCookieFederationFilter.java:getJWTFromCookie(166)) - hadoop-jwt Cookie has been found and is being processed.
2019-12-28 21:38:33,679 DEBUG knox.gateway (HadoopGroupProviderFilter.java:mapGroupPrincipals(101)) - Found groups for principal csso_jgauthier : [....]
2019-12-28 21:38:33,679 DEBUG knox.gateway (AclsAuthorizationFilter.java:enforceAclAuthorizationPolicy(133)) - PrimaryPrincipal: csso_jgauthier
2019-12-28 21:38:33,679 DEBUG knox.gateway (AclsAuthorizationFilter.java:enforceAclAuthorizationPolicy(142)) - PrimaryPrincipal has access: true
2019-12-28 21:38:33,680 DEBUG knox.gateway (AclsAuthorizationFilter.java:enforceAclAuthorizationPolicy(147)) - GroupPrincipal has access: true
2019-12-28 21:38:33,680 DEBUG knox.gateway (AclsAuthorizationFilter.java:enforceAclAuthorizationPolicy(158)) - Remote IP Address: 127.0.0.1
2019-12-28 21:38:33,680 DEBUG knox.gateway (AclsAuthorizationFilter.java:enforceAclAuthorizationPolicy(160)) - Remote IP Address has access: true
2019-12-28 21:38:33,680 INFO knox.gateway (AclsAuthorizationFilter.java:doFilter(105)) - Access Granted: true
2019-12-28 21:38:33,680 DEBUG knox.gateway (UrlRewriteProcessor.java:rewrite(164)) - Rewrote URL: ...:443/mdor2312-dh/cdp-proxy/hue/jobbrowser/jobs/, direction: IN via explicit rule: HUE/hue/inbound/huerule to URL: ...:8889/jobbrowser/jobs
2019-12-28 21:38:33,681 DEBUG knox.gateway (DefaultDispatch.java:executeOutboundRequest(118)) - Dispatch request: POST ...:8889/jobbrowser/jobs?doAs=csso_jgauthier
2019-12-28 21:38:33,698 DEBUG knox.gateway (GatewayFilter.java:doFilter(117)) - Received request: GET /hue/editor
2019-12-28 21:38:33,698 DEBUG knox.gateway (DefaultDispatch.java:executeOutboundRequest(131)) - Dispatch response status: 200
2019-12-28 21:38:33,698 DEBUG federation.jwt (SSOCookieFederationFilter.java:getJWTFromCookie(166)) - hadoop-jwt Cookie has been found and is being processed.
2019-12-28 21:38:33,699 DEBUG knox.gateway (DefaultDispatch.java:getInboundResponseContentType(183)) - Using default character set UTF-8 for entity of type application/json
2019-12-28 21:38:33,699 DEBUG knox.gateway (DefaultDispatch.java:getInboundResponseContentType(195)) - Inbound response entity content type: application/json; charset=UTF-8
Pasted here: https://gist.github.com/ajay25/dceffec2e834ab4ec9a844055dc02c91 I'm now pretty sure that auth is not the issue.
I added some log statements (the ones starting with 'AJ') here: https://github.com/cloudera/hue/blob/master/desktop/libs/notebook/src/notebook/api.py to confirm that the POST payload is indeed missing.
For example, when I click on refresh databases under SparkSql:
[30/Dec/2019 14:37:21 -0800] api INFO AJ post <QueryDict: {}> nb {} snippet {}
(https://github.com/cloudera/hue/blob/master/desktop/libs/notebook/src/notebook/api.py#L686, logging request.POST, notebook and snippet)
[30/Dec/2019 14:37:21 -0800] decorators ERROR Error running autocomplete
Traceback (most recent call last):
File "/usr/lib/hue/desktop/libs/notebook/src/notebook/decorators.py", line 105, in decorator
return func(*args, **kwargs)
File "/usr/lib/hue/desktop/libs/notebook/src/notebook/api.py", line 583, in autocomplete
autocomplete_data = get_api(request, snippet).autocomplete(snippet, database, table, column, nested)
File "/usr/lib/hue/desktop/libs/notebook/src/notebook/views.py", line 64, in get_api
return ApiWrapper(request, snippet)
File "/usr/lib/hue/desktop/libs/notebook/src/notebook/views.py", line 52, in __init__
self.api = _get_api(request, snippet)
File "/usr/lib/hue/desktop/libs/notebook/src/notebook/connectors/base.py", line 319, in get_api
if snippet['type'] == 'report':
KeyError: 'type'
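The chain from an empty body to this KeyError can be reproduced in isolation. This is a sketch under the assumption that the snippet defaults to the string '{}' when absent from request.POST, as the decorators in notebook/api.py suggest:

```python
import json

def get_api(snippet):
    # Mirrors the branch in connectors/base.py that raises in the trace:
    # with an empty snippet dict there is no 'type' key to look up.
    if snippet['type'] == 'report':
        return 'report_api'
    return 'default_api'

post = {}  # request.POST arrived empty because the body was dropped
snippet = json.loads(post.get('snippet', '{}'))  # -> {}
try:
    get_api(snippet)
except KeyError as e:
    print('KeyError:', e)  # KeyError: 'type'
```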
I can see the payload in the browser for the above autocomplete API call.
This only happens when we access Hue through Knox. Direct access works fine (it doesn't use any transfer encoding).
Thanks for the help @jdesjean. I was able to figure out that this was happening because the HTTP POST input stream was consumed before doPost in Knox, so Knox itself was forwarding an empty body.
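The underlying mechanics are easy to demonstrate outside Knox (plain Python streams here, purely as an illustration): a request body behaves like a one-shot stream, so anything that reads it before doPost must buffer and replay the bytes, otherwise the downstream dispatch to the backend sees nothing.

```python
import io

# A request body behaves like a one-shot stream: once an earlier filter
# has read it, a later reader (e.g. the code that dispatches to Hue)
# gets nothing back.
body = io.BytesIO(b'format=json&interface=jobs')

first_read = body.read()   # e.g. a custom filter inspecting the payload
second_read = body.read()  # what the downstream dispatch then sees

print(len(first_read), len(second_read))  # 26 0

# The usual fix is a wrapper that caches the bytes on the first read so
# every subsequent reader gets a fresh stream over the same buffer:
replay = io.BytesIO(first_read)
print(replay.read() == b'format=json&interface=jobs')  # True
```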
@ajay25 is there an action item here? Following up from your knox user mailing list email.
@risdenk thanks for following up. No action item on the Knox side either; the offending code is something we have customized internally and is not part of open-source Knox.
Environment: Knox 1.3.0, Hue 4.5, Hive 2.3.6. All of the apps are kerberized.
In my setup, Knox intercepts all requests to Hue and, after successful auth, redirects the calls to Hue with a signed cookie.
As part of this, I updated the following in hue.ini:

[[knox]]
knox_principal=knox@EC2.INTERNAL
knox_ports=8442
knox_proxyhosts={hostname}:{port}

[[auth]]
backend=desktop.auth.backend.KnoxSpnegoDjangoBackend
I'm able to load the Hue UI with the logged-in user as "myuser", but it is unable to load any databases (this works normally with other auth backends, so permissions are not the issue).
Stacktrace in Hue logs:

[16/Dec/2019 17:02:14 -0800] middleware INFO Processing exception: <WSGIRequest: POST '/jobbrowser/jobs?doAs=myuser'> is not JSON serializable
Traceback (most recent call last):
  File "/usr/lib/hue/build/env/lib/python2.7/site-packages/Django-1.11.22-py2.7.egg/django/core/handlers/base.py", line 185, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/usr/lib/hue/build/env/lib/python2.7/site-packages/Django-1.11.22-py2.7.egg/django/utils/decorators.py", line 185, in inner
    return func(*args, **kwargs)
  File "/usr/lib/hue/apps/jobbrowser/src/jobbrowser/views.py", line 178, in jobs
    'hiveserver2_impersonation_enabled': hiveserver2_impersonation_enabled()
  File "/usr/lib/hue/desktop/core/src/desktop/lib/django_util.py", line 227, in render
    return render_json(data, request.GET.get("callback"), status=status)
  File "/usr/lib/hue/desktop/core/src/desktop/lib/django_util.py", line 303, in render_json
    json = encode_json(data, indent)
  File "/usr/lib/hue/desktop/core/src/desktop/lib/django_util.py", line 275, in encode_json
    return json.dumps(data, indent=indent, cls=Encoder)
  File "/usr/lib64/python2.7/json/__init__.py", line 251, in dumps
    sort_keys=sort_keys, **kw).encode(obj)
  File "/usr/lib64/python2.7/json/encoder.py", line 209, in encode
    chunks = list(chunks)
  File "/usr/lib64/python2.7/json/encoder.py", line 434, in _iterencode
    for chunk in _iterencode_dict(o, _current_indent_level):
  File "/usr/lib64/python2.7/json/encoder.py", line 408, in _iterencode_dict
    for chunk in chunks:
  File "/usr/lib64/python2.7/json/encoder.py", line 442, in _iterencode
    o = _default(o)
  File "/usr/lib/hue/desktop/core/src/desktop/lib/django_util.py", line 77, in default
    return json.JSONEncoder.default(self, o)
  File "/usr/lib64/python2.7/json/encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <WSGIRequest: POST '/jobbrowser/jobs?doAs=myuser'> is not JSON serializable
I do not see any errors in the Hive server logs.