cloudera / hue

Open source SQL Query Assistant service for Databases/Warehouses
https://cloudera.com
Apache License 2.0

Getting thrift timeouts with large number of tables in hive #153

Status: Closed (bilsch closed 9 years ago)

bilsch commented 9 years ago

We have recently upgraded a hadoop cluster from cdh4 => cdh5.2 and also picked up a pretty big set of changes in hue.

The environment currently has approx 17k tables of varying column complexity ( e.g. some are simple with just a few columns and others are far more complex with 10+ columns ). A behavior introduced in hue at some point in the past seems to have changed a few things.

The exception we tend to get:

[18/Feb/2015 19:23:09 +0000] thrift_util  WARNING  Not retrying thrift call ExecuteStatement due to socket timeout
[18/Feb/2015 19:23:09 +0000] thrift_util  INFO     Thrift saw a socket error: timed out
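For readers unfamiliar with this log pattern: the warning suggests that the client wraps Thrift calls in a retry loop but deliberately gives up when the failure is a socket timeout, presumably because the statement may already be running on the server. The following is a minimal sketch of that pattern in plain Python. `call_with_retries` is a hypothetical helper for illustration, not Hue's actual `thrift_util` code.

```python
import socket


def call_with_retries(fn, max_attempts=3):
    """Call fn(); retry on connection errors, but never after a socket timeout.

    Hypothetical helper for illustration only.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            return fn()
        except socket.timeout:
            # The request may already be executing on the server, so
            # re-sending it could run the statement twice. Give up at once.
            raise
        except ConnectionError as exc:
            # The connection failed outright; another attempt is reasonable.
            last_error = exc
    raise last_error
```

With this policy a slow `ExecuteStatement` fails after a single attempt, which matches the "Not retrying" wording in the log, so the only remedies are a longer timeout or a cheaper call.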

The old layout of hue ( at least in 2.2.0, not sure if/when the change took place ) had a simple editor for the main hive editor, with a separate tab for the table listing. Each table had a preview button, so the page was far less complex.

The layout of the hive editor now has a drop-down to select databases ( default or users as I understand it ). Below the drop-down is a list of tables and an option to see sample data for each table. This seems to be loaded and cached; however, the issue we are hitting is that, given the volume of tables and the cost of generating previews for all of them in a single http request, our users' browsers time out ( web servers too ).

To "get by" we have bumped timeouts up, but even now this is still not enough ( the timeouts were bumped on 11/Feb ).

I would suggest one of two possible fixes, certainly not exhaustive:

  1. Allow a config conditional to auto fetch preview data. It can default to enabled, but at least we can disable it and retain usability
  2. Simplify the page design so the preview fetch occurs asynchronously, with the browser polling in a separate http request, allowing the page to render ( and maybe put a "...loading" div in there )
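The two suggestions above could look roughly like the sketch below. Everything in it is hypothetical: `AUTO_FETCH_PREVIEWS` is not an existing Hue setting, and `fetch_preview` is a stand-in for the expensive per-table sample query. It only illustrates the shape of the idea, namely returning the table list immediately and fetching previews either never (option 1) or in the background on request (option 2).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical setting; not an existing Hue option.
AUTO_FETCH_PREVIEWS = False


def fetch_preview(table):
    """Stand-in for the expensive per-table sample query."""
    return {"table": table, "rows": []}


def render_table_list(tables, auto_fetch=AUTO_FETCH_PREVIEWS):
    """Option 1: return the table list; attach previews only if enabled."""
    page = {"tables": list(tables), "previews": {}}
    if auto_fetch:
        page["previews"] = {t: fetch_preview(t) for t in tables}
    return page


def request_preview(executor, table):
    """Option 2: start one preview in the background; the caller polls the future."""
    return executor.submit(fetch_preview, table)


if __name__ == "__main__":
    page = render_table_list(["orders", "users"])  # renders without previews
    with ThreadPoolExecutor(max_workers=2) as pool:
        future = request_preview(pool, "orders")   # a separate, later request
        print(future.result(timeout=5))
```

Either way the initial page no longer depends on every preview finishing within one http request.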

Versions:

cdh 5.2.1
hue version is from 5.2.3
Host os: Centos 6.2
python: Python 2.6.6
java version "1.7.0_67"
Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)

hive> show tables;
(...)
Time taken: 0.76 seconds, Fetched: 16455 row(s)

Not sure if you guys need/want any additional information.

romainr commented 9 years ago

The call made to HiveServer2 to list the tables is pretty slow; we recently changed it to 'show tables'.

Does it make it better at least for now? https://issues.cloudera.org/browse/HUE-2243

I think we return a max of 5000 tables per DB currently. We could up it, but this might create problems as it is a lot.

@enricoberti , I thought we were doing #2 already?

bilsch commented 9 years ago

Romain, I can give that patch a try. I assume it's already committed to master, yes?

romainr commented 9 years ago

Yes, you have the master commit id link in the JIRA

grisha commented 9 years ago

@romainr SHOW TABLES doesn't make it significantly better; it still takes several minutes and causes our nginx to time out. Ideally it should only try to load the currently selected database (it seems it's trying to load every single one at once), and limit itself to no more than 1000 tables or something like that (maybe something configurable).
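A minimal sketch of the behavior requested above: list tables for one database only and cap the result at a configurable limit. `MAX_TABLES` and `list_tables` are hypothetical names, and the `metastore` argument is a plain dict standing in for a real HiveServer2 call.

```python
MAX_TABLES = 1000  # hypothetical configurable cap, using the figure suggested above


def list_tables(metastore, database, limit=MAX_TABLES):
    """Return at most `limit` table names for one database, plus a truncation flag.

    `metastore` maps database name -> list of table names and stands in for
    the real server call; only the selected database is ever touched.
    """
    tables = metastore.get(database, [])
    return tables[:limit], len(tables) > limit
```

The truncation flag lets the UI tell the user that the list is incomplete instead of silently dropping tables.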

romainr commented 9 years ago

@enricoberti is going to check why it is not loading just the tables of the currently selected DB

enricoberti commented 9 years ago

I just checked on master and the tables are loaded asynchronously and just for the selected database (and only if not cached already).
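The "only for the selected database, and only if not cached" part of this can be sketched as a small per-database cache. This is an illustration in Python under assumed names (`TableCache`, `loader`), not the code on master, and it leaves out the asynchronous part.

```python
class TableCache:
    """Load the table list for a database on first use, then reuse it."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical callable: database name -> list of tables
        self._tables = {}
        self.loads = 0         # number of real loads, useful for checking cache hits

    def get(self, database):
        if database not in self._tables:
            # Cache miss: fetch only this database, never all of them.
            self._tables[database] = self._loader(database)
            self.loads += 1
        return self._tables[database]
```

Switching back to a database that was already viewed then costs nothing, and databases that are never selected are never loaded.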

romainr commented 9 years ago

FYI: One timeout was not being set and could also deadlock:
https://github.com/cloudera/hue/commit/26d954504a0e04455f591c55c03f05ff0ca6c691
https://github.com/cloudera/hue/commit/3d4a0e899f45fe13cde99f937d02b9c4b3668898
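The commits are not reproduced here, but the general hazard is easy to show: a Python socket has no timeout by default, so a read on a connection that never answers blocks indefinitely. The sketch below uses only the standard library, and `recv_with_timeout` is a hypothetical helper name.

```python
import socket


def recv_with_timeout(sock, timeout_seconds, nbytes=1024):
    """Read from sock, returning None if nothing arrives within the timeout."""
    # Without this call the socket stays in blocking mode and recv() can
    # wait forever on a peer that never responds.
    sock.settimeout(timeout_seconds)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        return None


if __name__ == "__main__":
    a, b = socket.socketpair()
    print(recv_with_timeout(a, 0.1))  # nothing was sent, so the read gives up
    b.sendall(b"hi")
    print(recv_with_timeout(a, 0.1))  # data is available, so it is returned
    a.close()
    b.close()
```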