StarRocks / starrocks

StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
https://starrocks.io
Apache License 2.0

Hive external execute failed, how can I set User? #770

Closed winfys closed 2 years ago

winfys commented 2 years ago

StarRocks version: 1.18.1

image

log:
2021-10-20 11:34:15 ERROR TThreadPoolServer:321 - Thrift Error occurred during processing of message.
org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:455)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:354)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:243)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:313)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:210)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:125)
    ... 9 more

Ranger is configured for Hive, but Kerberos is not enabled. How can I set the user?

gengjun-git commented 2 years ago

It does not seem to be a permission issue, because some queries succeed. What is deployed on machine 172.17.187.167?

gengjun-git commented 2 years ago

Are the HDFS and StarRocks cluster networks connected?

winfys commented 2 years ago

Are the HDFS and StarRocks cluster networks connected?

Machine 172.17.187.167 hosts a BE and a Broker, and the network connection is normal. The HiveServer log shows a permission problem: anonymous users are not allowed access. How do I set the user for accessing the Hive external table?

gengjun-git commented 2 years ago

Has the hive external table query been successful?

gengjun-git commented 2 years ago

You can set the HADOOP_USER_NAME variable in fe/conf/hadoop_env.sh and be/conf/hadoop_env.sh
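
As a minimal sketch of the suggestion above, the same line could be added to both fe/conf/hadoop_env.sh and be/conf/hadoop_env.sh. The user name `hive` is an assumption for illustration; use whichever user your Ranger policies actually allow.

```shell
# Add to fe/conf/hadoop_env.sh and be/conf/hadoop_env.sh.
# "hive" is a placeholder user name, not taken from this thread.
export HADOOP_USER_NAME="hive"
```

Because the variable is read from the process environment, the FE and BE presumably need a restart before the new value takes effect.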

winfys commented 2 years ago

Has the hive external table query been successful?

I set HADOOP_USER_NAME, but only two SQL statements execute before an error is reported. HiveServer is deployed in HA mode; could that be related? image image

gengjun-git commented 2 years ago

It seems this is not a permission problem. Can you show the plan for the SQL? Run the command below against the failed SQL:

EXPLAIN SQL
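
For illustration, this is roughly how the plan could be fetched through the MySQL client. The host, user, and table name are placeholders and not from this thread; 9030 is the default FE query port.

```shell
#!/bin/sh
# Placeholders (assumptions): FE host, user, and the failing statement.
FE_HOST="${FE_HOST:-127.0.0.1}"
FE_QUERY_PORT="${FE_QUERY_PORT:-9030}"   # default FE MySQL-protocol port
FAILED_SQL="SELECT count(*) FROM hive_db.hive_external_tbl"   # hypothetical query

# Prefix the failing statement with EXPLAIN to get its plan.
EXPLAIN_SQL="EXPLAIN ${FAILED_SQL};"
echo "$EXPLAIN_SQL"

# Only contact the cluster when explicitly asked to.
if [ "${RUN_QUERY:-0}" = "1" ]; then
  mysql -h "$FE_HOST" -P "$FE_QUERY_PORT" -u root -e "$EXPLAIN_SQL"
fi
```

Run it with `RUN_QUERY=1` to send the statement to the FE; without it, the script only prints the statement.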

winfys commented 2 years ago

image

winfys commented 2 years ago

On another StarRocks cluster, the same query on the same data succeeds. But there is another problem: with about 300 GB of Parquet files in a partitioned Hive table, a query that does not specify a partition makes the BE go down. How can I optimize this? BE: 3 nodes, 24 cores * 88 GB each.

gengjun-git commented 2 years ago

ping @kangkaisen @dirtysalt

gengjun-git commented 2 years ago

Use the command below to set query_timeout to a larger value (the unit is seconds), then check whether the SQL executes successfully:

set query_timeout = xxx
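
A hedged sketch of trying this from the MySQL client. Since `set query_timeout` here applies to the current session, the setting and the query are sent in the same invocation. The value 600, the host, the user, and the table name are all assumptions for illustration.

```shell
#!/bin/sh
# Placeholders (assumptions): FE host, user, timeout value, and query.
FE_HOST="${FE_HOST:-127.0.0.1}"
FE_QUERY_PORT="${FE_QUERY_PORT:-9030}"   # default FE MySQL-protocol port
QUERY_TIMEOUT_SECONDS=600                # example value, in seconds
FAILED_SQL="SELECT count(*) FROM hive_db.hive_external_tbl"   # hypothetical query

# Both statements go into one session so the timeout applies to the query.
SESSION_SQL="SET query_timeout = ${QUERY_TIMEOUT_SECONDS}; ${FAILED_SQL};"
echo "$SESSION_SQL"

# Only contact the cluster when explicitly asked to.
if [ "${RUN_QUERY:-0}" = "1" ]; then
  mysql -h "$FE_HOST" -P "$FE_QUERY_PORT" -u root -e "$SESSION_SQL"
fi
```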

winfys commented 2 years ago

be.out: image

winfys commented 2 years ago

Thank you, the cause has been found: there was a problem with the JDK configuration. Do queries on a partitioned Hive external table need to specify the partition field? Without the partition field in the WHERE condition, the query reports an error.

gengjun-git commented 2 years ago

You mean the partition field must be set in the WHERE clause?

winfys commented 2 years ago

image image

winfys commented 2 years ago

StarRocks version: 1.18.4

winfys commented 2 years ago

No useful information was found in the log.

winfys commented 2 years ago

image The Hive data was reloaded (Hive format: Parquet with Snappy). After the table was rebuilt, the query works normally. How do I refresh the metadata? Does the table need to be rebuilt if a new partition is added or data is overwritten?

gengjun-git commented 2 years ago

You can refresh the meta cache by following this doc: https://github.com/StarRocks/docs.zh-cn/pull/65/files
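
As a rough sketch: the StarRocks documentation for Hive external tables describes a `REFRESH EXTERNAL TABLE` statement for this purpose. Whether it matches the linked doc change or is available in 1.18.x is an assumption, and the table name and partition path below are hypothetical.

```shell
#!/bin/sh
# Placeholders (assumptions): FE host, user, table name, and partition path.
FE_HOST="${FE_HOST:-127.0.0.1}"
FE_QUERY_PORT="${FE_QUERY_PORT:-9030}"   # default FE MySQL-protocol port
HIVE_EXTERNAL_TABLE="hive_external_tbl"  # hypothetical external table
NEW_PARTITION="dt=2021-10-20"            # hypothetical partition path

# Refresh the cached metadata for the whole table ...
REFRESH_TABLE_SQL="REFRESH EXTERNAL TABLE ${HIVE_EXTERNAL_TABLE};"
# ... or only for the partitions that were added or overwritten.
REFRESH_PARTITION_SQL="REFRESH EXTERNAL TABLE ${HIVE_EXTERNAL_TABLE} PARTITION ('${NEW_PARTITION}');"
echo "$REFRESH_TABLE_SQL"
echo "$REFRESH_PARTITION_SQL"

# Only contact the cluster when explicitly asked to.
if [ "${RUN_QUERY:-0}" = "1" ]; then
  mysql -h "$FE_HOST" -P "$FE_QUERY_PORT" -u root -e "$REFRESH_PARTITION_SQL"
fi
```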

gengjun-git commented 2 years ago

Our discussion has gone beyond the scope of the issue title. You can open another issue or use our forum.