Question
After an INSERT OVERWRITE on a Hive table, the data that should have been overwritten is not deleted. What is going on here?
Code sample
Explanation:
temp_topic table: two fields, 260 rows of data
temp_topic2 table: created like temp_topic
The SQL is executed in the following order:
`
create table temp_topic2 like temp_topic stored as textfile;
insert overwrite table temp_topic2 select * from temp_topic;
select count(*) from temp_topic2; ==> result: 260 √
insert overwrite table temp_topic2 select "test data", 4;
select count(*) from temp_topic2; ==> result: 1 √
hdfs dfs -ls /XXX/test.db/temp_topic2
/XXX/test.db/temp_topic2/base_0000001 [This folder should have been deleted, but it was not]
/XXX/test.db/temp_topic2/base_0000002
hdfs dfs -rm /XXX/test.db/temp_topic2/base_0000002
select count(*) as a from temp_topic2; [alias added to prevent a cached result] ==> result: 260 ×
`
Here is the problem: the data in base_0000001, copied from temp_topic, was not deleted.
Maybe some parameter causes this, but I don't know which one. Please tell me.
Version