isar / hive

Lightweight and blazing fast key-value database written in pure Dart.
Apache License 2.0

After "insert overwrite", the old data is not deleted! #272

Open lookingGordo opened 4 years ago

lookingGordo commented 4 years ago

Question: After an insert overwrite in Hive, the data that should have been overwritten is not deleted. What's going on here?

Code sample Explanation: the temp_topic table has two fields and 260 rows of data; the temp_topic2 table is created like the temp_topic table.

The SQL is executed in the following order:

`create table temp_topic2 like temp_topic stored as textfile;`
`insert overwrite table temp_topic2 select * from temp_topic;`
`select count(*) from temp_topic2;` ==> result: 260 √

`insert overwrite table temp_topic2 select "test data", 4;`
`select count(*) from temp_topic2;` ==> result: 1 √

`hdfs dfs -ls /XXX/test.db/temp_topic2`
`/XXX/test.db/temp_topic2/base_0000001` [This folder should have been deleted, but it was not]
`/XXX/test.db/temp_topic2/base_0000002`

`hdfs dfs -rm /XXX/test.db/temp_topic2/base_0000002`

`select count(*) as a from temp_topic2;` [aliased to avoid a cached query result] ==> result: 260 ×

Here is the problem: the data in base_0000001, which is the copy of the temp_topic table, was not deleted. Maybe some parameter causes this, but I don't know which one. Please tell me.


Mravuri96 commented 4 years ago

@lookingGordo Hive is an append-only database; you should perform a manual compaction to solve your issue. Read the Hive docs for more info.
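
For illustration, a manual compaction might look like the sketch below. It assumes the question concerns an Apache Hive transactional (ACID) table, which the base_N directories in the listing suggest, rather than the Dart Hive package. The table name is taken from the question; whether compaction is enabled on the cluster is also an assumption.

```sql
-- Assumption: temp_topic2 is a transactional (ACID) table in Apache Hive.
-- Queue a major compaction request; it runs asynchronously in the background.
ALTER TABLE temp_topic2 COMPACT 'major';

-- Inspect the state of queued, running, and finished compaction requests.
SHOW COMPACTIONS;
```

Obsolete directories such as base_0000001 are normally removed by Hive's background cleaner some time after they are no longer needed, not at the moment the insert overwrite finishes. Deleting base_N directories by hand with hdfs dfs -rm is probably best avoided, since readers may then fall back to an older base directory, as the count of 260 in the question suggests.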