Open githubu1 opened 8 years ago
The data in the .tmp folder is still increasing (now it is 5.7 T; my data is a couple of GB). I will restart HBase or re-create the tsdb table with no compression (currently it has Snappy).
5.8 T 17.3 T /hbase/data/default/tsdb/5cf05d4273f332ec2b44455accf79508/.tmp/ddea690f002a483584e208703e5ca479
Ah, this is a bug in HBase; if you search the mailing list I think there was a solution around tuning the block size or something in HBase. It seems to only happen when using millisecond timestamps, as they increase the amount of data stored in each row.
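A rough back-of-the-envelope sketch of why millisecond timestamps make rows so wide here: OpenTSDB keys rows by metric + tags + a one-hour base timestamp, so the load described below (64 million datapoints for a single metric/tagset over roughly 12 hours) packs millions of cells into a handful of rows. The record count and hour span come from the report; the arithmetic is only illustrative:

```shell
# 64 million datapoints, one metric + one tagset, spread over ~12 hours.
# OpenTSDB stores one row per metric+tags per hour, so this lands in ~12 rows:
records=64000000
hours=12
echo $((records / hours))   # => 5333333 cells in a single one-hour row
```

At second precision the same row can hold at most 3600 datapoints; millisecond precision removes that ceiling, which is why each row balloons.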
FYI, this is solved in HDP 2.5 (it contains a backported patch); I've upgraded and tested it. Otherwise, see HBASE-16288 for which upstream Apache HBase versions contain the fix.
@manolama you can close this since it's not an issue related to opentsdb
@johann8384 @manolama
Hi @githubu1 since it's not an issue related to opentsdb, could you close this issue please :)
Thanks :D
Hello,
We are trying to load a file using the tsdb bulk import. The file has 64 million records for a single metric with 2 tags; timestamps are in milliseconds (the timestamp range spans about 12 hours). After loading 23 million records, the loader failed with this error:
2016-03-30 14:25:23,776 ERROR [AsyncHBase I/O Worker #3] TextImporter: Exception caught while processing file testfile3.sorted.txt
org.hbase.async.RemoteException: org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, regionName=tsdb,,1459296973346.5cf05d4273f332ec2b44455accf79508., server=XYZ.com,60020,1459290105897, memstoreSize=268515440, blockingMemStoreSize=268435456
    at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:3513)
    at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2732)
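The figures in the exception line up with HBase's defaults: the per-region blocking threshold is hbase.hregion.memstore.flush.size multiplied by hbase.hregion.memstore.block.multiplier, and with the default 128 MB and multiplier 2 that works out to exactly the blockingMemStoreSize in the error. A quick sanity check:

```shell
# blocking threshold = memstore.flush.size * memstore.block.multiplier
flush_size=$((128 * 1024 * 1024))   # hbase.hregion.memstore.flush.size (128M default)
block_multiplier=2                  # hbase.hregion.memstore.block.multiplier (default)
echo $((flush_size * block_multiplier))   # => 268435456, the blockingMemStoreSize above
```

So the region is blocking writes exactly where the defaults say it should; the real question is why the flush never completes.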
When I checked the region server that hosts the tsdb region, it says it is flushing (now for more than 5 hours...).
This is the output from the region server's basic information page:
Flushing tsdb,,1459296973346.5cf05d4273f332ec2b44455accf79508. RUNNING (since 5hrs, 7mins, 35sec ago)
Flushing t: closing flushed file (since 5hrs, 7mins, 35sec ago)
But the strange thing is that HDFS usage has increased by 5 TB (1.7 TB before replication) and is still growing... (there is a .tmp directory under the region that grows).
This is the hdfs command output that shows the size growing:
hadoop fs -du -h /hbase/data/default/tsdb/5cf05d4273f332ec2b44455accf79508/
39       117      /hbase/data/default/tsdb/5cf05d4273f332ec2b44455accf79508/.regioninfo
1.7 T    5.1 T    /hbase/data/default/tsdb/5cf05d4273f332ec2b44455accf79508/.tmp
0        0        /hbase/data/default/tsdb/5cf05d4273f332ec2b44455accf79508/recovered.edits
122.8 M  368.3 M  /hbase/data/default/tsdb/5cf05d4273f332ec2b44455accf79508/t
We have an HBase cluster with 12 region servers and all HBase settings are at their defaults (memstore size is 128M, storefile threshold 3):
hbase.hregion.memstore.flush.size=128M
hbase.hregion.memstore.block.multiplier=2
hbase.regionserver.global.memstore.lowerLimit=0.38
hbase.regionserver.global.memstore.upperLimit=0.4
hbase.hstore.compactionThreshold=3
hbase.hstore.blockingStoreFiles=10
hbase.hstore.compaction.max=30
hbase.hregion.majorcompaction=7days
hbase.hstore.blockingWaitTime=300
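For context on the global limits above: lowerLimit and upperLimit are fractions of the region-server heap, not fixed byte counts. The heap size is not given in the report, so the 8 GB below is purely an assumed figure for illustration:

```shell
# Hypothetical 8 GB region-server heap (an assumption, not from the report), in MB:
heap_mb=$((8 * 1024))
echo $((heap_mb * 40 / 100))   # => 3276 MB: all writes block (upperLimit=0.4)
echo $((heap_mb * 38 / 100))   # => 3112 MB: forced flushing begins (lowerLimit=0.38)
```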
CDH=5.4.5 HBase=1.0.0 OpenTSDB=2.2
A few questions:
Why is HDFS growing exponentially (so far 1.7 T) when the total data we loaded is less than 3 GB?
Do you have any suggestions on HBase settings for OpenTSDB?
What is your suggestion for loading this data file (one metric, 64 million records, millisecond timestamps with 2 tags)?
Are we hitting any limits?
Thanks