apache / druid

Apache Druid: a high performance real-time analytics database.
https://druid.apache.org/
Apache License 2.0

Kafka index service use a lot of direct memory during segment publish #7824

Closed — quenlang closed this issue 5 years ago

quenlang commented 5 years ago

Hi all, I found a problem when using the Kafka indexing service. The task process uses a lot of direct memory when publishing segments, almost 6x more than during the ingestion stage. I'm using druid-0.13.0.

At ingestion time, the RES of the task is 1.647g:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                             
29321 root      20   0 12.691g 1.647g  30952 S   8.7  5.3   3:52.96 java

And the heap, non-heap and direct memory are 221M, 120M and 864M:

Memory                                     used           total         max            usage         GC                                                                                                  
heap                                       221M           502M          2048M          10.80%        gc.g1_young_generation.count                      134                                               
g1_eden_space                              43M            248M          -1             17.34%        gc.g1_young_generation.time(ms)                   1146                                              
g1_survivor_space                          38M            38M           -1             100.00%       gc.g1_old_generation.count                        0                                                 
g1_old_gen                                 140M           216M          2048M          6.85%         gc.g1_old_generation.time(ms)                     0                                                 
nonheap                                    120M           124M          -1             96.74%                                                                                                            
code_cache                                 40M            42M           240M           17.05%                                                                                                            
metaspace                                  70M            72M           -1             96.62%                                                                                                            
compressed_class_space                     9M             9M            1024M          0.91%                                                                                                             
direct                                     864M           864M          -              100.00%

But at publishing time, the RES of the task is 6.046g:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                             
29321 root      20   0 17.002g 6.046g 109592 S 142.7 19.3   5:31.43 java

And the heap, non-heap and direct memory are 266M, 124M and 5204M:

Memory                                     used           total         max            usage         GC                                                                                                  
heap                                       266M           502M          2048M          13.01%        gc.g1_young_generation.count                      187                                               
g1_eden_space                              13M            198M          -1             6.57%         gc.g1_young_generation.time(ms)                   1802                                              
g1_survivor_space                          4M             4M            -1             100.00%       gc.g1_old_generation.count                        0                                                 
g1_old_gen                                 249M           300M          2048M          12.18%        gc.g1_old_generation.time(ms)                     0                                                 
nonheap                                    124M           134M          -1             92.48%                                                                                                            
code_cache                                 44M            44M           240M           18.47%                                                                                                            
metaspace                                  70M            78M           -1             89.42%                                                                                                            
compressed_class_space                     9M             10M           1024M          0.91%                                                                                                             
direct                                     5204M          5204M         -              100.00%

The direct memory increases from 864M to 5204M when publishing begins. How can I control the usage of direct memory?

Any suggestion? Thanks!

middleManager runtime.properties

druid.service=druid/middleManager
druid.port=8091

# Number of tasks per middleManager
druid.worker.capacity=10

# Task launch parameters
druid.indexer.runner.javaOpts=-server -Xmx2g -XX:MaxDirectMemorySize=10240g -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
druid.indexer.task.baseTaskDir=var/druid/task

# HTTP server threads
druid.server.http.numThreads=50

# Processing threads and buffers on Peons
druid.indexer.fork.property.druid.processing.buffer.sizeBytes=100000000
druid.indexer.fork.property.druid.processing.numThreads=7
druid.indexer.task.restoreTasksOnRestart=true

# Hadoop indexing
druid.indexer.task.hadoopWorkingPath=var/druid/hadoop-tmp
druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.7.3"]
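
For context, general Druid sizing guidance (an assumption drawn from the Druid configuration docs, not something stated in this issue) puts the baseline direct memory a peon reserves for query processing at roughly:

# Rough baseline estimate (assumed formula, per the general configuration docs):
#   direct memory ≈ druid.processing.buffer.sizeBytes * (numThreads + numMergeBuffers + 1)
# With the fork properties above (sizeBytes=100000000, numThreads=7 and the default
# numMergeBuffers), that is on the order of 1 GB, which lines up with the ~864M
# observed while ingesting. The extra direct memory seen at publish time is
# allocated by segment merging on top of this baseline.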

The datasource has 35 dimensions and 45 metrics. The spec file:

{
    "type": "kafka",
    "dataSchema": {
        "dataSource": "MY_DATASOURCE_DAY",
        "parser": {
            "type": "string",
            "parseSpec": {
                "format": "json",
                "timestampSpec": {
                    "column": "timestamp",
                    "format": "millis"
                },
                "dimensionsSpec": {
                    "dimensions": [
            "mp_id",
            "application_id",
            "instance_id",
            "action_name",
            "message_type",
            "from_path",
            "path",
            "open_path",
            "close_path",
            "scene",
            "country_id",
            "region_id",
            "city_id",
            "carrier_id",
            "error_message",
            "error_filename",
            "request_method",
            "host",
            "uri",
            "network_type",
            "wechat_version",
            "route_chain",
            "uid",
            "http_code",
            "system",
            "ip",
            "device_type",
            "agreement_id",
            "protocol",
                        "error_type",
                        "error_detail",
                        "custom_code",
            "api_name",
                        "opera_name",
                        "offset_left",
                        "offset_top",
            "agent_version"
                    ]
                }
            }
        },
        "metricsSpec": [
            {
                "name": "server_count",
                "fieldName": "server_count",
                "type": "longSum"
            },
            {
                "name": "quit_count",
                "fieldName": "quit_count",
                "type": "longSum"
            },
            {
                "name": "on_ready",
                "fieldName": "on_ready",
                "type": "doubleSum"
            },
            {
                "name": "custom_time",
                "fieldName": "custom_time",
                "type": "longSum"
            },
            {
                "name": "first_response_time",
                "fieldName": "first_response_time",
                "type": "doubleSum"
            },
            {
                "name": "response_time",
                "fieldName": "response_time",
                "type": "doubleSum"
            },
            {
                "name": "application_server_time",
                "fieldName": "application_server_time",
                "type": "doubleSum"
            },
            {
                "name": "network_time",
                "fieldName": "network_time",
                "type": "doubleSum"
            },
            {
                "name": "callback_time",
                "fieldName": "callback_time",
                "type": "longSum"
            },
            {
                "name": "bytes_sent",
                "fieldName": "bytes_sent",
                "type": "longSum"
            },
            {
                "name": "bytes_received",
                "fieldName": "bytes_received",
                "type": "longSum"
            },
            {
                "name": "msg_error_pv",
                "fieldName": "msg_error_pv",
                "type": "longSum"
            },
            {
                "name": "file_error_pv",
                "fieldName": "file_error_pv",
                "type": "longSum"
            },
            {
                "name": "count",
                "fieldName": "count",
                "type": "longSum"
            },
            {
                "name": "on_ready_count",
                "fieldName": "on_ready_count",
                "type": "longSum"
            },
            {
                "name": "open_count",
                "fieldName": "open_count",
                "type": "longSum"
            },
            {
                "name": "net_count",
                "fieldName": "net_count",
                "type": "longSum"
            },
            {
                "name": "net_error_count",
                "fieldName": "net_error_count",
                "type": "longSum"
            },
            {
                "name": "js_error_count",
                "fieldName": "js_error_count",
                "type": "longSum"
            },
            {
                "name": "slow_count",
                "fieldName": "slow_count",
                "type": "longSum"
            },
            {
                "name": "net_slow_count",
                "fieldName": "net_slow_count",
                "type": "longSum"
            },
            {
                "name": "custom_success_count",
                "fieldName": "custom_success_count",
                "type": "longSum"
            },
            {
                "name": "uv",
                "fieldName": "uid",
                "type": "thetaSketch"
            },
            {
                "name": "api_success_count",
                "fieldName": "api_success_count",
                "type": "longSum"
            },
            {
                "name": "api_fail_count",
                "fieldName": "api_fail_count",
                "type": "longSum"
            },
            {
                "name": "api_count",
                "fieldName": "api_count",
                "type": "longSum"
            },
            {
                "name": "api_cancel_count",
                "fieldName": "api_cancel_count",
                "type": "longSum"
            },
            {
                "type": "longSum",
                "name": "opera_time",
                "fieldName": "opera_time",
                "expression": null
            },
            {
                "type": "longSum",
                "name": "opera_request_time",
                "fieldName": "opera_request_time",
                "expression": null
            },
            {
                "type": "longSum",
                "name": "opera_server_time",
                "fieldName": "opera_server_time",
                "expression": null
            },
            {
                "type": "longSum",
                "name": "opera_count",
                "fieldName": "opera_count",
                "expression": null
            },
            {
                "type": "longSum",
                "name": "opera_request_count",
                "fieldName": "opera_request_count",
                "expression": null
            },
            {
                "type": "longSum",
                "name": "opera_success_count",
                "fieldName": "opera_success_count",
                "expression": null
            },
            {
                "type": "longSum",
                "name": "opera_fail_count",
                "fieldName": "opera_fail_count",
                "expression": null
            },
            {
                "type": "longSum",
                "name": "opera_slow_count",
                "fieldName": "opera_slow_count",
                "expression": null
            },
            {
                "type": "longSum",
                "name": "opera_server_count",
                "fieldName": "opera_server_count",
                "expression": null
            },
            {
                "type": "longSum",
                "name": "page_net_error_count",
                "fieldName": "page_net_error_count",
                "expression": null
            },
            {
                "type": "quantilesDoublesSketch",
                "name": "on_ready_sketch",
                "fieldName": "on_ready",
                "k": 256
            },
            {
                "type": "quantilesDoublesSketch",
                "name": "first_response_time_sketch",
                "fieldName": "first_response_time",
                "k": 256
            },
            {
                "type": "quantilesDoublesSketch",
                "name": "response_time_sketch",
                "fieldName": "response_time",
                "k": 256
            },
            {
                "type": "quantilesDoublesSketch",
                "name": "network_time_sketch",
                "fieldName": "network_time",
                "k": 256
            },
            {
                "type": "quantilesDoublesSketch",
                "name": "application_server_time_sketch",
                "fieldName": "application_server_time",
                "k": 256
            },
            {
                "type": "quantilesDoublesSketch",
                "name": "opera_time_sketch",
                "fieldName": "opera_time",
                "k": 256
            },
            {
                "type": "quantilesDoublesSketch",
                "name": "opera_request_time_sketch",
                "fieldName": "opera_request_time",
                "k": 256
            },
            {
                "type": "quantilesDoublesSketch",
                "name": "opera_server_time_sketch",
                "fieldName": "opera_server_time",
                "k": 256
            }
        ],
        "granularitySpec": {
            "type": "uniform",
            "segmentGranularity": "DAY",
            "queryGranularity": "DAY"
        }
    },
    "tuningConfig": {
        "type": "kafka",
        "maxRowsPerSegment": 2500000,
    "resetOffsetAutomatically": true,
        "targetPartitionSize": 2500000
    },
    "ioConfig": {
        "topic": "drd-mp-webchat-applet",
        "consumerProperties": {
            "bootstrap.servers": "kafka-server:9092"
        },
        "taskCount": 1,
        "taskDuration": "PT1H",
        "replicas": 1,
    "useEarliestOffset": true
    }
}
quenlang commented 5 years ago

I found a lot of "Allocating new littleEndByteBuf" operations in the log. In the end the task allocated 333,023 littleEndByteBuf buffers, direct memory usage reached 18GB, and the OS killed the task. The ingested data was very small, just 2GB in six hours, and all the dimensions have very low cardinality. Why does the task need so much direct memory? @himanshug Can you give me some advice? Thank you so much!

2019-06-14T06:24:34,445 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[333,020]
2019-06-14T06:24:34,445 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[333,021]
2019-06-14T06:24:34,445 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[333,022]
2019-06-14T06:24:34,445 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[333,023]
2019-06-14T06:24:34,445 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[333,024]
2019-06-14T06:24:34,445 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[333,025]
2019-06-14T06:24:34,446 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[333,026]
2019-06-14T06:24:34,446 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[333,027]
2019-06-14T06:24:34,446 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[333,028]
2019-06-14T06:24:34,446 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[333,029]
himanshug commented 5 years ago

Unfortunately, it is very hard to offer concrete advice on these things, but I can give some general points for you to investigate further.

OS killed this task

I am assuming that means the Linux OOM killer killed the process. If yes, then it probably can't be the memory used to load the segments into the page cache, as that doesn't count towards anonymous memory usage. So the process really allocated loads of direct memory using those buffers you mentioned, and there could be a genuine memory leak there. You can identify that by triggering a manual GC (e.g. using "jcmd GC.run") at a time of high direct memory use: if GC frees up a whole bunch of direct memory, then there is almost certainly a memory leak and we are not closing/cleaning discarded off-heap buffers. If you find that, you can try to identify which buffers we are not cleaning up and/or create another ticket with your task log and analysis; someone else from the community might be able to look into it. If GC does not free significant direct memory, that means all that memory is really in use and the code needs optimization to reduce its memory usage. My suspicion is that most of those off-heap buffers are being used to merge the intermediate persisted segments; playing with some of the configuration described in https://druid.apache.org/docs/latest/ingestion/native_tasks.html#tuningconfig might increase/decrease off-heap memory usage (especially maxRowsInMemory, maxBytesInMemory and maxTotalRows; try reducing maxTotalRows).
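
For reference, a minimal sketch of the manual-GC check described above (the PID is the one from the top output earlier in this issue; the NativeMemoryTracking step is optional and needs an extra JVM flag that is not in the javaOpts shown above):

PID=29321

# Trigger a full GC; if direct memory drops afterwards, discarded but
# not-yet-cleaned off-heap buffers were likely the cause.
jcmd $PID GC.run

# Optional: inspect native allocations. Only works if the task JVM was started
# with -XX:NativeMemoryTracking=summary.
jcmd $PID VM.native_memory summary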

Check whether segment handoffs are working fine; you can add the "pushTimeout" setting from the tuningConfig documentation above if need be.
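
As an illustration only (the value below is a placeholder, not a recommendation from this thread), pushTimeout is an additional key in the Kafka tuningConfig:

    "tuningConfig": {
        "type": "kafka",
        "maxRowsPerSegment": 2500000,
        "resetOffsetAutomatically": true,
        "pushTimeout": 900000
    }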

You might be hitting the issue described in https://github.com/apache/incubator-druid/pull/6699; if you can build Druid with that patch applied, then try that.

That said, please attach the task log of the process that ends up dead due to too much memory usage. I will take a quick look to see if anything obvious shows up.

HTH.

quenlang commented 5 years ago

@himanshug Thank you so much for the quick reply! I performed a manual GC using jcmd GC.run, but it did not free up the direct memory. The task log of the process that ends up dead due to too much memory usage is below. Thanks!

2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6865}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6866}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6867}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6868}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6869}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6870}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6871}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6872}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6873}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6874}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6875}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6876}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6877}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6878}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6879}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6880}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6881}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6882}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6883}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6884}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6885}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6886}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6887}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6888}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6889}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6890}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6891}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6892}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6893}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6894}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6895}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6896}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6897}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6898}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6899}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6900}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6901}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6902}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6903}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6904}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6905}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6906}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6907}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6908}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6909}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6910}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6911}]
2019-06-14T06:24:10,907 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6912}]
2019-06-14T06:24:10,908 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6913}]
2019-06-14T06:24:10,908 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6914}]
2019-06-14T06:24:10,908 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6915}]
2019-06-14T06:24:10,908 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6916}]
2019-06-14T06:24:10,908 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6917}]
2019-06-14T06:24:10,908 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6918}]
2019-06-14T06:24:10,908 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6919}]
2019-06-14T06:24:10,908 INFO [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=ZJK_TEST_2019-06-13T06:00:00.000Z_2019-06-13T12:00:00.000Z_2019-06-14T06:17:24.368Z, count=6920}]
2019-06-14T06:24:11,090 INFO [appenderator_merge_0] org.apache.druid.segment.IndexMergerV9 - Using SegmentWriteOutMediumFactory[TmpFileSegmentWriteOutMediumFactory]
2019-06-14T06:24:11,091 INFO [appenderator_merge_0] org.apache.druid.segment.IndexMergerV9 - Completed version.bin in 0 millis.
2019-06-14T06:24:11,091 INFO [appenderator_merge_0] org.apache.druid.segment.IndexMergerV9 - Completed factory.json in 0 millis
2019-06-14T06:24:11,415 INFO [appenderator_merge_0] org.apache.druid.segment.IndexMerger - Allocating dictionary merging direct buffer with size[20]
2019-06-14T06:24:11,416 INFO [appenderator_merge_0] org.apache.druid.segment.IndexMerger - Allocating dictionary merging direct buffer with size[12]
2019-06-14T06:24:11,419 INFO [appenderator_merge_0] org.apache.druid.segment.IndexMerger - Allocating dictionary merging direct buffer with size[4]
2019-06-14T06:24:11,419 INFO [appenderator_merge_0] org.apache.druid.segment.IndexMerger - Allocating dictionary merging direct buffer with size[4]
2019-06-14T06:24:11,419 INFO [appenderator_merge_0] org.apache.druid.segment.IndexMerger - Allocating dictionary merging direct buffer with size[4]
2019-06-14T06:24:13,376 INFO [appenderator_merge_0] org.apache.druid.segment.IndexMerger - Allocating dictionary merging direct buffer with size[168]
2019-06-14T06:24:13,376 INFO [appenderator_merge_0] org.apache.druid.segment.IndexMerger - Allocating dictionary merging direct buffer with size[168]
2019-06-14T06:24:13,376 INFO [appenderator_merge_0] org.apache.druid.segment.IndexMerger - Allocating dictionary merging direct buffer with size[64]
****************** many many many Allocating dictionary merging direct buffer with size [*] *******************
2019-06-14T06:24:13,987 INFO [appenderator_merge_0] org.apache.druid.segment.StringDimensionMergerV9 - Completed dim[uri] conversions with cardinality[1,084] in 722 millis.
2019-06-14T06:24:13,988 INFO [appenderator_merge_0] org.apache.druid.segment.IndexMergerV9 - Completed dim conversions in 2,596 millis.
2019-06-14T06:24:20,682 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[53]
2019-06-14T06:24:20,682 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[54]
2019-06-14T06:24:20,683 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[55]
2019-06-14T06:24:20,683 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[56]
2019-06-14T06:24:20,683 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[57]
2019-06-14T06:24:20,683 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[58]
2019-06-14T06:24:20,683 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[59]
2019-06-14T06:24:20,683 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[60]
****************** many many many Allocating new littleEndByteBuf [*] until 333,029 *******************
2019-06-14T06:24:34,446 INFO [appenderator_merge_0] org.apache.druid.segment.CompressedPools - Allocating new littleEndByteBuf[333,029]
himanshug commented 5 years ago

So there were 6,920 intermediate segments to be merged together, with 80 columns each; merging them does need a lot of those buffers, about 6920 * 80. I think there are too many intermediate persists because of maxBytesInMemory, which in your case is, by default, 2g/6 = 341M. OnHeapIncrementalIndex overestimates the size due to the sketches in there and might be persisting segments too aggressively.

Here is what I would try out...

Set maxBytesInMemory = Integer.MAX_VALUE (to effectively disable it). Adjust maxRowsInMemory (default 1000000) to a lower value, or increase Xmx on the task process so that it doesn't get into too much GC; you might have to lower druid.worker.capacity=10 if Xmx is increased. Then find the right balance of maxRowsInMemory and Xmx on the task process so that you don't end up with thousands of intermediate segments to merge at the end, but a much lower number. That will drive down the off-heap buffer allocation. (A higher maxRowsInMemory means fewer segments to merge at the end, but higher JVM heap utilization.)
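
A sketch of a tuningConfig along these lines (2147483647 is Integer.MAX_VALUE, which effectively disables the byte-based limit; the maxRowsInMemory value is only a placeholder to be balanced against the task Xmx, not a number recommended in this thread):

    "tuningConfig": {
        "type": "kafka",
        "maxRowsPerSegment": 2500000,
        "resetOffsetAutomatically": true,
        "maxBytesInMemory": 2147483647,
        "maxRowsInMemory": 500000
    }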

quenlang commented 5 years ago

@himanshug Thank you so much! I set maxBytesInMemory = Integer.MAX_VALUE in my task and the number of intermediate segments dropped to 1,500. Now the task process works very well: direct memory usage is 4GB (originally almost 20GB) while the process hands off segments. Thank you! Have a good day!

clintropolis commented 5 years ago

likely related #6743

himanshug commented 5 years ago

https://github.com/apache/incubator-druid/pull/7919 should further reduce the memory utilization. However, that PR is independent, so I am closing this bug.