elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
69.94k stars 24.74k forks source link

CSV River strange behavior #9835

Closed proxylab closed 9 years ago

proxylab commented 9 years ago

Hi all,

I have recently replaced my bulk import mechanism (PHP and bulk API) with river csv. What I have noticed so far is a strange behavior that shows up after a certain index size (around 10.000.000 docs and ~1.5G disk size). So when the index is small everything works fine, I have set the bulk_size=1000, concurrent_requests=4 and bulk_threashold=10. After a couple of hours when index become bigger the whole process slows down and the import of .csv files becomes really slow. I have checked the elastic .log files and I figured out that the execution circle (polling time) of the import is interrupted. For instance here is what I get from the logs

logs

[2015-02-23 20:08:55,135][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] File has been processed 54eb7eed2bbe9.csv.processing [2015-02-23 20:08:55,136][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] File 54eb7eed2bbe9.csv.processing, processed lines 2300 [2015-02-23 20:08:55,137][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Processing file 54eb81de7e37f.csv [2015-02-23 20:08:55,146][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Going to execute new bulk composed of 1000 actions [2015-02-23 20:09:52,079][ERROR][marvel.agent.exporter ] [Domina] error sending data to [http://[0:0:0:0:0:0:0:0]:9200/.marvel-2015.02.23/_bulk]: [SocketTimeoutException[Read timed out]] [2015-02-23 20:09:54,170][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Executed bulk composed of 1000 actions [2015-02-23 20:09:54,286][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Going to execute new bulk composed of 1000 actions [2015-02-23 20:10:41,762][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Executed bulk composed of 1000 actions [2015-02-23 20:10:41,911][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Going to execute new bulk composed of 1000 actions [2015-02-23 20:10:52,411][ERROR][marvel.agent.exporter ] [Domina] error sending data to [http://[0:0:0:0:0:0:0:0]:9200/.marvel-2015.02.23/_bulk]: SocketTimeoutException[Read timed out] [2015-02-23 20:11:37,582][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Executed bulk composed of 1000 actions [2015-02-23 20:11:37,758][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] File has been processed 54eb81de7e37f.csv.processing [2015-02-23 20:11:37,759][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] File 54eb81de7e37f.csv.processing, processed lines 2985 [2015-02-23 20:11:37,759][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Processing file 54eb807bf351c.csv [2015-02-23 20:11:37,765][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Going to execute new bulk composed of 1000 actions [2015-02-23 20:12:02,830][ERROR][marvel.agent.exporter ] [Domina] error sending data to [http://[0:0:0:0:0:0:0:0]:9200/.marvel-2015.02.23/_bulk]: SocketTimeoutException[Read timed out] [2015-02-23 20:12:30,479][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Executed bulk composed of 1000 actions [2015-02-23 20:12:30,536][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Going to execute new bulk composed of 1000 actions [2015-02-23 20:13:03,132][ERROR][marvel.agent.exporter ] [Domina] error sending data to [http://[0:0:0:0:0:0:0:0]:9200/.marvel-2015.02.23/_bulk]: [SocketTimeoutException[Read timed out]] [2015-02-23 20:13:24,458][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Executed bulk composed of 1000 actions [2015-02-23 20:13:24,581][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Going to execute new bulk composed of 1000 actions [2015-02-23 20:14:03,423][ERROR][marvel.agent.exporter ] [Domina] error sending data to [http://[0:0:0:0:0:0:0:0]:9200/.marvel-2015.02.23/_bulk]: SocketTimeoutException[Read timed out] [2015-02-23 20:14:12,914][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Executed bulk composed of 1000 actions [2015-02-23 20:14:13,010][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] File has been processed 54eb807bf351c.csv.processing [2015-02-23 20:14:13,010][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] File 54eb807bf351c.csv.processing, processed lines 2924 [2015-02-23 20:14:13,011][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Processing file 54eb7eb509a30.csv [2015-02-23 20:14:13,032][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Going to execute new bulk composed of 1000 actions [2015-02-23 20:15:11,204][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Executed bulk composed of 1000 actions [2015-02-23 20:15:11,311][INFO ][org.agileworks.elasticsearch.river.csv.CSVRiver] [Domina] [csv][maxweb] Going to execute new bulk composed of 1000 actions [2015-02-23 20:15:13,741][ERROR][marvel.agent.exporter ] [Domina] error sending data to [http://[0:0:0:0:0:0:0:0]:9200/.marvel-2015.02.23/_bulk]: SocketTimeoutException[Read timed out]

As you can see there is no accuracy between time periods. The one circle ends at 2015-02-23 20:13:24 and the next start at 2015-02-23 20:14:12. Next you can find the csv river and index settings

CSV River

{
    "type": "csv",
    "csv_file": {
        "folder": "/vagrant/CSV/",
        "filename_pattern": ".*\.csv$",
        "poll": "1m",
        "fields": [
            "serverId",
            "duration",
            "requestTime",
            "responseTime",
            "statementType",
            "isRealQuery",
            "queryFailed",
            "sqlQuery",
            "transactionId",
            "clientName",
            "serverName",
            "serverUniqueName",
            "affectedTables",
            "queryError",
            "canonCommandType",
            "canonicalId"
        ],
        "first_line_is_header": "false",
        "concurrent_requests": "4",
        "charset": "UTF-8"
    },
    "index": {
        "index": "maxweb",
        "type": "queries",
        "bulk_size": 1000,
        "bulk_threshold": 10
    }
}

Index mapping

{
    "mappings": {
        "queries": {
            "transform": {
                "script": "ctx._source['affectedTables'] = ctx._source['affectedTables']?.tokenize(',')",
                "lang": "groovy"
            },
            "_all": {
                "enabled": false
            },
            "_source": {
                "compress": false
            },
            "properties": {
                "affectedTables": {
                    "type": "string",
                    "index": "not_analyzed",
                    "copy_to": [
                        "suggest_tables"
                    ]
                },
                "canonCommandType": {
                    "type": "integer",
                    "index": "no"
                },
                "canonicalId": {
                    "type": "string",
                    "index": "no"
                },
                "clientName": {
                    "type": "string",
                    "index": "no"
                },
                "duration": {
                    "type": "double",
                    "doc_values": true
                },
                "isRealQuery": {
                    "type": "string"
                },
                "queryError": {
                    "type": "string",
                    "index": "no"
                },
                "queryFailed": {
                    "type": "boolean"
                },
                "requestTime": {
                    "type": "double",
                    "doc_values": true
                },
                "responseTime": {
                    "type": "double",
                    "index": "no"
                },
                "serverId": {
                    "type": "long",
                    "doc_values": true
                },
                "serverName": {
                    "type": "string",
                    "index": "no"
                },
                "serverUniqueName": {
                    "type": "string",
                    "index": "no"
                },
                "sqlQuery": {
                    "type": "string",
                    "norms": {
                        "enabled": false
                    }
                },
                "statementType": {
                    "type": "integer",
                    "doc_values": true
                },
                "suggest_tables": {
                    "type": "completion",
                    "analyzer": "simple",
                    "payloads": false,
                    "preserve_separators": true,
                    "preserve_position_increments": true,
                    "max_input_length": 50
                }
            }
        }
    }
}

elasticasearch.yml

index.refresh_interval: 30s index.translog.flush_threshold_ops: 50000 index.translog.flush_threshold_size: 512mb indices.fielddata.cache.size: 20% indices.cache.filter.size: 20% indices.memory.index_buffer_size: 40% index.merge.scheduler.max_thread_count : 1 bootstrap.mlockall: true

/etc/sysconfig/elasticsearch

MAX_LOCKED_MEMORY=unlimited MAX_OPEN_FILES=65535 ES_JAVA_OPTS=-server ES_HEAP_SIZE=512m

index status

{
    "_shards": {
        "total": 1,
        "successful": 1,
        "failed": 0
    },
    "indices": {
        "maxweb": {
            "index": {
                "primary_size_in_bytes": 3413521092,
                "size_in_bytes": 3413521092
            },
            "translog": {
                "operations": 4423
            },
            "docs": {
                "num_docs": 17886624,
                "max_doc": 17886624,
                "deleted_docs": 0
            },
            "merges": {
                "current": 0,
                "current_docs": 0,
                "current_size_in_bytes": 0,
                "total": 29,
                "total_time_in_millis": 28204,
                "total_docs": 262490,
                "total_size_in_bytes": 60109517
            },
            "refresh": {
                "total": 158,
                "total_time_in_millis": 15612
            },
            "flush": {
                "total": 3,
                "total_time_in_millis": 23029
            },
            "shards": {
                "0": [{
                    "routing": {
                        "state": "STARTED",
                        "primary": true,
                        "node": "oYGAJctoTTmSU1wD021byA",
                        "relocating_node": null,
                        "shard": 0,
                        "index": "maxweb"
                    },
                    "state": "STARTED",
                    "index": {
                        "size_in_bytes": 3413521092
                    },
                    "translog": {
                        "id": 1424570700536,
                        "operations": 4423
                    },
                    "docs": {
                        "num_docs": 17886624,
                        "max_doc": 17886624,
                        "deleted_docs": 0
                    },
                    "merges": {
                        "current": 0,
                        "current_docs": 0,
                        "current_size_in_bytes": 0,
                        "total": 29,
                        "total_time_in_millis": 28204,
                        "total_docs": 262490,
                        "total_size_in_bytes": 60109517
                    },
                    "refresh": {
                        "total": 158,
                        "total_time_in_millis": 15612
                    },
                    "flush": {
                        "total": 3,
                        "total_time_in_millis": 23029
                    }
                }]
            }
        }
    }
}

index stats

{
         "primaries": {
            "docs": {
               "count": 17890687,
               "deleted": 0
            },
            "store": {
               "size_in_bytes": 3416809219,
               "throttle_time_in_millis": 669
            },
            "indexing": {
               "index_total": 76407,
               "index_time_in_millis": 9773626,
               "index_current": 2,
               "delete_total": 0,
               "delete_time_in_millis": 0,
               "delete_current": 0,
               "noop_update_total": 0,
               "is_throttled": false,
               "throttle_time_in_millis": 0
            },
            "get": {
               "total": 0,
               "time_in_millis": 0,
               "exists_total": 0,
               "exists_time_in_millis": 0,
               "missing_total": 0,
               "missing_time_in_millis": 0,
               "current": 0
            },
            "search": {
               "open_contexts": 0,
               "query_total": 0,
               "query_time_in_millis": 0,
               "query_current": 0,
               "fetch_total": 0,
               "fetch_time_in_millis": 0,
               "fetch_current": 0
            },
            "merges": {
               "current": 0,
               "current_docs": 0,
               "current_size_in_bytes": 0,
               "total": 31,
               "total_time_in_millis": 29507,
               "total_docs": 281907,
               "total_size_in_bytes": 64514039
            },
            "refresh": {
               "total": 165,
               "total_time_in_millis": 16431
            },
            "flush": {
               "total": 3,
               "total_time_in_millis": 23029
            },
            "warmer": {
               "current": 0,
               "total": 368,
               "total_time_in_millis": 119
            },
            "filter_cache": {
               "memory_size_in_bytes": 0,
               "evictions": 0
            },
            "id_cache": {
               "memory_size_in_bytes": 0
            },
            "fielddata": {
               "memory_size_in_bytes": 0,
               "evictions": 0
            },
            "percolate": {
               "total": 0,
               "time_in_millis": 0,
               "current": 0,
               "memory_size_in_bytes": -1,
               "memory_size": "-1b",
               "queries": 0
            },
            "completion": {
               "size_in_bytes": 32864
            },
            "segments": {
               "count": 26,
               "memory_in_bytes": 9095212,
               "index_writer_memory_in_bytes": 324016,
               "index_writer_max_memory_in_bytes": 103887667,
               "version_map_memory_in_bytes": 22792,
               "fixed_bit_set_memory_in_bytes": 0
            },
            "translog": {
               "operations": 8246,
               "size_in_bytes": 4267399
            },
            "suggest": {
               "total": 0,
               "time_in_millis": 0,
               "current": 0
            },
            "query_cache": {
               "memory_size_in_bytes": 0,
               "evictions": 0,
               "hit_count": 0,
               "miss_count": 0
            }
         },
         "total": {
            "docs": {
               "count": 17890687,
               "deleted": 0
            },
            "store": {
               "size_in_bytes": 3416809219,
               "throttle_time_in_millis": 669
            },
            "indexing": {
               "index_total": 76407,
               "index_time_in_millis": 9773626,
               "index_current": 2,
               "delete_total": 0,
               "delete_time_in_millis": 0,
               "delete_current": 0,
               "noop_update_total": 0,
               "is_throttled": false,
               "throttle_time_in_millis": 0
            },
            "get": {
               "total": 0,
               "time_in_millis": 0,
               "exists_total": 0,
               "exists_time_in_millis": 0,
               "missing_total": 0,
               "missing_time_in_millis": 0,
               "current": 0
            },
            "search": {
               "open_contexts": 0,
               "query_total": 0,
               "query_time_in_millis": 0,
               "query_current": 0,
               "fetch_total": 0,
               "fetch_time_in_millis": 0,
               "fetch_current": 0
            },
            "merges": {
               "current": 0,
               "current_docs": 0,
               "current_size_in_bytes": 0,
               "total": 31,
               "total_time_in_millis": 29507,
               "total_docs": 281907,
               "total_size_in_bytes": 64514039
            },
            "refresh": {
               "total": 165,
               "total_time_in_millis": 16431
            },
            "flush": {
               "total": 3,
               "total_time_in_millis": 23029
            },
            "warmer": {
               "current": 0,
               "total": 368,
               "total_time_in_millis": 119
            },
            "filter_cache": {
               "memory_size_in_bytes": 0,
               "evictions": 0
            },
            "id_cache": {
               "memory_size_in_bytes": 0
            },
            "fielddata": {
               "memory_size_in_bytes": 0,
               "evictions": 0
            },
            "percolate": {
               "total": 0,
               "time_in_millis": 0,
               "current": 0,
               "memory_size_in_bytes": -1,
               "memory_size": "-1b",
               "queries": 0
            },
            "completion": {
               "size_in_bytes": 32864
            },
            "segments": {
               "count": 26,
               "memory_in_bytes": 9095212,
               "index_writer_memory_in_bytes": 324016,
               "index_writer_max_memory_in_bytes": 103887667,
               "version_map_memory_in_bytes": 22792,
               "fixed_bit_set_memory_in_bytes": 0
            },
            "translog": {
               "operations": 8246,
               "size_in_bytes": 4267399
            },
            "suggest": {
               "total": 0,
               "time_in_millis": 0,
               "current": 0
            },
            "query_cache": {
               "memory_size_in_bytes": 0,
               "evictions": 0,
               "hit_count": 0,
               "miss_count": 0
            }
         }
      }

*\ Is store.throttle_time_in_millis: 669 cosidered as an important factor? I am asking since I use doc_values on my mapping so maybe I am pushig too much my little VM :)

Finally I did notice some high I/O traffic with iotop

screenshot from 2015-02-23 22 48 22

Here is the sys info

Vagrant OS: CentOS release 6.6 RAM: 2GB CPU: Intel(R) Core(TM) i5-2430M CPU @ 2.40GHz (2 cores)

Thanks a lot for your time

Regards, Alex

The proxylab team | http://www.proxylab.io/

dadoonet commented 9 years ago

@clintongormley should we close this one as 2.0 does not support rivers ?

clintongormley commented 9 years ago

Yes. Closing