imotov / elasticsearch-facet-script

Fully Scriptable Facets for ElasticSearch
Apache License 2.0

h1. Script Facet Plugin for ElasticSearch

h2. Introduction

p. The script facet plugin provides fully scriptable facets for elasticsearch.

h2. Compatibility

|_. Script Facet Plugin |_. Elasticsearch |
| 1.1.2 | 0.90.3 -> master |
| 1.1.1 | 0.90.2 |
| 1.1.0 | 0.90.0.Beta1 -> 0.90.1 |
| 1.0.1 | 0.19.10 -> 0.20.99 |
| 1.0.0 | 0.19.10 -> 0.20.99 |

h2. Usage

p. In order to install the plugin, simply run the following command in the elasticsearch home directory:

bin/plugin -install facet-script -url http://dl.bintray.com/content/imotov/elasticsearch-plugins/elasticsearch-facet-script-VERSION.zip

p. where @VERSION@ is the version of the plugin from the compatibility table. For example, to install version @1.1.2@, run:

bin/plugin -install facet-script -url http://dl.bintray.com/content/imotov/elasticsearch-plugins/elasticsearch-facet-script-1.1.2.zip

p. The script facet plugin can be used for quick custom facet prototyping. The script facet uses up to four scripts (@init_script@, @map_script@, @combine_script@ and @reduce_script@) to initialize, collect and aggregate the facets. A typical script facet request looks like this:

"facets": {
    "facet1": {
        "script": {
            "init_script" : "my_init",
            "map_script": "my_map",
            "combine_script": "my_combine",
            "reduce_script" : "my_reduce",
            "params" : {
                "facet" : [],
                "param1" : "value 1"
            },
            "reduce_params" : {
                "reduce_param1" : "value 1"
            }
        }
    }
}
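
p. As a minimal sketch, the same facet definition can be assembled as a Python dictionary and serialized, which guarantees that the resulting body is well-formed JSON. The surrounding @match_all@ query and the variable names are assumptions made for illustration; the script names are the placeholders used in the request above.

```python
import json

# Facet definition mirroring the request shown above.
script_facet = {
    "script": {
        "init_script": "my_init",
        "map_script": "my_map",
        "combine_script": "my_combine",
        "reduce_script": "my_reduce",
        "params": {"facet": [], "param1": "value 1"},
        "reduce_params": {"reduce_param1": "value 1"},
    }
}

# Assumption: the facet is attached to a search that matches all documents.
search_body = {
    "query": {"match_all": {}},
    "facets": {"facet1": script_facet},
}

body_json = json.dumps(search_body, indent=4)
print(body_json)
```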

p. A script facet execution can be represented using the following pseudocode:

facets = [];
foreach(shard in shards) {
    init_script(); // Executed once per shard
    foreach(record in search_results(shard)) {
        // Init _field and doc lookup from the record
        map_script(); // Executed once per record
    }
    facets.add(combine_script()); // Executed once per shard after all records are processed
}
reduce_script(facets); // Executed once per facet request

p. The @init_script@, @map_script@ and @combine_script@ scripts are executed on the nodes where shards are allocated. The @reduce_script@ is executed on the node that received the client's request.
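
p. The execution flow described above can be tried out with a small Python simulation. This is only a sketch: plain Python functions stand in for the four scripts, lists of dictionaries stand in for shards, and the @status@ field and all function names are hypothetical. It does not call the plugin or Elasticsearch.

```python
import copy


def run_script_facet(shards, init_script, map_script, combine_script,
                     reduce_script, params, reduce_params=None):
    """Simulate one script facet request over in-memory 'shards'."""
    facets = []
    for shard in shards:
        # Each shard starts from its own copy of the request's params.
        shard_params = copy.deepcopy(params)
        init_script(shard_params)                     # once per shard
        for record in shard:
            map_script(shard_params, record)          # once per record
        facets.append(combine_script(shard_params))   # once per shard
    # Runs once, standing in for the node that received the request.
    return reduce_script(facets, reduce_params or {})


# Hypothetical scripts: count records per value of a "status" field.
def my_init(params):
    params["facet"] = {}


def my_map(params, record):
    counts = params["facet"]
    counts[record["status"]] = counts.get(record["status"], 0) + 1


def my_combine(params):
    return params["facet"]


def my_reduce(facets, reduce_params):
    total = {}
    for shard_counts in facets:
        for key, count in shard_counts.items():
            total[key] = total.get(key, 0) + count
    return total


shards = [
    [{"status": "ok"}, {"status": "error"}],
    [{"status": "ok"}],
]
result = run_script_facet(shards, my_init, my_map, my_combine, my_reduce,
                          params={"facet": []})
print(result)  # {'ok': 2, 'error': 1}
```

p. Each shard produces its own intermediate result, and only the reduce step sees all of them together.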

p. The @init_script@, @map_script@ and @combine_script@ scripts can access the parameters specified in the @params@ field of the request. These scripts can also use the node client through the @_client@ variable and the search context through the @_ctx@ variable. The @map_script@ can access the current record using the standard "document, field and source lookup mechanism":http://www.elasticsearch.org/guide/reference/modules/scripting.html.

p. The content of the @params@ field is initialized with the values specified in the facet request at the beginning of processing for each shard and is then preserved between all script calls within that shard. The @init_script@ can perform additional initialization of the @params@ map, the @map_script@ can use the @params@ map to accumulate processing results, and the @combine_script@ can retrieve the accumulated results from the @params@ map and combine them into an intermediate facet for the processed shard. The return values of the @combine_script@ calls for all shards are sent to the node where the @reduce_script@ is running and accumulated into an array list that is passed to the @reduce_script@ as the @facets@ parameter. The return value of the @reduce_script@ is returned to the user as the result of the facet query. Note that the return values of the @combine_script@ and @reduce_script@ have to be JSON serializable, which means they can contain only primitive data types, @java.util.Date@, @byte[]@, @Object[]@, @java.util.List@, and @java.util.Map@.
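
p. To make the serialization constraint concrete, the following sketch checks a value against a Python analogue of the allowed types (scalars, dates, byte strings, lists and maps). The mapping from Java types to Python types and the function name @is_facet_serializable@ are assumptions for illustration; this is not the check the plugin itself performs.

```python
from datetime import date, datetime

# Assumed Python stand-ins for primitives, java.util.Date and byte[].
_SCALARS = (type(None), bool, int, float, str, bytes, bytearray, date, datetime)


def is_facet_serializable(value):
    """Return True if value contains only scalars, lists and string-keyed maps."""
    if isinstance(value, _SCALARS):
        return True
    if isinstance(value, (list, tuple)):      # stand-in for Object[] and List
        return all(is_facet_serializable(item) for item in value)
    if isinstance(value, dict):               # stand-in for Map
        return all(isinstance(key, str) and is_facet_serializable(item)
                   for key, item in value.items())
    return False


print(is_facet_serializable({"counts": [1, 2.5, "x"]}))  # True
print(is_facet_serializable({"seen": {1, 2}}))           # False: sets are not allowed
```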

p. Parameters for the @reduce_script@ can be supplied in the @reduce_params@ object.

p. The only mandatory parameter of the script facet is @map_script@. By default, the @init_script@ doesn't do anything, the @combine_script@ returns the variable named @facet@ and @reduce_script@ simply returns the array of the facets that it received from the shards.
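
p. The effect of these defaults can be sketched in Python for a request that supplies only a map script. The simulation assumes that the default @combine_script@ reads @facet@ from the @params@ map; the function names and the @message@ records are hypothetical.

```python
import copy


def run_with_defaults(shards, map_script, params):
    """Simulate a facet request where only map_script is provided."""
    facets = []
    for shard in shards:
        shard_params = copy.deepcopy(params)      # default init_script: no-op
        for record in shard:
            map_script(shard_params, record)
        facets.append(shard_params.get("facet"))  # default combine_script
    return facets                                 # default reduce_script


# Hypothetical map script: record the length of each message.
def collect_lengths(params, record):
    params["facet"].append(len(record["message"]))


shards = [
    [{"message": "hello"}, {"message": "hi"}],
    [{"message": "hey"}],
]
result = run_with_defaults(shards, collect_lengths, {"facet": []})
print(result)  # [[5, 2], [3]]
```

p. This is why a request that relies on the defaults typically passes an initial @facet@ value in @params@: the map script accumulates into it, and the client receives one entry per shard.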

h2. Examples

p. The following request calculates letter frequencies for the letters 'A'-'Z' in the field @message@.