brewster / elastictastic

Object-document mapper and lightweight API adapter for ElasticSearch
MIT License

Doubt: Where to set the analyzer? #25

Open mrcasals opened 11 years ago

mrcasals commented 11 years ago

Hi,

I'm moving from Tire to Elastictastic to work with an ElasticSearch database and I'm having problems setting the analyzer I want to use. With Tire, I used the following:

@index.create :settings => {
  :index => {
    :analysis => {
      :analyzer => {
        :default => {
          :type => 'snowball',
          :language => 'Catalan'
        }
      }
    }
  }
}

I've tried to set it on the mapping, but it doesn't work:

class Product
  include Elastictastic::Document

  field :name, analyzer: {type: "snowball", language: "Catalan"}
  field :description, analyzer: {type: "snowball", language: "Catalan"}

end

I tried to set a default analyzer, but I could not find any reference to this in the README or the code docs. I really need to set this analyzer; any idea how to achieve that?

Thank you!

outoftime commented 11 years ago

Hi,

The code snippet that sets the analyzer on the individual fields is correct. Did you run Product.sync_mapping? This is required to have your configuration in Elastictastic propagated to the index in ElasticSearch.

Setting a default analyzer is not currently supported, but it should be. Any interest in submitting a patch? If not, I should be able to get to it at some point.

Mat

mrcasals commented 11 years ago

Hi @outoftime, thanks for the quick reply!

In my test file, I save some products and then call Product.sync_mapping. RSpec complains, though...

Failure/Error: Product.sync_mapping
   Elastictastic::ServerError::MapperParsingException:
   [Analyzer [{type=snowball, language=Catalan}] not found for field [name]]

I get the same result if I first sync mappings and then save the records.

outoftime commented 11 years ago

I see. My apologies; I didn't fully absorb your initial question. So what you are really trying to do is define analyzer settings on your index -- this would be required whether you were setting up the default analyzer or defining a custom one that you could use for certain fields.

I'm sorry to say that as of now, Elastictastic doesn't provide any abstraction on top of the index settings API. You can use the connection object to make an HTTP request directly (this is essentially just a REST client). So something like this:

Elastictastic.client.connection.put("/#{Elastictastic.config.default_index}", index: {analysis: {analyzer: {default: {type: "snowball", language: "Catalan"}}}})

If your index settings get any more complex, I'd suggest extracting this out into a YAML file and then loading it and sending its contents in that request instead of hard-coding it. (This is what the app I work on does, although we also use templates, which adds a bit of additional complexity.)
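
A minimal sketch of the YAML approach described above, assuming the settings live in a YAML document shaped like the hash in the request. The inline YAML text, the `load_index_settings` helper, and the file name in the comment are hypothetical, not part of Elastictastic; the final request is left commented out because it needs a running ElasticSearch.

```ruby
require 'yaml'

# Hypothetical settings; in an app this text would live in a file such as
# config/elasticsearch_settings.yml (name and location are assumptions).
settings_yaml = <<~YAML
  index:
    analysis:
      analyzer:
        default:
          type: snowball
          language: Catalan
YAML

# Hypothetical helper: parse the YAML text into a plain Hash (string keys).
# For a file on disk, YAML.load_file(path) would do the same job.
def load_index_settings(yaml_text)
  YAML.safe_load(yaml_text)
end

settings = load_index_settings(settings_yaml)

# Send the loaded settings with the same call as in the one-liner above
# (requires a running ElasticSearch and a configured Elastictastic):
# Elastictastic.client.connection.put("/#{Elastictastic.config.default_index}", settings)
```

Keeping the settings in YAML means the analyzer configuration can change without touching the code that creates the index.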

Obviously the above is not very user-friendly and it's something Elastictastic should support in the future. Let's leave this ticket open to generally capture abstraction of index settings. Of course, patch submissions are always welcome : )

outoftime commented 11 years ago

Oh, I should also mention -- you'll still want to do a Product.sync_mapping before indexing any documents.
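
For the other case mentioned in this thread, a custom analyzer used only on certain fields, the pieces might fit together roughly as sketched here. The exception earlier in the thread suggests ElasticSearch looks up the `analyzer` value as a name, so the sketch defines a named analyzer in the index settings and refers to it by that name. The name `catalan_snowball` is made up for illustration, and passing a string to `field ... analyzer:` is assumed usage rather than something confirmed in this thread. The calls that need a live server are left as comments.

```ruby
# Index settings defining a named analyzer. "catalan_snowball" is a
# hypothetical name chosen for this example.
index_settings = {
  index: {
    analysis: {
      analyzer: {
        catalan_snowball: { type: 'snowball', language: 'Catalan' }
      }
    }
  }
}

# 1. Create the index with these settings (needs a running ElasticSearch):
# Elastictastic.client.connection.put("/#{Elastictastic.config.default_index}", index_settings)
#
# 2. Refer to the analyzer by name in the document class (assumed usage):
# class Product
#   include Elastictastic::Document
#   field :name, analyzer: 'catalan_snowball'
#   field :description, analyzer: 'catalan_snowball'
# end
#
# 3. Push the mapping before indexing any documents:
# Product.sync_mapping

# Names that field definitions could reference once the index exists.
analyzer_names = index_settings[:index][:analysis][:analyzer].keys
```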