algolia / algoliasearch-jekyll

⚠ DEPRECATED Use jekyll-algolia instead.
https://community.algolia.com/jekyll-algolia/
MIT License

Exclude div classes from index #22

Closed: iloveip closed this issue 8 years ago

iloveip commented 8 years ago

Hello,

I have a question. Is there any way to exclude certain HTML elements from the index? For example, certain div and p classes?

Also, how can I limit the number of words in search results?

Thank you very much in advance!

pixelastic commented 8 years ago

Hello,

You can configure the record_css_selector to match only the elements you would like to index. For example, if you only want the div with class myClass, you would put:

algolia:
  record_css_selector: 'div.myClass'

Now, if it's easier to exclude elements than to whitelist them, you can use the custom_hook_each method, which takes the item and node as input. node is a Nokogiri node, so you can simply check whether it matches the classes you want to exclude and return nil from the method. Returning nil will prevent the element from being indexed.

Regarding the limit of words in the search results, you have several ways to do that:

Hope that helps

iloveip commented 8 years ago

Hi @pixelastic,

Thank you very much for your reply. I've done the second part (added attributesToSnippet: 20; to _config.yml file).

Do you have any code examples I could look at for custom_hook_each method? I would like to exclude one p with a class and one div with a class.

pixelastic commented 8 years ago

I don't have an example, but here is how I would do it:

First, create a ./_plugins/search.rb file in your Jekyll root directory. Any Ruby file in ./_plugins will be loaded by Jekyll. It doesn't matter if you name it search.rb or anything else, though.

In that file, we'll override the custom_hook_each of the plugin, like this:

class AlgoliaSearchRecordExtractor
  def custom_hook_each(item, node)
    # To get the node name (p, div, etc.)
    puts node.name
    # To get the value of the class attribute (nil if no class is defined)
    puts node.attr('class')

    # Return item to keep the element, or nil to discard it
    item
  end
end

I will add this example to the documentation, to make it clearer for the next user :)

iloveip commented 8 years ago

Hi @pixelastic,

Thank you very much for your explanation and reply! Would something like this be correct?

class AlgoliaSearchRecordExtractor
  def custom_hook_each(item, node)
    puts node.attr('related-article')
    nil
  end

  def custom_hook_each(item, node)
    puts node.attr('warning')
    nil
  end
end

iloveip commented 8 years ago

Hi @pixelastic,

In addition to my last question, is it ok to put just:

class AlgoliaSearchRecordExtractor
  def custom_hook_each(item, node)
    puts node.attr('related-article')
    nil
  end
end

Or do I have to put both, node.name and node.attr like this:

class AlgoliaSearchRecordExtractor
  def custom_hook_each(item, node)
    puts node.div
    puts node.attr('related-article')
    nil
  end
end

Thank you very much in advance!

pixelastic commented 8 years ago

Actually, puts only displays the variable. I just used that as an example. To do what you have in mind, I think this should work:

class AlgoliaSearchRecordExtractor
  def custom_hook_each(item, node)
    if node.name == 'div' && node.attr('class').include?('related-article')
      return nil
    end
    item
  end
end

This would exclude all div elements that have the class related-article. Depending on what you exclude, you'll have to adapt it. You can add several if statements in the method, and return nil whenever you want to exclude the element.
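
As a rough sketch of the "several if" idea, the rules can also be kept in a small list and checked in one pass. The EXCLUDED constant and the FakeNode stand-in are assumptions made for illustration only (FakeNode just mimics the name and attr calls of a Nokogiri node so the snippet runs outside Jekyll), and the p/warning pair is taken from the class names mentioned earlier in this thread. In a real ./_plugins file only the AlgoliaSearchRecordExtractor part would be needed.

```ruby
# Hypothetical stand-in for a Nokogiri node, used only so this sketch runs
# without Jekyll or Nokogiri. It mimics the two calls used below.
FakeNode = Struct.new(:name, :attributes) do
  def attr(key)
    attributes[key]
  end
end

class AlgoliaSearchRecordExtractor
  # Assumed list of [tag, class] pairs to exclude; adapt it to your markup.
  EXCLUDED = [
    ['div', 'related-article'],
    ['p', 'warning']
  ].freeze

  def custom_hook_each(item, node)
    # attr('class') is nil when the element has no class attribute,
    # so convert it to a string before splitting it into class names.
    classes = node.attr('class').to_s.split
    excluded = EXCLUDED.any? do |tag, css_class|
      node.name == tag && classes.include?(css_class)
    end
    # nil discards the element, item keeps it.
    excluded ? nil : item
  end
end

extractor = AlgoliaSearchRecordExtractor.new
item = { text: 'Some content' }
extractor.custom_hook_each(item, FakeNode.new('p', { 'class' => 'warning' })) # discarded (nil)
extractor.custom_hook_each(item, FakeNode.new('p', {}))                       # kept (item)
```

Adding another exclusion then only means adding a pair to the list rather than another if branch.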

iloveip commented 8 years ago

@pixelastic thank you very much! I tried the code above, but I get an undefined method error:

jekyll 3.0.0.pre.rc1 | Error:  undefined method `include?' for nil:NilClass

pixelastic commented 8 years ago

class AlgoliaSearchRecordExtractor
  def custom_hook_each(item, node)
    class_name = node.attr('class')
    if node.name == 'div' && (!class_name.nil? && class_name.include?('related-article'))
      return nil
    end
    item
  end
end

This might work better. Note that this is more related to Ruby than to the plugin itself.
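
One more Ruby detail worth keeping in mind: include? on a String is a substring test, so it would also match a class such as unrelated-article, while comparing with == only matches when the element has exactly one class. A minimal sketch of a whole-name check follows; class_listed? is a hypothetical helper, not something the plugin provides.

```ruby
# Hypothetical helper, not part of the plugin: true when class_attr
# (the raw value of a class attribute, possibly nil) contains name
# as a whole, whitespace-separated class name.
def class_listed?(class_attr, name)
  # nil.to_s is "", and "".split is [], so a missing class is handled too.
  class_attr.to_s.split.include?(name)
end

class_listed?(nil, 'related-article')                   # => false (no class attribute)
class_listed?('related-article', 'related-article')     # => true  (exact match)
class_listed?('box related-article', 'related-article') # => true  (one of several classes)
class_listed?('unrelated-article', 'related-article')   # => false (substring only)
```

Inside custom_hook_each, the condition could then read node.name == 'div' && class_listed?(node.attr('class'), 'related-article').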

iloveip commented 8 years ago

@pixelastic thank you very much for your help and I'm sorry for the trouble.

I edited the code as follows and it's working now:

class AlgoliaSearchRecordExtractor
  def custom_hook_each(item, node)
    if node.name == 'div' && node.attr('class') == 'related-article'
      return nil
    end
    item
  end
end

pixelastic commented 8 years ago

No problem, glad to see it works :)