zaziemo closed this issue 3 years ago
Some links: https://logz.io/blog/upgrade-elasticsearch-5/
"Download the latest version package; Install Java 8; Stop es instance; Install the new package; Put the new configuration; Remove logging.yml (if you have); Remove unused site plugin; Monitor the log; Release the beast" https://www.linkedin.com/pulse/upgrading-elasticsearch-2x-5x-gian-giovani
Breaking changes: https://www.elastic.co/guide/en/elasticsearch/reference/5.0/breaking-changes-5.0.html
Maybe this migration helper plugin can help as well: https://github.com/elastic/elasticsearch-migration/blob/master/README.asciidoc
Already did some work on this branch: https://github.com/rubymonsters/speakerinnen_liste/compare/mah-upgrade-elasticsearch
Steps towards upgrading Elasticsearch locally:
More links that might be helpful:
https://www.elastic.co/guide/en/elasticsearch/reference/5.0/breaking_50_suggester.html
https://www.elastic.co/guide/en/elasticsearch/reference/5.0/search-suggesters-completion.html
https://www.elastic.co/blog/strings-are-dead-long-live-strings
https://asquera.de/blog/2018-01-24/elasticsearch-speakerinnen/
http://elasticsearch-cheatsheet.jolicode.com/
https://medium.com/@mourjo_sen/a-detailed-comparison-between-autocompletion-strategies-in-elasticsearch-66cb9e9c62c4
https://hackernoon.com/elasticsearch-using-completion-suggester-to-build-autocomplete-e9c120cf6d87
http://www.kadrmasconcepts.com/blog/2015/04/25/typeahead-js-elasticsearch-and-rails/
@netagonen Hi Neta, about step 4 (Upgrade elasticsearch server and API in the docker container): I think all you have to do is specify a later version of the elasticsearch image in the docker-compose.yml. At the moment it is image: elasticsearch:2.4.5.
Here you can find all available image versions: go to https://hub.docker.com/_/elasticsearch and then search under 'tags' for your preferred version.
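As a rough illustration of the change described above, here is a minimal Ruby sketch that swaps the image tag in a compose file. The service name "elastic", the helper name and the target tag 6.7.2 are assumptions for the example; only the image: elasticsearch:2.4.5 line comes from the thread, and the tag should be checked against the Docker Hub tags page before use.

```ruby
require "yaml"

# Hypothetical excerpt of docker-compose.yml. The service name "elastic"
# is an assumption; only the image line is taken from the thread.
COMPOSE = <<~YAML
  services:
    elastic:
      image: elasticsearch:2.4.5
YAML

# Return the parsed compose config with the image tag of one service replaced.
def bump_elasticsearch_image(compose_yaml, new_tag, service: "elastic")
  config = YAML.safe_load(compose_yaml)
  image_name = config.fetch("services").fetch(service).fetch("image").split(":").first
  config["services"][service]["image"] = "#{image_name}:#{new_tag}"
  config
end

# "6.7.2" is only an example tag (assumption).
updated = bump_elasticsearch_image(COMPOSE, "6.7.2")
puts updated.to_yaml
```

In practice editing the single image line by hand achieves the same thing; the sketch only makes explicit which value changes.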
Hi @zaziemo, see also PR #1091.
There is still more work to do. Here is everything that is done and what is left to do:
First of all, it was very helpful to read this post in order to understand the searchable.rb file.
Upgrade:
- make setup, make sure bundle install is done successfully
- make dev, in the console run rake elasticsearch:import:all
- make up or docker-compose up to also see the logs

Problems we encountered and how we handled them:
- searchable.rb correctly, then made sure they are called correctly in the profiles_controller.rb.
  $ profiles = Profile.is_published.includes(:taggings, :translations).search("SEARCH_TERM", "", "", "").records
  $ profiles.response.results.as_json or profiles.response.response
- "_score":10.668072 and "_score":0.37613404.
- query_hash[:min_score] = 0.08 if Rails.env.production? and on local, with the new scoring system (and after removing the environment condition), min_score did not have any influence since all scores are above 0.08 anyway, so there were many more results.
- min_score threshold for some reason? Why do we want to filter out those results? Are we sure they are not relevant? We can test this on the latest data we have. For now this line is removed. But if we want it, then we need to set the min_score on production higher. 3.00 might be a good value, we need to test.
- de_bio and en_bio she gets a higher score than a speaker with a bio in only one language, even though the search term has the same frequency. The solution to this is to remove the tie_breaker, because it gives someone with a bio in both languages a higher score than someone with a bio in one language. See more here.
- searchable_spec.rb:98.
- wait_for_elastic script: when pushing to GitHub, Elasticsearch did not start in Travis, and it also didn't start on Alon's machine, so we added environment: - discovery.type=single-node to solve the problem (more info here) and also added the script to check that Elasticsearch has started before the tests run.
- volumes: - elastic_data:/usr/share/elasticsearch/data2 so version 6 could work. Fixed by deleting the old volume on the local machine with docker volume rm.
- query_hash we call for a suggest: option, so we would have suggestions for 0 matches (searchable.rb:57) but we don't use it in the controller. It was decided to delete it for now. We should deal with the case of 0 matches in another scope.

Questions and tasks left:
- min_score on production? If so, how much? I think it would be good to manually test without the min_score and see if the last profiles are relevant or not.
- @profiles after a search is empty.
- HashWrapper on Hashie::Mash to avoid printing these errors?
You are setting a key that conflicts with a built-in method Hashie::Mash#key defined in Hash. This can cause unexpected behavior when accessing the key as a property. You can still access the key via the #[] method.
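The wait_for_elastic step mentioned in the problems list boils down to polling Elasticsearch until it answers before the tests start. The following is a minimal Ruby sketch of that idea, not the project's actual script: the URL http://localhost:9200 (Elasticsearch's default HTTP port), the retry counts and both method names are assumptions.

```ruby
require "net/http"
require "uri"

# Default probe: true once the Elasticsearch HTTP endpoint answers with 2xx.
# The URL is an assumption (default port on localhost).
def elastic_up?(url = "http://localhost:9200")
  Net::HTTP.get_response(URI(url)).is_a?(Net::HTTPSuccess)
rescue SystemCallError, SocketError, Net::OpenTimeout, Net::ReadTimeout, EOFError
  false
end

# Poll until the probe succeeds or the attempts run out.
# Returns true if Elasticsearch came up, false otherwise.
def wait_for_elastic(attempts: 30, delay: 1, probe: method(:elastic_up?))
  attempts.times do |i|
    return true if probe.call
    sleep delay if i < attempts - 1
  end
  false
end
```

A CI step could then call something like abort("Elasticsearch did not start") unless wait_for_elastic before running the specs. The probe is injectable so the retry logic can be exercised without a running server.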
@netagonen @alonpeer concerning min_score: I think we need one, or probably some more tweaking of the tie_breaker or field weights. Can you explain how the profiles in the search results are ordered? Is it by score?
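To make the tie_breaker effect concrete, here is a small Ruby sketch of how a dis_max or multi_match (best_fields) query combines per-field scores: the best field's score plus tie_breaker times the scores of the other matching fields. It assumes the search uses such a query; the per-field scores and the tie_breaker value 0.3 are hypothetical.

```ruby
# Combined score: best field score plus tie_breaker times the
# scores of all other matching fields.
def combined_score(field_scores, tie_breaker)
  return 0.0 if field_scores.empty?

  best = field_scores.max
  best + tie_breaker * (field_scores.sum - best)
end

# Hypothetical per-field scores for the same term frequency in each bio.
both_languages = [2.0, 2.0] # term matches in de_bio and en_bio
one_language   = [2.0]      # term matches in en_bio only

[0.3, 0.0].each do |tie_breaker|
  puts format("tie_breaker=%.1f both=%.2f one=%.2f",
              tie_breaker,
              combined_score(both_languages, tie_breaker),
              combined_score(one_language, tie_breaker))
end
# prints:
# tie_breaker=0.3 both=2.60 one=2.00
# tie_breaker=0.0 both=2.00 one=2.00
```

With a tie_breaker of 0 (the default when it is omitted), only the best-matching field counts, so a bio in two languages no longer outranks a bio in one.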
I made tests with the following search terms:
1) for „Social Business“, being in :en
Localhost: 1,108 speakers. Examples of non-matching profiles (as I understand it): http://localhost:3000/en/profiles/nina-mohimi http://localhost:3000/en/profiles/arlene-buehler http://localhost:3000/en/profiles/alexandra-grassler
Production: 388 speakers (but there are also some profiles I would not consider matching the search terms)
2) „programmieren frauen“, being in :en
Localhost: 541. Examples of non-matching profiles (as I understand it): http://localhost:3000/en/profiles/anja-blodow http://localhost:3000/en/profiles/regula-simon http://localhost:3000/en/profiles/annika-peters
Not really matching but cool to list: http://localhost:3000/en/profiles/janina-tiedemann --> which is an argument for not being so strict with the matching terms
Production: 159
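The gap between the localhost and production counts fits how min_score works: hits scoring below the threshold are dropped, and by default the remaining hits come back ordered by _score, highest first. Below is a minimal Ruby sketch that mimics this on two sample hits. The scores are the two values quoted earlier in the thread; the hit ids, the multi_match query shape and both method names are assumptions, not the code in searchable.rb.

```ruby
# Hypothetical hits using the two score values quoted in the thread.
HITS = [
  { id: "a", _score: 0.37613404 },
  { id: "b", _score: 10.668072 }
].freeze

# Build a query hash, adding min_score only when a threshold is given.
# The multi_match shape is an assumption for illustration.
def build_query_hash(term, min_score: nil)
  query_hash = { query: { multi_match: { query: term, fields: %w[de_bio en_bio] } } }
  query_hash[:min_score] = min_score if min_score
  query_hash
end

# Mimic the effect on the result list: drop hits below min_score,
# then order the rest by _score, highest first.
def apply_min_score(hits, query_hash)
  threshold = query_hash.fetch(:min_score, 0.0)
  hits.select { |hit| hit[:_score] >= threshold }.sort_by { |hit| -hit[:_score] }
end

[0.08, 3.0].each do |threshold|
  kept = apply_min_score(HITS, build_query_hash("programmieren frauen", min_score: threshold))
  puts "min_score=#{threshold}: #{kept.map { |hit| hit[:id] }.join(', ')}"
end
# prints:
# min_score=0.08: b, a
# min_score=3.0: b
```

A threshold of 0.08 keeps both sample hits, while 3.0 keeps only the stronger one, so any value chosen for production would need to be tested against real data as suggested above.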
@netagonen @alonpeer Concerning autosuggestions: Here I don't get the same autosuggestions as on production which is good, because currently the suggester seems to be buggy if not broken :(
I would say it works as expected.
The suggested terms seem to be sorted by a score. Do you know what this score is about?
Solution for 0 Matches --> I opened a new issue (#1098). I think this is out of the scope for this one here.
@netagonen concerning cities: I think it is good to have the cities not analyzed. I faintly remember that we had problems with people having a city name as their real name. Besides, as the cities serve as filters, they are already covered.
@netagonen concerning the HashWrapper. Do I get it right that we have to add it somehow? If we have to, we should do it. This was one of the reasons we wanted to update elasticsearch in the first place (we were annoyed by these warnings.)
Hi @zaziemo. I just checked, and the Hashie::Mash#key warnings appear on my localhost when I'm on the master branch, but no longer appear on the neta/upgrade-elastic-6.7 branch. 🎉
The country filtering does not work anymore.
To reproduce: 1) search for anything 2) click on one country in the leftmost column. You get this view
--> I found the problem and will fix it. The country parameter has to be downcased: http://localhost:3000/de/topics?filter_countries=FR&search=frauen+programmieren
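A minimal Ruby sketch of the downcasing fix, assuming the filter value arrives as a plain string parameter like in the URL above. The helper name normalize_country_filter is hypothetical, and where the real fix lives in the app is not stated in the thread.

```ruby
# Normalise a country filter value such as "FR" to its lowercase form.
# Returns nil for blank input so callers can skip the filter.
def normalize_country_filter(value)
  code = value.to_s.strip.downcase
  code.empty? ? nil : code
end

# Mirrors the query string from the URL above.
params = { filter_countries: "FR", search: "frauen programmieren" }
puts normalize_country_filter(params[:filter_countries]) # prints "fr"
```

Normalising in one place keeps the filter working regardless of whether the link or a hand-typed URL supplies the code in upper or lower case.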
to get rid of Hashie Mash warning and keep up with new development