IsraelHikingMap / Site

Israel Hiking Map has maps, route planning, and travel information for Israel. This repository holds the files needed for running the Israel Hiking Map site and apps.
https://israelhiking.osm.org.il/

Upgrade to Elasticsearch 7.x or later #1258

Closed: HarelM closed this issue 4 years ago

HarelM commented 4 years ago

Infra

Elasticsearch 5.6 is no longer supported or maintained, so it is important to migrate off this version. We need to migrate to 6.8 first in order to keep the old indices valid, and only then migrate to 7.8. There is a known issue that needs to be taken into account when upgrading (https://github.com/elastic/elasticsearch-net/issues/4791). This is also an opportunity to move ES to Docker.

HarelM commented 4 years ago

The following is the upgrade script I wrote to do all the relevant changes in the database:

Manual steps:

  1. Migrate the server to run Elasticsearch 5.6 in Docker. I believe this is the trickiest part.
  2. Upgrade the Docker container to Elasticsearch 6.8.12.
  3. Run the 6.8 part of the script to reindex and remove unnecessary indices.
  4. Upgrade the Docker container to Elasticsearch 7.9.
  5. Run the 7.9 part of the script.
  6. ...?

# Upgrade to 6.8

Invoke-WebRequest -Method DELETE -Uri "http://localhost:9200/.monitoring-*"
Invoke-WebRequest -Method DELETE -Uri "http://localhost:9200/.watcher-*"
Invoke-WebRequest -Method DELETE -Uri "http://localhost:9200/.watches"
Invoke-WebRequest -Method DELETE -Uri "http://localhost:9200/.triggered_watches"

$indices =  "external_pois", "osm_highways2", "rebuild_log", "images", "osm_names1", "custom_user_layers", "shares"
Foreach ($index in $indices)
{
    $str = '{ "source": { "index": "' + $index + '" }, "dest": { "index": "' + $index + '-6.8" } }'
    echo "Starting reindex on $index"
    Invoke-WebRequest -Uri "http://localhost:9200/_reindex?pretty" -Method POST -Headers @{"Content-Type"="application/json"} -Body $str
    echo "Finished running reindex on $index"
}

# Reindex issue: raise the total fields limit on osm_names1-6.8

Invoke-WebRequest -Uri "http://localhost:9200/osm_names1-6.8/_settings" -Method "PUT" -Body '{ "index.mapping.total_fields.limit": 10000 }' -ContentType "application/json"

# After the reindex has finished successfully:

$indices =  "external_pois", "osm_highways2", "rebuild_log", "images", "osm_names1", "custom_user_layers", "shares"
Foreach ($index in $indices)
{
    echo "Deleting $index"
    Invoke-WebRequest -Uri "http://localhost:9200/$Index" -Method DELETE
}

# Upgrade to 7.9
$indices =  "external_pois", "osm_highways2", "rebuild_log", "images", "osm_names1", "custom_user_layers", "shares"
Foreach ($index in $indices)
{
    $str = '{ "source": { "index": "' + $index + '-6.8" }, "dest": { "index": "' + $index + '" } }'
    echo "Starting reindex on $index"
    Invoke-WebRequest -Uri "http://localhost:9200/_reindex?pretty" -Method POST -Headers @{"Content-Type"="application/json"} -Body $str
    echo "Finished running reindex on $index"
}
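
Before running the deletion loop in the script above, it may be worth confirming that each `-6.8` index holds as many documents as its source. The following is a minimal sketch and not part of the original script: `Test-CountsMatch` and `Assert-ReindexComplete` are hypothetical helper names, and it assumes the same `http://localhost:9200` endpoint together with the standard `_refresh` and `_count` APIs.

```powershell
# Hypothetical helper: compares two parsed _count responses.
function Test-CountsMatch {
    param(
        [Parameter(Mandatory)] $SourceCount,
        [Parameter(Mandatory)] $DestCount
    )
    return [long]$SourceCount.count -eq [long]$DestCount.count
}

# Hypothetical wrapper: throws if the reindexed copy is missing documents.
function Assert-ReindexComplete {
    param(
        [Parameter(Mandatory)] [string] $Index,
        [string] $BaseUrl = "http://localhost:9200"
    )
    # Refresh first so recently indexed documents are visible to _count.
    Invoke-RestMethod -Method POST -Uri "$BaseUrl/${Index}-6.8/_refresh" | Out-Null
    $source = Invoke-RestMethod -Uri "$BaseUrl/$Index/_count"
    $dest = Invoke-RestMethod -Uri "$BaseUrl/${Index}-6.8/_count"
    if (-not (Test-CountsMatch $source $dest)) {
        throw "Count mismatch for ${Index}: $($source.count) vs $($dest.count)"
    }
}
```

Calling `Assert-ReindexComplete -Index $index` at the top of the deletion loop would stop the script before an index with an incomplete copy is removed.
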
HarelM commented 4 years ago

There's still a need to replicate the data. This is an interesting solution to this issue: https://discuss.elastic.co/t/es-2-nodes-in-docker-setup-connection/99189 I'll try it out locally and see if it does the job...

HarelM commented 4 years ago

Steps that I have made so far:

  1. Loaded a 6.8 node and attached it to the 5.6 node.
  2. Let them sync and reach green state.
  3. Reindexed all indices (which failed multiple times... :-( ). It might have been better to detach the new node after the sync, maybe...
  4. Disabled the rebuild of search and POI data for now.
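
Waiting for the green state in step 2 can be scripted by polling the cluster health API. This is only a sketch: `Test-ClusterGreen` and `Wait-ClusterGreen` are hypothetical names, and the endpoint, attempt count, and delay are assumptions rather than values used in the actual upgrade.

```powershell
# Hypothetical helper: inspects a parsed _cluster/health response.
function Test-ClusterGreen {
    param([Parameter(Mandatory)] $Health)
    return ($Health.status -eq "green") -and (-not $Health.timed_out)
}

# Hypothetical helper: polls until the cluster is green or attempts run out.
function Wait-ClusterGreen {
    param(
        [string] $BaseUrl = "http://localhost:9200",
        [int] $MaxAttempts = 30,
        [int] $DelaySeconds = 10
    )
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        $health = Invoke-RestMethod -Uri "$BaseUrl/_cluster/health"
        if (Test-ClusterGreen $health) { return $true }
        # Write-Host keeps progress messages out of the function's return value.
        Write-Host "Cluster status is $($health.status), attempt $attempt of $MaxAttempts"
        Start-Sleep -Seconds $DelaySeconds
    }
    return $false
}
```
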

Steps left:

  1. Detach the 6.8 node
  2. Load 7.9 node with the 6.8 node data
  3. Reindex all indices again
  4. Delete unwanted indices
  5. Copy shares that were created after this process started...
zstadler commented 4 years ago

Remaining step no. 5 should also include updates made to pre-existing shares.

HarelM commented 4 years ago

Current status - there are two nodes and I can't go back to the previous state of only one node. This is just a nightmare... Wrote a post here: https://discuss.elastic.co/t/rolling-upgrade-cant-revert/249306 I hope someone can help me...

HarelM commented 4 years ago

It seems there's no easy way. I'll use elasticdump to export the data from the database, shut down the site for a few hours, and restore the data into a newer version...
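
A rough sketch of what the export step could look like, assuming elasticdump is installed and available on the PATH. `Get-ElasticdumpArgs` and `Export-Index` are hypothetical names, the output directory is an assumption, and only elasticdump's `--input`, `--output`, and `--type` options are used. The mapping is exported before the data so that both can be restored in the same order.

```powershell
# Hypothetical helper: builds the elasticdump arguments for one index and one dump type.
function Get-ElasticdumpArgs {
    param(
        [Parameter(Mandatory)] [string] $Index,
        [Parameter(Mandatory)] [ValidateSet("mapping", "data")] [string] $Type,
        [string] $BaseUrl = "http://localhost:9200",
        [string] $OutDir = "."
    )
    return @(
        "--input=$BaseUrl/$Index",
        "--output=$OutDir/${Index}.${Type}.json",
        "--type=$Type"
    )
}

# Hypothetical wrapper: dumps the mapping and then the data of a single index.
function Export-Index {
    param([Parameter(Mandatory)] [string] $Index)
    foreach ($type in "mapping", "data") {
        $dumpArgs = Get-ElasticdumpArgs -Index $Index -Type $type
        & elasticdump @dumpArgs
        # elasticdump is a native command, so check its exit code explicitly.
        if ($LASTEXITCODE -ne 0) { throw "elasticdump failed for $Index ($type)" }
    }
}
```

Restoring into the new cluster would be the same call with the input and output swapped, i.e. the JSON file as `--input` and the index URL as `--output`.
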

zstadler commented 4 years ago

@HarelM

> Current status - there are two nodes and I can't go back to the previous state of only one node. This is just a nightmare... Wrote a post here: https://discuss.elastic.co/t/rolling-upgrade-cant-revert/249306 I hope someone can help me...

Quoting "Add and remove nodes in your cluster" from the Elasticsearch documentation:

> If there are only two master-eligible nodes remaining then neither node can be safely removed since both are required to reliably make progress. To remove one of these nodes you must first inform Elasticsearch that it should not be part of the voting configuration, and that the voting power should instead be given to the other node. You can then take the excluded node offline without preventing the other node from making progress.
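
For reference, the exclusion described in the quote is done through the voting configuration exclusions API. The sketch below assumes Elasticsearch 7.8 or later, where the node is passed in the `node_names` query parameter. Voting configurations were introduced in 7.0, so this does not apply to the 5.6 and 6.8 nodes discussed earlier in this thread. `Get-VotingExclusionUri` and `Remove-NodeFromVoting` are hypothetical helpers, and the endpoint is an assumption.

```powershell
# Hypothetical helper: builds the exclusion request URI for one node name.
function Get-VotingExclusionUri {
    param(
        [Parameter(Mandatory)] [string] $NodeName,
        [string] $BaseUrl = "http://localhost:9200"
    )
    $escaped = [uri]::EscapeDataString($NodeName)
    return "$BaseUrl/_cluster/voting_config_exclusions?node_names=$escaped"
}

# Hypothetical wrapper: asks the cluster to stop counting the node's vote.
function Remove-NodeFromVoting {
    param([Parameter(Mandatory)] [string] $NodeName)
    Invoke-RestMethod -Method POST -Uri (Get-VotingExclusionUri -NodeName $NodeName) | Out-Null
}

# After the excluded node has been shut down, clear the exclusion list:
# Invoke-RestMethod -Method DELETE -Uri "http://localhost:9200/_cluster/voting_config_exclusions"
```
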

HarelM commented 4 years ago

Done and published.

HarelM commented 4 years ago

#1247 came back, as I had been optimistic that it was solved in the new version...

HarelM commented 4 years ago

The issue has been fixed with a workaround. If https://github.com/NetTopologySuite/NetTopologySuite.IO.GeoJSON/pull/59 gets merged and a new package is released, I might be able to remove the workaround.

russcam commented 4 years ago

@HarelM Not a comment on the specific actions taken in the PowerShell script, but one thing worth highlighting is that none of the requests check the response to ascertain that it "completed successfully" i.e. checking not only the HTTP response status code, but also the response body to assert that the operation is acknowledged or that there were no failures. At the moment, it looks like it is possible for a request to not "complete successfully", but for the script to continue anyway.
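
A minimal sketch of the kind of check described above, applied to the reindex calls. It is not the script that was actually run: `Test-ReindexSucceeded` and `Invoke-CheckedReindex` are hypothetical names, and the sketch assumes the `_reindex` response body carries the `timed_out` flag and `failures` list. `Invoke-RestMethod` raises an error on a non-success status code, and setting `$ErrorActionPreference` to `Stop` makes that error halt the script instead of letting it continue.

```powershell
$ErrorActionPreference = "Stop"  # make HTTP errors halt the script

# Hypothetical helper: decides whether a parsed _reindex response indicates success.
function Test-ReindexSucceeded {
    param([Parameter(Mandatory)] $Response)
    # Treat a missing "failures" property the same as an empty list.
    $failures = @($Response.failures | Where-Object { $null -ne $_ })
    return (-not $Response.timed_out) -and ($failures.Count -eq 0)
}

# Hypothetical wrapper: runs one reindex and throws unless it completed cleanly.
function Invoke-CheckedReindex {
    param(
        [Parameter(Mandatory)] [string] $Source,
        [Parameter(Mandatory)] [string] $Dest,
        [string] $BaseUrl = "http://localhost:9200"
    )
    $body = @{ source = @{ index = $Source }; dest = @{ index = $Dest } } | ConvertTo-Json
    $response = Invoke-RestMethod -Uri "$BaseUrl/_reindex" -Method POST -ContentType "application/json" -Body $body
    if (-not (Test-ReindexSucceeded $response)) {
        throw "Reindex from $Source to $Dest reported failures or timed out"
    }
    return $response
}
```

In the loops of the script, `Invoke-CheckedReindex -Source $index -Dest "$index-6.8"` would take the place of the bare `Invoke-WebRequest` call.
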

HarelM commented 4 years ago

You are right, of course, but I ended up not fully using the scripts, as the upgrade process flipped on me (see my two comments before the first time I closed this issue). I started using the scripts and watched the console output to see if and what went wrong, since this was a one-time thing and I monitored it carefully...