chef / chef-server

Chef Infra Server is a hub for configuration data; storing cookbooks, node policies and metadata of managed nodes.
https://www.chef.io/chef/
Apache License 2.0

Chef server backup on chef-backend setup should skip ES back up #2056

Closed PrajaktaPurohit closed 3 years ago

PrajaktaPurohit commented 4 years ago

On an HA backend setup, chef-backend-ctl backup stops the Elasticsearch service and collects an ES backup. However, this backup is not useful, because we need to reindex to build new ES data anyway. The command should therefore skip the ES backup.

The backup should skip ES.

Currently it collects an ES backup as well, even though it is not useful.

PrajaktaPurohit commented 4 years ago

https://github.com/chef/customer-bugs/issues/85

kapilchouhan99 commented 4 years ago

Hi @PrajaktaPurohit,

I took a backup using chef-server-ctl backup. It contains these files:

root@fe1:/var/opt/chef-backup/var/opt/opscode# ls
bookshelf  opscode-solr4  postgresql  rabbitmq  redis_lb  upgrades

So there is no visible way to locate the Elasticsearch backup file.

Also, when I run chef-server-ctl reconfigure on FE1 or FE2, it shows that elasticsearch is disabled:

Recipe: private-chef::default
  * component_runit_service[elasticsearch] action disable
  Recipe: <Dynamically Defined Resource>
    * service[elasticsearch] action nothing (skipped due to action :nothing)
    * runit_service[elasticsearch] action disable
      * ruby_block[disable elasticsearch] action run (skipped due to only_if)
       (up to date)
     (up to date)

However, Elasticsearch appears to be up and running according to chef-backend-ctl status:

Service        Local Status        Time in State  Distributed Node Status                     
leaderl        running (pid 9326)  7d 1h 41m 29s  leader: 1; waiting: 0; follower: 2; total: 3
epmd           running (pid 9151)  7d 1h 41m 49s  status: local-only                          
etcd           running (pid 9014)  7d 1h 42m 1s   health: green; healthy nodes: 3/3           
postgresql     running (pid 9392)  7d 1h 41m 29s  leader: 1; offline: 0; syncing: 0; synced: 2
elasticsearch  running (pid 9217)  7d 1h 41m 45s  state: green; nodes online: 3/3             

System  Local Status                                          Distributed Node Status          
disks   /var/log/chef-backend: OK; /var/opt/chef-backend: OK  health: green; healthy nodes: 3/3

I also took a backup using chef-backend-ctl backup on BE2. It contains these files:

root@backend-2:~/var/opt/chef-backend# ls
elasticsearch  postgresql

Could you please help me locate the Elasticsearch index data files, or point me to a reference or steps to reproduce the issue? Thanks!

PrajaktaPurohit commented 4 years ago

@kapilchouhan99 in a chef-backend setup, Elasticsearch is disabled on the FEs, so the first part looks correct.

Backups are run on a follower node of a backend setup. The following steps might come in handy to reproduce the issue:

1. Identify the backend-follower:
root@ip-10-0-24-129:~# chef-backend-ctl cluster-status 
Name            IP           GUID                              Role      PG        ES          Blocked      Eligible
ip-10-0-24-129  10.0.24.129  8401556c984fe084c9e4a00145f8dd20  follower  follower  not_master  not_blocked  true
ip-10-0-24-213  10.0.24.213  a86a93457ade7a13969f3d30dfcf05e7  follower  follower  master      not_blocked  true
ip-10-0-31-115  10.0.31.115  b709eb67c50a05800257bb4ee8f9fc84  leader    leader    not_master  not_blocked  true
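
If you want to pick the follower programmatically rather than by eye, a minimal sketch like the one below may help. It assumes the cluster-status output keeps the whitespace-separated column layout shown above; the function names and the SAMPLE constant are hypothetical and are not part of chef-backend.

```python
# Sketch: find follower nodes in captured `chef-backend-ctl cluster-status` output.
# Assumption: columns are whitespace-separated and no cell contains spaces,
# as in the output pasted in this issue.

SAMPLE = """\
Name            IP           GUID                              Role      PG        ES          Blocked      Eligible
ip-10-0-24-129  10.0.24.129  8401556c984fe084c9e4a00145f8dd20  follower  follower  not_master  not_blocked  true
ip-10-0-24-213  10.0.24.213  a86a93457ade7a13969f3d30dfcf05e7  follower  follower  master      not_blocked  true
ip-10-0-31-115  10.0.31.115  b709eb67c50a05800257bb4ee8f9fc84  leader    leader    not_master  not_blocked  true
"""


def parse_cluster_status(text):
    """Return one dict per node, keyed by the column names in the header row."""
    lines = [line for line in text.splitlines() if line.strip()]
    # Skip anything before the header (for example a pasted shell prompt).
    for index, line in enumerate(lines):
        if line.split()[0] == "Name":
            header = line.split()
            rows = lines[index + 1:]
            break
    else:
        raise ValueError("no header row starting with 'Name' found")
    return [dict(zip(header, row.split())) for row in rows]


def follower_names(text):
    """Return the names of all nodes whose Role column is 'follower'."""
    return [node["Name"] for node in parse_cluster_status(text)
            if node.get("Role") == "follower"]
```

Calling follower_names(SAMPLE) on the output above yields the two follower hosts; either one can be used for the backup in the next step.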

2. Run the backup command on the BE follower:
root@ip-10-0-24-129:~# chef-backend-ctl backup
 ✓ Taking inventory of running services
 ✓ Exporting PostgreSQL Data
 ✓ Shutting down elasticsearch
 ✓ Exporting Elasticsearch Data
 ✓ Starting up elasticsearch
 ✓ Exporting tarball
 ✓ Cleaning up
Backup at /var/opt/chef-backup/chef-backup-2020-08-12-11-54-50.tgz
Backup Complete!
root@ip-10-0-24-129:~# ls /var/opt/chef-backup/chef-backup-2020-08-12-11-54-50.tgz
/var/opt/chef-backup/chef-backup-2020-08-12-11-54-50.tgz

3. The backup also includes Elasticsearch data:
root@ip-10-0-24-129:~# tar -zxvf /var/opt/chef-backup/chef-backup-2020-08-12-11-54-50.tgz -C /var/opt/chef-backup/
var/opt/chef-backend/elasticsearch/data/
var/opt/chef-backend/elasticsearch/data/nodes/
var/opt/chef-backend/elasticsearch/data/nodes/0/
var/opt/chef-backend/elasticsearch/data/nodes/0/indices/
var/opt/chef-backend/elasticsearch/data/nodes/0/indices/Psogb4DCTM-DaMZhw9C9Qg/
:
:
:
root@ip-10-0-24-129:~# ls /var/opt/chef-backup/
chef_backup-2020-08-12-11-54-50.sql  chef-backup-2020-08-12-11-54-50.tgz  etc  manifest.json  var
root@ip-10-0-24-129:~# ls /var/opt/chef-backup/var/opt/chef-backend/
elasticsearch/ postgresql/
root@ip-10-0-24-129:~# ls /var/opt/chef-backup/var/opt/chef-backend/elasticsearch/data/nodes/0/
indices  node.lock  _state
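
The same check can be done without extracting the tarball. The sketch below lists archive members under the Elasticsearch data path seen in the listing above; the function names and the default prefix are assumptions made for illustration, and nothing here comes from chef-backend itself.

```python
# Sketch: check whether a chef-backend backup tarball contains Elasticsearch data,
# without unpacking it. Uses only the Python standard library.
import tarfile

# Assumption: this is the path prefix seen in the extracted backup above.
ES_PREFIX = "var/opt/chef-backend/elasticsearch"


def elasticsearch_members(tar_path, prefix=ES_PREFIX):
    """Return the names of archive members that live under the given prefix."""
    found = []
    with tarfile.open(tar_path, "r:gz") as archive:
        for member in archive:
            name = member.name
            # Some tar writers store names with a leading "./".
            if name.startswith("./"):
                name = name[2:]
            if name == prefix or name.startswith(prefix + "/"):
                found.append(name)
    return found


def backup_includes_elasticsearch(tar_path, prefix=ES_PREFIX):
    """True if at least one member of the tarball is Elasticsearch data."""
    return bool(elasticsearch_members(tar_path, prefix))
```

For the backup taken above, backup_includes_elasticsearch("/var/opt/chef-backup/chef-backup-2020-08-12-11-54-50.tgz") would be expected to return True, which is the behavior this issue asks to change.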

This might be useful in tracing where that code lives:

https://github.com/chef/chef-backend/blob/0d54120a9eeaff7343f58428a7fcea2bc957de77/libcb/lib/libcb/command/backup.rb#L67-L99