opensearch-project / ansible-playbook

🤖 A community repository for Ansible Playbook of OpenSearch Project.
https://opensearch.org/
Apache License 2.0

Ansible playbook Initiative #2

Closed saravanan30erd closed 3 years ago

saravanan30erd commented 3 years ago

Refer: https://github.com/opensearch-project/opensearch-devops/pull/60

Description

Ansible playbook Initiative

Issues Resolved

Single node installation

Check List

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.

peterzhuamazon commented 3 years ago

Hi @dandydeveloper @TheAlgo, we have created the Ansible playbook repo in the OpenSearch Project and asked @saravanan30erd to migrate his PR here.

Please continue the review; if there are no issues, we can get this commit into main.

The old PR has been closed in the devops repo. Reference: https://github.com/opensearch-project/opensearch-devops/pull/60

@hyandell, you already approved the PR there; we would love for you to approve it again here.

Thanks.

peterzhuamazon commented 3 years ago

I will test one more time today before merging it. Thanks.

peterzhuamazon commented 3 years ago

I had some issues the other day testing it locally on CentOS 7. I will try remotely again tomorrow once I have time. Sorry for the delay; we are currently quite busy :smiley:

peterzhuamazon commented 3 years ago

Update: both @TheAlgo and I are busy with other projects, and 1.1.0 is coming up, so our time is rather limited.

If any of the other contributors can give this a try, it would be greatly appreciated. @smlx @mprimeaux @dblock, let me know if you can help test this and post some results. It would be very helpful, as we don't want to merge without proper testing.

Apologies for the inconvenience, and thanks for understanding.

mprimeaux commented 3 years ago

@peterzhuamazon Yes, I am happy to help. I will review and comment.

peterzhuamazon commented 3 years ago

Hi @mprimeaux, any progress on the testing? I am still working on the 1.1.0 release, so I have very limited time to test here. I will have more time after the new version is out.

Thanks.

mprimeaux commented 3 years ago

@peterzhuamazon I will look at this today. My apologies for the delay. We've also been working on new platform releases.

saravanan30erd commented 3 years ago

@mprimeaux any progress?

mprimeaux commented 3 years ago

LGTM

peterzhuamazon commented 3 years ago

Thanks @mprimeaux. I think I need some help to get the test working. I have Ansible installed on my macOS machine and am trying to deploy to an EC2 CentOS 7 instance.

$ ansible-playbook -i inventories/opensearch/hosts opensearch.yml --extra-vars "admin_password=admin kibanaserver_password=admin" --private-key centos7.pem
[WARNING]: Invalid characters were found in group names but not replaced, use -vvvv to see details

PLAY [opensearch installation & configuration] *****************************************************************************************************************************************************************************************************************************************************************************

TASK [centos7/opensearch : hostname] ***************************************************************************************************************************************************************************************************************************************************************************************
fatal: [os1]: FAILED! => {"ansible_facts": {"discovered_interpreter_python": "/usr/bin/python"}, "changed": false, "msg": "Command failed rc=1, out=, err=Could not set property: Method call timed out\n"}

PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************************************************************************
os1                        : ok=0    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0

The SSH connection is working, but I have no idea why it keeps reporting this error. I could not find any resources online about it either. The centos user I specified has full passwordless sudo permission.

If anyone can help me debug this, that would be great. I have yet to successfully deploy this Ansible playbook.

@saravanan30erd Any ideas?

Thanks.

saravanan30erd commented 3 years ago

@peterzhuamazon I think this looks like an issue with the user; I tested this playbook as the root user only. Could you try with the root user and check?

peterzhuamazon commented 3 years ago

I need to tweak the EC2 instance then. I will try tomorrow, as today we are busy with the release process. Thanks.

peterzhuamazon commented 3 years ago

Hi @saravanan30erd, it seems there is a failure during deployment:

[2021-10-07T20:34:26,014][ERROR][o.o.b.Bootstrap          ] [os1] Exception
org.opensearch.transport.BindTransportException: Failed to bind to 34.209.175.146:[9300-9400]
        at org.opensearch.transport.TcpTransport.bindToPort(TcpTransport.java:429) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.transport.TcpTransport.bindServer(TcpTransport.java:393) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.transport.netty4.Netty4Transport.doStart(Netty4Transport.java:144) ~[?:?]
        at org.opensearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:72) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.transport.TransportService.doStart(TransportService.java:251) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:72) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.node.Node.start(Node.java:823) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.bootstrap.Bootstrap.start(Bootstrap.java:330) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.bootstrap.Bootstrap.init(Bootstrap.java:415) [opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.bootstrap.OpenSearch.init(OpenSearch.java:182) [opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.bootstrap.OpenSearch.execute(OpenSearch.java:173) [opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:99) [opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.cli.Command.mainWithoutErrorHandling(Command.java:140) [opensearch-cli-1.0.0.jar:1.0.0]
        at org.opensearch.cli.Command.main(Command.java:103) [opensearch-cli-1.0.0.jar:1.0.0]
        at org.opensearch.bootstrap.OpenSearch.main(OpenSearch.java:139) [opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.bootstrap.OpenSearch.main(OpenSearch.java:105) [opensearch-1.0.0.jar:1.0.0]
Caused by: java.net.BindException: Cannot assign requested address
        at sun.nio.ch.Net.bind0(Native Method) ~[?:?]
        at sun.nio.ch.Net.bind(Net.java:550) ~[?:?]
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:249) ~[?:?]
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:134) ~[?:?]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:550) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1334) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:506) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:491) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:973) ~[?:?]
        at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:248) ~[?:?]
        at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:356) ~[?:?]
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) ~[?:?]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500) ~[?:?]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[?:?]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
        at java.lang.Thread.run(Thread.java:832) ~[?:?]
[2021-10-07T20:34:26,021][ERROR][o.o.b.OpenSearchUncaughtExceptionHandler] [os1] uncaught exception in thread [main]
org.opensearch.bootstrap.StartupException: BindTransportException[Failed to bind to 34.209.175.146:[9300-9400]]; nested: BindException[Cannot assign requested address];
        at org.opensearch.bootstrap.OpenSearch.init(OpenSearch.java:186) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.bootstrap.OpenSearch.execute(OpenSearch.java:173) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:99) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.cli.Command.mainWithoutErrorHandling(Command.java:140) ~[opensearch-cli-1.0.0.jar:1.0.0]
        at org.opensearch.cli.Command.main(Command.java:103) ~[opensearch-cli-1.0.0.jar:1.0.0]
        at org.opensearch.bootstrap.OpenSearch.main(OpenSearch.java:139) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.bootstrap.OpenSearch.main(OpenSearch.java:105) ~[opensearch-1.0.0.jar:1.0.0]
Caused by: org.opensearch.transport.BindTransportException: Failed to bind to 34.209.175.146:[9300-9400]
        at org.opensearch.transport.TcpTransport.bindToPort(TcpTransport.java:429) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.transport.TcpTransport.bindServer(TcpTransport.java:393) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.transport.netty4.Netty4Transport.doStart(Netty4Transport.java:144) ~[?:?]
        at org.opensearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:72) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.transport.TransportService.doStart(TransportService.java:251) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.common.component.AbstractLifecycleComponent.start(AbstractLifecycleComponent.java:72) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.node.Node.start(Node.java:823) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.bootstrap.Bootstrap.start(Bootstrap.java:330) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.bootstrap.Bootstrap.init(Bootstrap.java:415) ~[opensearch-1.0.0.jar:1.0.0]
        at org.opensearch.bootstrap.OpenSearch.init(OpenSearch.java:182) ~[opensearch-1.0.0.jar:1.0.0]
        ... 6 more
Caused by: java.net.BindException: Cannot assign requested address
        at sun.nio.ch.Net.bind0(Native Method) ~[?:?]
        at sun.nio.ch.Net.bind(Net.java:550) ~[?:?]
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:249) ~[?:?]
        at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:134) ~[?:?]
        at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:550) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1334) ~[?:?]

TASK [centos7/opensearch : Security Plugin configuration | Initialize the opensearch security index in opensearch] *********************************************************************************************************************************************************************************************************
fatal: [os1]: FAILED! => {"changed": true, "cmd": "sh /usr/share/opensearch/plugins/opensearch-security/tools/securityadmin.sh -cacert /usr/share/opensearch/config/root-ca.pem -cert /usr/share/opensearch/config/admin.pem -key /usr/share/opensearch/config/admin.key -f /usr/share/opensearch/plugins/opensearch-security/securityconfig/internal_users.yml -nhnv -icl -h 34.209.175.146\n", "delta": "0:02:07.557307", "end": "2021-10-07 20:36:37.799659", "msg": "non-zero return code", "rc": 255, "start": "2021-10-07 20:34:30.242352", "stderr": "", "stderr_lines": [], "stdout": "Security Admin v7\nWill connect to 34.209.175.146:9300\nERR: Seems there is no OpenSearch running on 34.209.175.146:9300 - Will exit", "stdout_lines": ["Security Admin v7", "Will connect to 34.209.175.146:9300", "ERR: Seems there is no OpenSearch running on 34.209.175.146:9300 - Will exit"]}

NO MORE HOSTS LEFT *********************************************************************************************************************************************************************************************************************************************************************************************************

PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************************************************************************
os1                        : ok=26   changed=25   unreachable=0    failed=1    skipped=0    rescued=0    ignored=0

Could you help take a look at this?

Logs attached:

ansible-os.log

saravanan30erd commented 3 years ago

@peterzhuamazon

org.opensearch.transport.BindTransportException: Failed to bind to 34.209.175.146:[9300-9400]

Based on the above error, I think there is an issue with your inventories/opensearch/hosts file. In AWS EC2, services running inside the EC2 instance are not aware of the Elastic or public IP address, since only the private IP address is attached to the network interface. 34.209.175.146 is a public IP address, which is why OpenSearch fails to bind. You should use the private IP address for the OpenSearch configuration.

In the inventories/opensearch/hosts file:

os1 ansible_host=10.0.1.1  ansible_user=root ip=10.0.1.1

The ansible_host value is the Elastic or public IP address that Ansible uses to connect to EC2. The ip value is the private IP address in EC2, which should be used for the OpenSearch configuration.

In your case, you should change your hosts file as below:

os1  ansible_host=34.209.175.146  ansible_user=root   ip=<EC2 Private IP>
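
To illustrate the point about bindable addresses, here is a minimal sketch (not part of the playbook) of how one might check whether a given IP is actually assigned to a network interface. It assumes a Linux host where `ip -o -4 addr show` is available; the helper name `is_local_address` and the sample interface line are hypothetical. On EC2, only the private address would be expected to show up in that output.

```shell
# Hypothetical helper (not part of the playbook): succeeds if the IPv4 address
# given as $1 is listed in `ip -o -4 addr show` output read from stdin.
is_local_address() {
  awk -v ip="$1" '
    { split($4, parts, "/"); if (parts[1] == ip) found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Sample line in the format printed by `ip -o -4 addr show` (assumed values).
sample_addrs='2: eth0    inet 10.0.1.1/24 brd 10.0.1.255 scope global dynamic eth0'

for candidate in 10.0.1.1 34.209.175.146; do
  if printf '%s\n' "$sample_addrs" | is_local_address "$candidate"; then
    echo "$candidate is assigned to an interface"
  else
    echo "$candidate is not assigned to an interface"
  fi
done
```

On a real host the sample variable would be replaced by the live command, for example `ip -o -4 addr show | is_local_address "$candidate"`.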

peterzhuamazon commented 3 years ago

Got it. I will try again soon. Sorry, I am also new to Ansible, so this is a learning experience for me.

Thanks.

peterzhuamazon commented 3 years ago

The deployment succeeded, but it seems the password is not being set correctly on OpenSearch:

$ curl https://<ip>:9200 -u admin:admin --insecure
Unauthorized

saravanan30erd commented 3 years ago

@peterzhuamazon If the Ansible deployment completed without any issues, it should work.

https://github.com/opensearch-project/ansible-playbook/pull/2/files#diff-16ee7588e8089d4e5fea3e6381b2886175fceb96349d2960d0089a4865129836R36

That is because I check the status of the OpenSearch cluster as the last step to confirm everything works.

Could you provide the output of the last two tasks?

- name: Check the opensearch status
  command: curl https://{{ inventory_hostname }}:9200/_cluster/health?pretty -u 'admin:{{ admin_password }}' -k
  register: os_status

- name: Show the opensearch status
  debug:
    msg: "{{ os_status.stdout }}"
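
One caveat with a check like this: curl exits with status 0 even when the server answers with an HTTP error such as 401, unless `--fail` is passed, so the task can report success while the body is just `Unauthorized`. Below is a minimal sketch of a stricter check, assuming the response body is available as a string. The helper name `health_ok` and the variables in the usage comment are hypothetical and not part of the playbook.

```shell
# Hypothetical helper (not part of the playbook): succeeds only if the text in $1
# looks like a cluster health report whose status is green or yellow.
health_ok() {
  printf '%s\n' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"(green|yellow)"'
}

# Sample bodies: the first mirrors a rejected login, the second a healthy reply.
for body in 'Unauthorized' '{ "cluster_name" : "development-cluster", "status" : "yellow" }'; do
  if health_ok "$body"; then
    echo "healthy: $body"
  else
    echo "not healthy: $body"
  fi
done

# Against a live node one might fetch the body like this (host and password are placeholders):
# body=$(curl --silent --fail --insecure -u "admin:$ADMIN_PASSWORD" "https://$OS_HOST:9200/_cluster/health") && health_ok "$body"
```

With `--fail`, curl itself returns a non-zero exit code on HTTP errors, which would be enough to make an Ansible `command` task fail; the body check is an extra guard.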

peterzhuamazon commented 3 years ago

TASK [centos7/opensearch : Check the opensearch status] ********************************************************************************************************************************************************************************************************************************************************************
changed: [os1]

TASK [centos7/opensearch : Show the opensearch status] *********************************************************************************************************************************************************************************************************************************************************************
ok: [os1] => {
    "msg": "Unauthorized"
}

PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************************************************************************
os1                        : ok=30   changed=18   unreachable=0    failed=0    skipped=3    rescued=0    ignored=0

saravanan30erd commented 3 years ago

@peterzhuamazon It works fine for all the other passwords (because they are mostly unique), but not for admin. I do a general search and replace with sed to replace the plain-text passwords with their hashed versions, and this works for all the other passwords. For the admin password, however, it also replaced other occurrences of the word admin in the configuration file, which is why we are seeing this error.

admin:
  hash: "{{ admin_password }}"
  reserved: true
  backend_roles:
  - "admin"
  description: "admin user"

I have now fixed the search and replace by adding a condition, so that it matches and replaces only the password text and no other text in the configuration file.

If it works, you will see the output below.

TASK [centos7/opensearch : Check the opensearch status] ************************************************************
changed: [os1]

TASK [centos7/opensearch : Show the opensearch status] *************************************************************
ok: [os1] => {
    "msg": {
        "active_primary_shards": 2,
        "active_shards": 2,
        "active_shards_percent_as_number": 66.66666666666666,
        "cluster_name": "development-cluster",
        "delayed_unassigned_shards": 0,
        "discovered_master": true,
        "initializing_shards": 0,
        "number_of_data_nodes": 1,
        "number_of_in_flight_fetch": 0,
        "number_of_nodes": 1,
        "number_of_pending_tasks": 0,
        "relocating_shards": 0,
        "status": "yellow",
        "task_max_waiting_in_queue_millis": 0,
        "timed_out": false,
        "unassigned_shards": 1
    }
}

peterzhuamazon commented 3 years ago

{
  "name" : "os1",
  "cluster_name" : "development-cluster",
  "cluster_uuid" : "Gp1jaYZEQQm65LZ1U4DGbA",
  "version" : {
    "distribution" : "opensearch",
    "number" : "1.0.0",
    "build_type" : "tar",
    "build_hash" : "34550c5b17124ddc59458ef774f6b43a086522e3",
    "build_date" : "2021-07-02T23:22:21.383695Z",
    "build_snapshot" : false,
    "lucene_version" : "8.8.2",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "The OpenSearch Project: https://opensearch.org/"
}

    "name": "dashboards1",
    "status": {
        "overall": {
            "icon": "success",
            "nickname": "Looking good",
            "since": "2021-10-13T17:39:25.439Z",
            "state": "green",
            "title": "Green",
            "uiColor": "secondary"
        }

Thanks @saravanan30erd

Working pretty well. Just one more question: is there any documentation on sed '/hash: / s'? I tried to find a reference but could not.

Thanks.

saravanan30erd commented 3 years ago

@peterzhuamazon

Just one more question: is there any documentation on `sed '/hash: / s'`? I tried to find a reference but could not.

https://unix.stackexchange.com/questions/14092/search-and-replace-in-multiple-files-based-on-condition

In sed, you can put a regexp (between /…/) before the s command to only perform the replacement on lines containing that regexp.
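
A minimal sketch of the difference, using a sample fragment modelled on the internal_users.yml excerpt quoted earlier in this thread. The placeholder `HASHED` stands in for a real password hash, and the exact sed command used in the playbook is not reproduced here; this only demonstrates the address form.

```shell
# Sample fragment (assumed content): the admin entry after the template has been
# rendered with the plain-text password "admin".
sample='admin:
  hash: "admin"
  reserved: true
  backend_roles:
  - "admin"
  description: "admin user"'

# Unrestricted substitution: rewrites the first "admin" on every line that has one,
# including the user name and the backend role.
naive=$(printf '%s\n' "$sample" | sed 's/admin/HASHED/')

# Address-restricted substitution: /hash: / selects the lines to act on,
# and the s command runs only on those lines.
fixed=$(printf '%s\n' "$sample" | sed '/hash: / s/admin/HASHED/')

echo "--- unrestricted ---"
printf '%s\n' "$naive"
echo "--- restricted to hash: lines ---"
printf '%s\n' "$fixed"
```

In the restricted run only the `hash:` line changes, so the user name, backend role, and description stay intact.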

peterzhuamazon commented 3 years ago

Ah, I didn't realize it is a regex that restricts the replacement to the hash: line. Thanks. I will merge this PR very soon. Please remember that you need to add a reference in the devops repo: https://github.com/opensearch-project/opensearch-devops/issues/71

Thanks very much.

saravanan30erd commented 3 years ago

@peterzhuamazon Sure, I will add a pointer soon.