enhance-manageiq / project-setup

Track setup issues for the project environment

Generic Service Error #5

Open son-vyas opened 6 years ago

son-vyas commented 6 years ago

The error "Generic Service Error: Server [EVM] Service [OpenStack | Create External Network] Provision Step [check_completed] Status [Error Processing check_completed]" is thrown when the service is ordered. The Approval State shows "Approved" but the Status shows "failed".

The automation.log shows the following:

[----] I, [2018-02-02T01:25:38.429372 #1704:749140]  INFO -- : Instance [/ManageIQ/System/Event/EmsEvent/EMBEDDEDANSIBLE/job_create] not found in MiqAeDatastore - trying [.missing]
[----] E, [2018-02-02T01:26:25.019150 #1704:7320e4] ERROR -- : <AEMethod check_completed> Error in check completed: Job launching failed
[----] I, [2018-02-02T01:26:25.072668 #1704:7320e4]  INFO -- : <AEMethod check_completed> Ending check_completed
[----] I, [2018-02-02T01:26:25.087819 #1704:749140]  INFO -- : Q-task_id([service_template_provision_task_6]) <AEMethod [/ManageIQ/Service/Generic/StateMachines/GenericLifecycle/check_completed]> Ending
[----] I, [2018-02-02T01:26:25.087919 #1704:749140]  INFO -- : Q-task_id([service_template_provision_task_6]) Method exited with rc=MIQ_OK
[----] I, [2018-02-02T01:26:25.088159 #1704:749140]  INFO -- : Q-task_id([service_template_provision_task_6]) Processed  State=[check_completed] with Result=[error]
[----] W, [2018-02-02T01:26:25.088235 #1704:749140]  WARN -- : Q-task_id([service_template_provision_task_6]) Error in State=[check_completed]
[----] I, [2018-02-02T01:26:25.088333 #1704:749140]  INFO -- : Q-task_id([service_template_provision_task_6]) In State=[check_completed], invoking [on_error] method=[update_status(status => 'Error Processing check_completed')]
[----] I, [2018-02-02T01:26:25.140437 #1704:749140]  INFO -- : Q-task_id([service_template_provision_task_6]) Updated namespace [Service/Generic/StateMachines/GenericLifecycle/update_status  ManageIQ/Service/Generic/StateMachines]
[----] I, [2018-02-02T01:26:25.147570 #1704:749140]  INFO -- : Q-task_id([service_template_provision_task_6]) Invoking [inline] method [/ManageIQ/Service/Generic/StateMachines/GenericLifecycle/update_status] with inputs [{"status"=>"Error Processing check_completed"}]
[----] I, [2018-02-02T01:26:25.148587 #1704:749140]  INFO -- : Q-task_id([service_template_provision_task_6]) <AEMethod [/ManageIQ/Service/Generic/StateMachines/GenericLifecycle/update_status]> Starting
[----] I, [2018-02-02T01:26:25.445711 #1704:73b9dc]  INFO -- : <AEMethod update_status> Starting update_status
[----] I, [2018-02-02T01:26:25.451291 #1704:73b9dc]  INFO -- : <AEMethod update_status> Status message: Server [EVM] Service [OpenStack | Create External Network] Provision Step [check_completed] Status [Error Processing check_completed]
[----] E, [2018-02-02T01:26:25.610041 #1704:73b9dc] ERROR -- : <AEMethod update_status> Generic Service Error: Server [EVM] Service [OpenStack | Create External Network] Provision Step [check_completed] Status [Error Processing check_completed]

15663817

Expected output: the task in the playbook should run successfully.

akshay196 commented 6 years ago

Error in evm.log:

[----] E, [2018-02-05T14:45:55.390745 #1622:56b134] ERROR -- : Q-task_id([service_template_provision_task_7]) MIQ(MiqAeEngine.deliver) Error delivering {"dialog_param_network_name"=>"MyExtNetwork", "dialog_param_os_host"=>"172.22.26.201", "dialog_param_os_user"=>"admin", "password::dialog_param_os_password"=>"v2:{k5ct6MYnlnbGM2hz6fqCLlljLVOqfJfIavTULMybGEw=}", "dialog_param_os_ssl"=>"false", "request"=>"clone_to_service", :service_action=>"Provision", "Service::Service"=>7} for object [ServiceTemplateProvisionTask.7] with state [check_completed] to Automate:

psachin commented 6 years ago

Can you check whether the network was created? I think it should have been. It may just be that the Ansible process was not able to deliver the job status back to evm.

akshay196 commented 6 years ago

@psachin No, the network was not created on the OpenStack host.

psachin commented 6 years ago

Oh, I missed this line in the error log. The job launch failed:

[----] E, [2018-02-02T01:26:25.019150 #1704:7320e4] ERROR -- : <AEMethod check_completed> Error in check completed: Job launching failed

This means that evm never handed the job over to Ansible. There should be something in evm.log, automation.log, or under the paths /var/log/tower/, /var/log/supervisor/, and /var/lib/awx/job_status/.
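
For anyone scanning evm.log or automation.log for the relevant entries, here is a minimal Python sketch that keeps only the ERROR lines. The regular expression is inferred from the log lines quoted in this thread, not from any ManageIQ specification, and the names LOG_LINE, LogEntry and error_entries are made up for illustration.

```python
import re
from typing import Iterable, Iterator, NamedTuple

# Layout inferred from the lines quoted in this thread, for example:
# [----] E, [2018-02-02T01:26:25.019150 #1704:7320e4] ERROR -- : <message>
LOG_LINE = re.compile(
    r"^\[----\] (?P<code>[A-Z]), "
    r"\[(?P<timestamp>\S+) #(?P<pid>\d+):(?P<tid>[0-9a-f]+)\]\s+"
    r"(?P<level>[A-Z]+) -- : (?P<message>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    message: str


def error_entries(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield ERROR entries; lines that do not match the layout are skipped."""
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match["level"] == "ERROR":
            yield LogEntry(match["timestamp"], match["level"], match["message"])
```

Passing an open log file to error_entries() and printing each entry should surface lines such as the "Job launching failed" one above. Wrapped or truncated lines will not match the pattern and are silently dropped.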

akshay196 commented 6 years ago

Right, the job is not handed over to Ansible. That's why I don't see anything in Services > My Services > Selected Services > Provisioning > Standard Output. I found that none of the directories /var/lib/awx/, /var/log/tower/, or /var/log/supervisor/ exist on my system, yet Ansible automation is up and running.

akshay196 commented 6 years ago

I checked the awx logs using docker logs -f awx_task and docker logs -f awx_web, where I get a Connection reset by peer error. It is discussed in an awx GitHub issue.

Some issues regarding the /var/lib/awx/projects folder: https://github.com/ansible/awx/issues/857 & https://github.com/ansible/awx/pull/1080

psachin commented 6 years ago

I encountered a similar issue in my environment. I noticed that the Ansible role fails to enable. I raised my concern here: https://github.com/ManageIQ/manageiq/issues/15670

psachin commented 6 years ago

@son-vyas @akshay196

I can see that those paths (/var/log/tower/, /var/log/supervisor/, /var/lib/awx/job_status/) are present in the containers awx_task and awx_web.

psachin commented 6 years ago

@son-vyas @akshay196

Also make sure you use the create_instance branch of https://github.com/psachin/openstack-ansible-inside and NOT master.

akshay196 commented 6 years ago

docker logs -f awx_task throws this error many times:

[2018-02-18 10:30:28,541: ERROR/MainProcess] Control command error: error(104, 'Connection reset by peer')
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/celery/worker/pidbox.py", line 42, in on_message
    self.node.handle_message(body, message)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/pidbox.py", line 129, in handle_message
    return self.dispatch(**body)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/pidbox.py", line 112, in dispatch
    ticket=ticket)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/pidbox.py", line 135, in reply
    serializer=self.mailbox.serializer)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/pidbox.py", line 265, in _publish_reply
    **opts
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/messaging.py", line 181, in publish
    exchange_name, declare,
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/kombu/messaging.py", line 203, in _publish
    mandatory=mandatory, immediate=immediate,
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/amqp/channel.py", line 1734, in _basic_publish
    (0, exchange, routing_key, mandatory, immediate), msg
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/amqp/abstract_channel.py", line 50, in send_method
    conn.frame_writer(1, self.channel_id, sig, args, content)
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/amqp/method_framing.py", line 166, in write_frame
    write(view[:offset])
  File "/var/lib/awx/venv/awx/lib/python2.7/site-packages/amqp/transport.py", line 258, in write
    self._write(s)
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 104] Connection reset by peer
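
To put a number on "many times", a small Python sketch along these lines could count the occurrences. It assumes the docker CLI is on the PATH and that each failure starts with the same "Control command error" line shown above. The function names are illustrative only.

```python
import subprocess

# Header line text copied from the log output quoted above.
MARKER = "Control command error: error(104, 'Connection reset by peer')"


def count_resets(log_text: str, marker: str = MARKER) -> int:
    """Count log lines that contain the marker (one per reported failure)."""
    return sum(1 for line in log_text.splitlines() if marker in line)


def container_logs(container: str) -> str:
    """Return the combined stdout and stderr of `docker logs <container>`."""
    result = subprocess.run(
        ["docker", "logs", container],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # container output may arrive on either stream
        text=True,
        check=True,
    )
    return result.stdout
```

Something like count_resets(container_logs("awx_task")) would then give the count for the task container. Only the header line is matched, so the closing "error: [Errno 104]" line of each traceback is not counted twice.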

There are a lot of similar issues filed on ansible/awx: https://github.com/ansible/awx/issues?utf8=%E2%9C%93&q=is%3Aissue+connection+reset+by+peer The developers are working on this in the awx:devel branch.

I guess the job launch fails because awx has an issue in its API component. I found the directories /var/lib/awx/job_status/ and /var/log/tower/ inside the container, but they are empty. Since the job is not delivered to awx, it is never scheduled.

@psachin Yes, we added your repository using the create_instance branch. Just to check, we also tried another Ansible playbook repo: https://github.com/jonnyfiveiq/Ansible_Playbooks . It has playbooks to create and delete a user in MIQ, but it also shows the Generic Service Error.

Are you able to launch an Ansible playbook service in your environment (Gaprindishvili) without errors?

psachin commented 6 years ago

https://github.com/ManageIQ/manageiq/issues/17014

akshay196 commented 6 years ago

The rabbitmq Docker image is present.

[root@localhost vmdb]# docker pull rabbitmq:3
Trying to pull repository docker.io/library/rabbitmq ... 
3: Pulling from docker.io/library/rabbitmq

Digest: sha256:eebd656315be098c7836719745ccfec685b18bc518754483aee8cfb26104cbb7

Running containers

[root@localhost vmdb]# docker ps
CONTAINER ID        IMAGE                     COMMAND                  CREATED              STATUS              PORTS                                NAMES
a484b084144a        ansible/awx_task:latest   "/tini -- /bin/sh -c "   38 seconds ago       Up 34 seconds       8052/tcp                             awx_task
dde5dcd41bf8        ansible/awx_web:latest    "/tini -- /bin/sh -c "   43 seconds ago       Up 41 seconds       0.0.0.0:54321->8052/tcp              awx_web
c8959ca7cb53        memcached:alpine          "docker-entrypoint.sh"   About a minute ago   Up About a minute   11211/tcp                            memcached
0d83dcda9680        rabbitmq:3                "docker-entrypoint.sh"   About a minute ago   Up About a minute   4369/tcp, 5671-5672/tcp, 25672/tcp   rabbitmq
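
A quick way to repeat this check after each restart is sketched below in Python. It assumes the docker CLI is on the PATH and treats the four container names from the listing above as the expected set, which is an assumption drawn from this output rather than from ManageIQ documentation. The function names are illustrative.

```python
import subprocess

# Names taken from the `docker ps` listing above (assumed to be the full set).
EXPECTED = frozenset({"awx_task", "awx_web", "memcached", "rabbitmq"})


def missing_containers(names_output: str, expected=EXPECTED) -> set:
    """Return the expected container names absent from the given name list."""
    running = {line.strip() for line in names_output.splitlines() if line.strip()}
    return set(expected) - running


def running_container_names() -> str:
    """Return one running container name per line, as printed by docker ps."""
    result = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}"],
        stdout=subprocess.PIPE,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling missing_containers(running_container_names()) returns an empty set when all four containers are up.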

psachin commented 6 years ago

I'm aware of that. I was just adding it to the thread as an FYI. It was not intended for anyone specific.

akshay196 commented 6 years ago

I removed all Docker images and pulled them again to get the latest code, then restarted the evm server. I am still facing the same problem.

akshay196 commented 6 years ago

Today we tried to look into the job details using tower-cli, but awx in ManageIQ does not have the default credentials (for awx, the default credentials are admin/password).