netbox-community / netbox

The premier source of truth powering network automation. Open source under Apache 2. Try NetBox Cloud free: https://netboxlabs.com/free-netbox-cloud/
http://netboxlabs.com/oss/netbox/
Apache License 2.0
16.14k stars 2.58k forks source link

RQ worker fails to create a webhook request #2619

Closed funzoneq closed 5 years ago

funzoneq commented 5 years ago

Environment

Steps to Reproduce

Create a webhook for circuits with the following settings:

Object type: Circuits -> Circuit Name: test_circuit_hook Type: Update URL: https://example.com/webhook/ HTTP Content type: JSON No secret Enabled SSL Verification enabled. -> Save

Create a test circuit:

cid: test-001 Provider: test provider Type: Internet Status: Active Tenant: Test tenant Install_date, commit_rate, description, comments empty No custom fields Save the circuit.

No go back and edit the circuit. Change the cid to test-002. This will trigger the webhook.

RQ Worker log:

./netbox/manage.py rqworker
11:45:30 Registering birth of worker Arnouds-MacBook-Pro.43412
11:45:30 RQ worker 'rq:worker:Arnouds-MacBook-Pro.43412' started, version 0.12.0
11:45:30 *** Listening on default...
11:45:30 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.
11:45:30 Cleaning registries for queue: default
11:45:30 *** Listening on default...
11:45:30 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.
11:50:56 default: extras.webhooks_worker.process_webhook(<Webhook: test_circuit_hook>, {'id': 1, 'cid': 'test-001', 'provider': OrderedDict([('id', 1), ('url', '/api/circuits/providers/1/'), ('name', 'Test provider'), ('slug', 'test-provider')]), 'type': OrderedDict([('id', 1), ('url', '/api/circuits/circuit-types/1/'), ('name', 'Internet'), ('slug', 'internet')]), 'status': {'value': 1, 'label': 'Active'}, 'tenant': OrderedDict([('id', 1), ('url', '/api/tenancy/tenants/1/'), ('name', 'Test Tenant'), ('slug', 'test-tenant')]), 'install_date': None, 'commit_rate': None, 'description': '', 'comments': '', 'tags': [], 'custom_fields': {}, 'created': '2018-11-01', 'last_updated': '2018-11-28T11:50:56.526702Z'}, <class 'circuits.models.Circuit'>, 2, '2018-11-28 11:50:56.724569') (a3afe90b-b790-41a1-a7df-24b7909d60b6)
11:50:56 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.
11:50:56 Sent heartbeat to prevent worker timeout. Next one should arrive within 35 seconds.
Traceback (most recent call last):
  File "./netbox/manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/django/core/management/__init__.py", line 371, in execute_from_command_line
    utility.execute()
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/django/core/management/__init__.py", line 365, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/django/core/management/base.py", line 288, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/django/core/management/base.py", line 335, in execute
    output = self.handle(*args, **options)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/django_rq/management/commands/rqworker.py", line 100, in handle
    w.work(burst=options.get('burst', False))
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/rq/worker.py", line 493, in work
    self.execute_job(job, queue)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/rq/worker.py", line 662, in execute_job
    self.fork_work_horse(job, queue)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/rq/worker.py", line 599, in fork_work_horse
    self.main_work_horse(job, queue)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/rq/worker.py", line 677, in main_work_horse
    success = self.perform_job(job, queue)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/rq/worker.py", line 781, in perform_job
    self.prepare_job_execution(job)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/rq/worker.py", line 706, in prepare_job_execution
    registry.add(job, timeout, pipeline=pipeline)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/rq/registry.py", line 47, in add
    return pipeline.zadd(self.key, score, job.id)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/redis/client.py", line 2263, in zadd
    for pair in iteritems(mapping):
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/redis/_compat.py", line 123, in iteritems
    return iter(x.items())
AttributeError: 'int' object has no attribute 'items'
11:50:57 Moving job to 'failed' queue (work-horse terminated unexpectedly; waitpid returned 256)
11:50:57 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.
11:50:57 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.
11:50:57 *** Listening on default...
11:50:57 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.
^C11:53:23 Got signal SIGINT
11:53:23 Warm shut down requested
11:53:23 Registering death

Expected Behavior

A POST call to be made to https://example.com/webhook/

Observed Behavior

Stacktrace in the rq worker. Job turned to failed.

Also tried with circuits -> Provider webhook, but it fails in a similar way:

11:13:25 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.
11:17:00 default: extras.webhooks_worker.process_webhook(<Webhook: test_provider_hook>, {'id': 1, 'name': 'Test provider', 'slug': 'test-provider', 'asn': 1234, 'account': '123444', 'portal_url': '', 'noc_contact': '', 'admin_contact': '', 'comments': '', 'tags': [], 'custom_fields': {}, 'created': '2018-11-01', 'last_updated': '2018-11-28T11:17:00.029476Z'}, <class 'circuits.models.Provider'>, 2, '2018-11-28 11:17:00.178530') (6c460a35-261c-4b77-9c73-113979d2c32d)
11:17:00 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.
11:17:00 Sent heartbeat to prevent worker timeout. Next one should arrive within 35 seconds.
Traceback (most recent call last):
  File "./manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/django/core/management/__init__.py", line 371, in execute_from_command_line
    utility.execute()
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/django/core/management/__init__.py", line 365, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/django/core/management/base.py", line 288, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/django/core/management/base.py", line 335, in execute
    output = self.handle(*args, **options)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/django_rq/management/commands/rqworker.py", line 100, in handle
    w.work(burst=options.get('burst', False))
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/rq/worker.py", line 493, in work
    self.execute_job(job, queue)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/rq/worker.py", line 662, in execute_job
    self.fork_work_horse(job, queue)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/rq/worker.py", line 599, in fork_work_horse
    self.main_work_horse(job, queue)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/rq/worker.py", line 677, in main_work_horse
    success = self.perform_job(job, queue)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/rq/worker.py", line 781, in perform_job
    self.prepare_job_execution(job)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/rq/worker.py", line 706, in prepare_job_execution
    registry.add(job, timeout, pipeline=pipeline)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/rq/registry.py", line 47, in add
    return pipeline.zadd(self.key, score, job.id)
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/redis/client.py", line 2263, in zadd
    for pair in iteritems(mapping):
  File "/Users/avermeer/Documents/original-netbox/virt/lib/python3.6/site-packages/redis/_compat.py", line 123, in iteritems
    return iter(x.items())
AttributeError: 'int' object has no attribute 'items'
11:17:00 Moving job to 'failed' queue (work-horse terminated unexpectedly; waitpid returned 256)
11:17:00 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.
11:17:00 Sent heartbeat to prevent worker timeout. Next one should arrive within 420 seconds.
lampwins commented 5 years ago

Thanks for proactively posting a pip freeze. It seems the culprit is an upstream issue between redis and rq https://github.com/rq/rq/issues/1014

From that issue (which is still open at this time) there is a workaround.

Until this is fixed in RQ you can just explicitly install redis in your project, before RQ:

redis==2.10.6
rq==0.12.0

Just pip install those, in that order and restart the rqworker process.

funzoneq commented 5 years ago

@lampwins I can confirm that this resolved the issue above.

hellosteadman commented 5 years ago

Unfortunately not... I'm having the same issue with "waitpid returned 256" and have confirmed it's not memory-related, nor is it to do with the Redis and RQ versions (I have the above versions).

This happens under Kubernetes on DigitalOcean, not locally (same Docker image on both).

Would really welcome some guidance here, thanks!

awlx commented 5 years ago

Just hit the same problem. With the docker image from https://hub.docker.com/r/ninech/netbox/ and netbox running on K8s after upgrade from 2.4.6 to 2.4.8.

Error: https://paste.fedoraproject.org/paste/gbbhfe2V3n-dBaLL3gQ8fg

So the error indeed seems to be related to the upstream issue.