StackStorm / st2

StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html
https://stackstorm.com/
Apache License 2.0

MongoDB can only handle up to 8-byte ints #3365

Open hagay3 opened 7 years ago

hagay3 commented 7 years ago

Hi, I have an issue with StackStorm. I configured a webhook so that an HTTP request triggers a StackStorm action, but there is a parse error. It doesn't seem to be a Mongo error; maybe it's an internal client error. OverflowError: MongoDB can only handle up to 8-byte ints

The full log, from st2rulesengine.log:

2017-04-19 09:41:38,818 140175459950832 ERROR consumers [-] StagedQueueConsumer failed to process message: {'trace_context': <st2common.models.api.trace.TraceContext object at 0x7f7d2455a590>, 'trigger': {'uid': u'trigger:core:940aea90-0f2e-4257-b7df-10b0b6994567:39ad46681f6d890b5585141e7a3c4252', 'parameters': {u'url': u'******_one_cluster_node_down_PD'}, 'name': u'940aea90-0f2e-4257-b7df-10b0b6994567', 'type': u'core.st2.webhook', 'id': '58dd19eea18bd533a505d984', 'pack': u'core'}, 'payload': {'body': {'status': 'firing', 'groupLabels': {'cluster': 'none', 'alertname': '******_one_cluster_node_down_PD_ST', 'instance': '*******-******.******.******'}, 'groupKey': 11168737559087072609L, 'commonAnnotations': {'description': '[StackStorm] ****** node is down in cluster ******-*** at datacenter ****** https://***********', 'summary': '[StackStorm] One main ****** cluster node is down ********* at datacenter ******'}, 'alerts': [{'status': 'firing', 'labels': {'datacenter': '******', 'hook_url': '***********', 'target': '*********', 'cluster': 'none', 'environment': 'none', 'instance': '*****-******.******.******', 'job': 'node-******-******', 'servicename': '******-******', 'monitor_source': '******.nydc1', 'alertname': '******_one_cluster_node_down_PD_ST', 'monitor_owner': '******', 'owner': '******-******', 'action': 'none'}, 'endsAt': '0001-01-01T00:00:00Z', 'generatorURL': '*************', 'startsAt': '2017-04-19T08:48:38.59-04:00', 'annotations': {'description': '[StackStorm] ****** node is down in cluster ******-****** at datacenter ****** https://********.******/dashboard/db/ops-data-******-overview?from=now-5m&to=now&var-instance=*******-******.******.******', 'summary': '[StackStorm] One main ****** cluster node is down ******-****** at datacenter ******'}}], 'version': '3', 'receiver': 'webhook_router', 'externalURL': 'http://alert-40005-prod-******.******.******:9093', 'commonLabels': {'datacenter': '******', 'hook_url': 
'https://www*********/api/v1/webhooks/******_one_cluster_node_down_PD?st2-api-key=******', 'target': '******', 'cluster': 'none', 'environment': 'none', 'instance': '*****', 'job': 'node-******-******', 'servicename': '******-******', 'monitor_source': '******.nydc1', 'alertname': '******_one_cluster_node_down_PD_ST', 'monitor_owner': '******', 'owner': '******-******', 'action': 'none'}}, 'headers': {'X-Request-Id': '327fb4aa-4b1c-4e99-97b2-a6e89703e85e', 'Accept-Encoding': 'gzip', 'X-Forwarded-For': '************', 'Content-Length': '2610', 'User-Agent': 'Go-http-client/1.1', 'Host': '******.******,*******.******', 'X-Real-Ip': '************', 'Content-Type': 'application/json'}}}
Traceback (most recent call last):
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/st2common/transport/consumers.py", line 85, in process
    response = self._handler.pre_ack_process(body)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/st2reactor/rules/worker.py", line 56, in pre_ack_process
    raise_on_no_trigger=True)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/st2reactor/container/utils.py", line 80, in create_trigger_instance
    return TriggerInstance.add_or_update(trigger_instance)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/st2common/persistence/base.py", line 173, in add_or_update
    model_object = cls._get_impl().add_or_update(model_object)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/st2common/models/db/__init__.py", line 313, in add_or_update
    instance.save()
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/mongoengine/document.py", line 340, in save
    object_id = collection.save(doc, **write_concern)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/pymongo/collection.py", line 2445, in save
    check_keys, manipulate, write_concern)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/pymongo/collection.py", line 562, in _insert
    check_keys, manipulate, write_concern, op_id, bypass_doc_val)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/pymongo/collection.py", line 543, in _insert_one
    check_keys=check_keys)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/pymongo/pool.py", line 424, in command
    self._raise_connection_failure(error)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/pymongo/pool.py", line 552, in _raise_connection_failure
    raise error
OverflowError: MongoDB can only handle up to 8-byte ints

estee-tew commented 7 years ago

https://stackstorm.slack.com/archives/community/p1492666517957427

estee-tew commented 7 years ago

https://stackstorm.slack.com/archives/community/p1492666563963757

hagay3 commented 7 years ago

I eventually ended up adding a Lua script to nginx to remove the key.

LindsayHill commented 7 years ago

@hagay3 are you able to provide an example of a webhook & rule that triggers this?

hagay3 commented 7 years ago

@LindsayHill I don't have it all, but you can submit a payload containing a very large integer and you will get the above exception. An example integer: 10355138598665546921
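For context, here is a minimal sketch of why this particular value trips the error. It assumes that the "8-byte ints" in the message refers to BSON's signed 64-bit integer type; the helper name `fits_bson_int64` is made up for illustration and is not part of StackStorm or PyMongo.

```python
# Bounds of a signed 64-bit integer, the largest integer type BSON stores.
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def fits_bson_int64(value):
    """Hypothetical helper: True if value fits in a signed 8-byte integer."""
    return INT64_MIN <= value <= INT64_MAX


payload_value = 10355138598665546921  # the example integer from this thread

print(fits_bson_int64(payload_value))  # False
print(payload_value.bit_length())      # 64
```

The value needs 64 bits for its magnitude alone, so it would fit an unsigned 8-byte integer but not a signed one, which leaves no room for the sign bit.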

LindsayHill commented 7 years ago

OK, here's a way to reproduce it:

cat /opt/stackstorm/packs/examples/rules/sample_rule_with_webhook.yaml

---
    name: "sample_rule_with_webhook"
    pack: "examples"
    description: "Sample rule dumping webhook payload to a file."
    enabled: true

    trigger:
        type: "core.st2.webhook"
        parameters:
            url: "sample"

    criteria:
        trigger.body.name:
            pattern: "st2"
            type: "equals"

    action:
        ref: "core.local"
        parameters:
            cmd: "echo \"{{trigger.body}}\" >> ~/st2.webhook_sample.out"

curl -k https://localhost/api/v1/webhooks/sample -d '{"foo": 10355138598665546921, "name": "st2"}' -H 'Content-Type: application/json' -H 'X-Auth-Token: ma-token'

Results in errors like this in st2rulesengine.log:

2017-06-20 23:36:41,885 140689684709936 ERROR consumers [-] StagedQueueConsumer failed to process message: {'trace_context': <st2common.models.api.trace.TraceContext object at 0x7ff4de6b0f90>, 'trigger': {'uid': u'trigger:core:543179dc-de8b-4e9c-a551-f73ec9feb38a:e28a4c1e331b040397255d188bf7ad86', 'parameters': {u'url': u'sample'}, 'ref': u'core.543179dc-de8b-4e9c-a551-f73ec9feb38a', 'pack': u'core', 'type': u'core.st2.webhook', 'id': '594964fcbb0f184c0ecbbb0b', 'name': u'543179dc-de8b-4e9c-a551-f73ec9feb38a'}, 'payload': {'body': {u'foo': 10355138598665546921L, u'name': u'st2'}, 'headers': {'X-Request-Id': '1e5960fb-35bc-41f9-bc8a-77b24744d717', 'X-Forwarded-For': '127.0.0.1', 'Content-Length': '44', 'Accept': '*/*', 'User-Agent': 'curl/7.35.0', 'Host': 'localhost,localhost', 'X-Real-Ip': '127.0.0.1', 'X-Auth-Token': '602d83b6f5864071812a0f6992c67ff2', 'Content-Type': 'application/json'}}}
Traceback (most recent call last):
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/st2common/transport/consumers.py", line 85, in process
    response = self._handler.pre_ack_process(body)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/st2reactor/rules/worker.py", line 56, in pre_ack_process
    raise_on_no_trigger=True)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/st2reactor/container/utils.py", line 80, in create_trigger_instance
    return TriggerInstance.add_or_update(trigger_instance)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/st2common/persistence/base.py", line 173, in add_or_update
    model_object = cls._get_impl().add_or_update(model_object)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/st2common/models/db/__init__.py", line 314, in add_or_update
    instance.save()
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/mongoengine/document.py", line 340, in save
    object_id = collection.save(doc, **write_concern)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/pymongo/collection.py", line 2445, in save
    check_keys, manipulate, write_concern)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/pymongo/collection.py", line 562, in _insert
    check_keys, manipulate, write_concern, op_id, bypass_doc_val)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/pymongo/collection.py", line 543, in _insert_one
    check_keys=check_keys)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/pymongo/pool.py", line 424, in command
    self._raise_connection_failure(error)
  File "/opt/stackstorm/st2/local/lib/python2.7/site-packages/pymongo/pool.py", line 552, in _raise_connection_failure
    raise error
OverflowError: MongoDB can only handle up to 8-byte ints

LindsayHill commented 7 years ago

A possible workaround is to send the value as a string, e.g. this works:

curl -k https://localhost/api/v1/webhooks/sample -d '{"foo": "10355138598665546921", "name": "st2"}' -H 'Content-Type: application/json' -H 'X-Auth-Token: 602d83b6f5864071812a0f6992c67ff2'

Not sure whether we can deal with this in the standard webhook sensor. It probably requires a custom sensor.
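If the sender cannot be changed, the same cast-to-string idea could be applied to an already parsed payload before it is stored. The following is only a sketch under that assumption: `stringify_oversized_ints` is a hypothetical helper, not an existing StackStorm function, and where it would hook into a sensor is left open.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def stringify_oversized_ints(value):
    """Hypothetical helper: recursively replace integers that do not fit a
    signed 64-bit integer with their decimal string form."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; leave it untouched.
        return value
    if isinstance(value, int):
        return value if INT64_MIN <= value <= INT64_MAX else str(value)
    if isinstance(value, dict):
        return {key: stringify_oversized_ints(item) for key, item in value.items()}
    if isinstance(value, list):
        return [stringify_oversized_ints(item) for item in value]
    return value


body = {"foo": 10355138598665546921, "name": "st2", "alerts": [{"count": 3}]}
safe_body = stringify_oversized_ints(body)
print(safe_body["foo"])  # 10355138598665546921 (now a str)
```

Integers that already fit are left as numbers, so rule criteria that compare ordinary numeric fields would keep working; only the oversized values change type.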

hagay3 commented 7 years ago

@LindsayHill, you are right, this is how to reproduce the bug. The big integer comes from an automated system, so I can't change it before it reaches StackStorm. I added a Lua script for nginx that removes the key to work around that.

Back in the day I looked at the code, and there is a place where you parse the payload values. If you parse the value correctly there, the issue should be fixed.

And Mongo can handle such values: https://docs.mongodb.com/manual/core/shell-types/#numberlong
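One way to read the "parse the value correctly" suggestion is to deal with oversized integers while the JSON body is decoded. The sketch below uses the standard library's `json.loads` with its `parse_int` hook; the hook function `parse_int_bson_safe` is hypothetical, and this is not how StackStorm's webhook code is known to work. Note that NumberLong is a signed 64-bit type, so the integer from this thread is still outside its range and is kept as a string here.

```python
import json

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def parse_int_bson_safe(text):
    """Hypothetical parse_int hook: keep integers that fit a signed 64-bit
    integer as int, and keep larger ones as their original string."""
    number = int(text)
    return number if INT64_MIN <= number <= INT64_MAX else text


raw = '{"foo": 10355138598665546921, "name": "st2", "count": 3}'
body = json.loads(raw, parse_int=parse_int_bson_safe)
print(body)
```

Doing this at parse time avoids a second pass over the payload, at the cost of the field changing type depending on its magnitude.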