jertel / elastalert2

ElastAlert 2 is a continuation of the original yelp/elastalert project. Pull requests are appreciated!
https://elastalert2.readthedocs.org
Apache License 2.0

Not Sending PagerDuty Alerts Due to UnicodeEncodeError #1077

Closed: JohnTrapp closed 1 year ago

JohnTrapp commented 1 year ago

Logs

File "/usr/local/lib/python3.10/site-packages/elastalert/elastalert.py", line 1298, in alert
    return self.send_alert(matches, rule, alert_time=alert_time, retried=retried)
  File "/usr/local/lib/python3.10/site-packages/elastalert/elastalert.py", line 1375, in send_alert
    alert.alert(matches)
  File "/usr/local/lib/python3.10/site-packages/elastalert/alerters/pagerduty.py", line 94, in alert
    response = requests.post(
  File "/usr/local/lib/python3.10/site-packages/requests/api.py", line 117, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 529, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/requests/adapters.py", line 440, in send
    resp = conn.urlopen(
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/usr/local/lib/python3.10/site-packages/urllib3/connectionpool.py", line 398, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.10/site-packages/urllib3/connection.py", line 239, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/usr/local/lib/python3.10/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.10/http/client.py", line 1327, in _send_request
    body = _encode(body, 'body')
  File "/usr/local/lib/python3.10/http/client.py", line 166, in _encode
    raise UnicodeEncodeError(
UnicodeEncodeError: 'latin-1' codec can't encode character '\u21e2' in position 1621: Body ('⇢') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.

Steps to reproduce

  1. Generate a log with a message containing the character ⇢ (U+21E2) and include that message in an alert to PagerDuty.
  2. Note that the alert does not get sent to PagerDuty, effectively dropping the alert.
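
The failure can be reproduced without PagerDuty or a network connection. The sketch below assumes, based on the traceback above, that the request body reaches http.client as a str and is therefore encoded as Latin-1; the payload and its field name are made up for illustration.

```python
import json

# Hypothetical alert payload; the field name is illustrative only.
payload = {"description": "reactor stack trace \u21e2 at Mono.map"}

# With ensure_ascii=False the arrow stays a literal character in the str.
body = json.dumps(payload, ensure_ascii=False)

error = None
try:
    # http.client encodes a str request body as Latin-1 before sending it.
    body.encode("latin-1")
except UnicodeEncodeError as exc:
    error = exc
    print(f"cannot send as str: {exc}")

# Encoding explicitly yields bytes, which need no further encoding.
encoded = body.encode("utf-8")
print(encoded)
```

The explicit encode is the same remedy the error message itself suggests.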

More info

This is happening with our Spring Java services that use Reactor, as their stack traces contain that character. Encoding aside, elastalert2 should catch and handle all exceptions raised while sending a POST, instead of only checking the response code.
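
A minimal sketch of that second point, not the project's actual code: the wrapper catches anything raised while sending, including a UnicodeEncodeError raised before the request leaves the process. The names send_alert and AlertSendError are hypothetical, and post is assumed to be any callable shaped like requests.post(url, data=...).

```python
import logging

logger = logging.getLogger("alert_sketch")


class AlertSendError(Exception):
    """Hypothetical error type standing in for the alerter's own exception."""


def send_alert(post, url, body):
    """Send body with post(url, data=body) and surface every failure.

    `post` is assumed to behave like requests.post and to return an object
    with a raise_for_status() method.
    """
    try:
        response = post(url, data=body)
        response.raise_for_status()
    except Exception as exc:
        # Covers HTTP errors, connection errors, and encoding errors alike.
        logger.error("Error posting alert to %s: %s", url, exc)
        raise AlertSendError(f"Error posting alert: {exc}") from exc
    return response
```

Raising a single error type lets the caller log or retry the alert instead of dropping it silently.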

nsano-rururu commented 1 year ago

@JohnTrapp

You may need to:

Fix UnicodeEncodeError in PagerDutyAlerter https://github.com/Yelp/elastalert/pull/3182

If you have an environment where you can use PagerDuty, could you apply the fix, check that it works, and submit a pull request if there is no problem? The file to be modified is elastalert/alerters/pagerduty.py.

nsano-rururu commented 1 year ago

@JohnTrapp

Please reply first to say whether you can handle it. If you can't do it, ask someone else.

JohnTrapp commented 1 year ago

I think I can handle it, but it'll have to wait a few days. My only concern is that this probably affects more than the PagerDuty alert. Probably all alerts that work via POST request have this bug.

nsano-rururu commented 1 year ago

@ferozsalam @jertel

Should UTF-8 be hard-coded as the encoding passed to requests.post? I think it could also be made configurable from the rule, as follows. Could you give us your opinion?

schema.yaml

pagerduty_encode: {type: string}

pagerduty.py

self.pagerduty_encode = self.rule.get('pagerduty_encode', 'utf-8')
data=json.dumps(payload, cls=DateTimeEncoder, ensure_ascii=False).encode(self.pagerduty_encode),
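
The two lines above could be combined roughly as follows. This is only a sketch under assumptions: build_body is a hypothetical helper, rule is a plain dict, and the DateTimeEncoder here is a stand-in for the one the alerter already imports.

```python
import json
from datetime import datetime


class DateTimeEncoder(json.JSONEncoder):
    """Stand-in encoder: serializes datetime values as ISO 8601 strings."""

    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)


def build_body(rule, payload):
    """Serialize payload and encode it with the rule's configured encoding."""
    # 'pagerduty_encode' is the proposed option; it falls back to UTF-8.
    encoding = rule.get("pagerduty_encode", "utf-8")
    text = json.dumps(payload, cls=DateTimeEncoder, ensure_ascii=False)
    return text.encode(encoding)


body = build_body({}, {"message": "test=some\u21e2"})
print(body)
```

Because the result is bytes, the HTTP layer sends it unchanged rather than trying to encode it as Latin-1.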

ferozsalam commented 1 year ago

Some thoughts on this in no particular order:

Again, internationalization and encodings are not my area of expertise, so I'm unsure if more configurability is required.

nsano-rururu commented 1 year ago

If .encode("utf-8") is inserted into each alerter individually, the following files would need to be modified:

datadog.py
line.py
pagertree.py
gitter.py
chatwork.py
servicenow.py
victorops.py
httppost.py
googlechat.py
telegram.py
discord.py
dingtalk.py
alertmanager.py
gelf.py
httppost2.py
teams.py
rocketchat.py
alerta.py
thehive.py
mattermost.py
pagerduty.py
slack.py
opsgenie.py

@ferozsalam

Where would I make the UTF-8 mode change you mention below? I don't usually use Python, so I'm not familiar with it either.

Python 3.7 onwards allows us to set the use of UTF-8 globally. This might be something to investigate?

nsano-rururu commented 1 year ago

Set PYTHONUTF8=1 in environment variable?

ferozsalam commented 1 year ago

Set PYTHONUTF8=1 in environment variable?

For Docker and Kubernetes, I think this would be the way to do it – we can update the default Docker image to set this by default.

For use as a Python package without the Docker wrapper, maybe we can recommend in the documentation that users pass -X utf8 when running the package?

Either way, would like to hear what @jertel thinks about this idea.

nsano-rururu commented 1 year ago

@JohnTrapp

Please hold off on replying for now, because the maintainers are still discussing how to respond.

jertel commented 1 year ago

Thank you all for contributing to this discussion! I've run a few tests and posted the results below:

$ python3
Type "help", "copyright", "credits" or "license" for more information.
>>> import json; json.dumps("test=some⇢",  ensure_ascii=False)
'"test=some⇢"'
>>> import json; json.dumps("test=some⇢",  ensure_ascii=False).encode('utf-8')
b'"test=some\xe2\x87\xa2"'
>>> import sys; sys.flags.utf8_mode
0
>>> quit()

$ python3 -X utf8
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys; sys.flags.utf8_mode
1
>>> import json; json.dumps("test=some⇢",  ensure_ascii=False)
'"test=some⇢"'
>>> import json; json.dumps("test=some⇢",  ensure_ascii=False).encode('utf-8')
b'"test=some\xe2\x87\xa2"'
>>> 

Based on these findings we can conclude that the global setting will not help us here. Each alerter will need to be modified to encode the request body to UTF-8.
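
The same conclusion can be checked from a script. The sketch below starts a child interpreter with -X utf8 and confirms that UTF-8 mode is on while encoding the character as Latin-1 (which, per the traceback, is what http.client does to a str body) still fails. The child script is illustrative only.

```python
import subprocess
import sys

# Code run in the child interpreter; the escape is resolved by the child.
child_code = r'''
import sys
print(sys.flags.utf8_mode)
try:
    "\u21e2".encode("latin-1")
    print("encoded")
except UnicodeEncodeError:
    print("UnicodeEncodeError")
'''

result = subprocess.run(
    [sys.executable, "-X", "utf8", "-c", child_code],
    capture_output=True,
    text=True,
    check=True,
)
lines = result.stdout.split()
print(lines)
```

UTF-8 mode changes default encodings for files and standard streams, but an explicit Latin-1 encode is unaffected by it.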

nsano-rururu commented 1 year ago

File             Modification place
alerta.py        data=alerta_payload,
alertmanager.py  data=json.dumps([payload], cls=DateTimeEncoder),
datadog.py       data=json.dumps(payload, cls=DateTimeEncoder),
dingtalk.py      cls=DateTimeEncoder),
discord.py       data=json.dumps(data),
gitter.py        data=json.dumps(payload, cls=DateTimeEncoder),
googlechat.py    data=json.dumps(message),
httppost.py      data=json.dumps(payload, cls=DateTimeEncoder),
httppost2.py     data=json.dumps(payload, cls=DateTimeEncoder),
mattermost.py    data=json.dumps(payload, cls=DateTimeEncoder),
pagerduty.py     data=json.dumps(payload, cls=DateTimeEncoder, ensure_ascii=False),
pagertree.py     data=json.dumps(payload, cls=DateTimeEncoder),
rocketchat.py    data=json.dumps(payload, cls=DateTimeEncoder),
servicenow.py    data=json.dumps(payload, cls=DateTimeEncoder),
slack.py         data=json.dumps(payload, cls=DateTimeEncoder),
teams.py         data=json.dumps(payload, cls=DateTimeEncoder),
telegram.py      data=json.dumps(payload, cls=DateTimeEncoder),
thehive.py       data=alert_body,
victorops.py     data=json.dumps(payload, cls=DateTimeEncoder),

chatwork.py, line.py

body ↓ body.encode('utf-8') ?

opsgenie.py

Since opsgenie.py passes json=post and post is an array, should each value be encoded with .encode('utf-8') before it is set in the array?

gelf.py

gelf.py already encodes its payload as UTF-8, so no modification is required.

Line 60:

bytes_msg = json.dumps(gelf_msg).encode('utf-8') + b'\x00'

jertel commented 1 year ago

I plan to release the next version of ElastAlert 2 later this week. I'm mentioning it only in case this is important enough that you might want to submit the PR before then, to have it included.

nsano-rururu commented 1 year ago

The UnicodeEncodeError occurs when json.dumps() is called with the ensure_ascii=False argument and the data contains non-ASCII characters, because the resulting string is then sent as the request body without being encoded. Only pagerduty.py uses the ensure_ascii=False argument, so only pagerduty.py is scheduled to be modified. If I have time this week, I'll fix it and open a pull request.
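
The difference between the two settings can be seen with the standard library alone. In this sketch, latin1_safe is a hypothetical helper that mimics the Latin-1 encode applied to a str request body; the sample string is the one used in the tests earlier in this thread.

```python
import json

message = "test=some\u21e2"

# Default (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(message)
# ensure_ascii=False: non-ASCII characters are kept as literal characters.
literal = json.dumps(message, ensure_ascii=False)


def latin1_safe(text):
    """Return True if text survives the Latin-1 encode applied to str bodies."""
    try:
        text.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


print(escaped, latin1_safe(escaped))
print(literal, latin1_safe(literal))
```

Both forms decode to the same value with json.loads, so the alerters that keep the default setting send pure-ASCII bodies and do not hit the error.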