Closed JohnTrapp closed 1 year ago
@JohnTrapp
You may need to:
Fix UnicodeEncodeError in PagerDutyAlerter https://github.com/Yelp/elastalert/pull/3182
If you have an environment where you can use PagerDuty, could you fix it, check it works, and submit a pull request if there is no problem? The files to be modified are: elastalert/alerters/pagerduty.py
@JohnTrapp
Please reply first to see if you can handle it. If you can't do it, ask someone else.
I think I can handle it, but it'll have to wait a few days. My only concern is that this probably affects more than the PagerDuty alert. Probably all alerts that work via a POST request have this bug.
@ferozsalam @jertel
Should utf-8 be added as a fixed encoding to requests.post? I think it is also possible to make the encoding configurable externally, as follows. Could you give us your opinion?
schema.yaml
pagerduty_encode: {type: string}
pagerduty.py
self.pagerduty_encode = self.rule.get('pagerduty_encode', 'utf-8')
data=json.dumps(payload, cls=DateTimeEncoder, ensure_ascii=False).encode(self.pagerduty_encode),
Some thoughts on this in no particular order:
utf-8 by default.
Again, internationalization and encodings are not an area of expertise, so I'm unsure if more configurability is required.
When inserting .encode("utf-8") individually, the following files are subject to modification:
datadog.py
line.py
pagertree.py
gitter.py
chatwork.py
servicenow.py
victorops.py
httppost.py
googlechat.py
telegram.py
discord.py
dingtalk.py
alertmanager.py
gelf.py
httppost2.py
teams.py
rocketchat.py
alerta.py
thehive.py
mattermost.py
pagerduty.py
slack.py
opsgenie.py
@ferozsalam
Where do I make the following UTF-8 mode changes? I don't usually use Python, so I'm not familiar with it either.
Python 3.7 onwards allows us to set the use of UTF-8 globally. This might be something to investigate?
Set PYTHONUTF8=1 in an environment variable?
For Docker and Kubernetes, I think this would be the way to do it – we can update the default Docker image to set this by default.
For use as a Python package without the Docker wrapper, maybe we can recommend in the documentation that users set -X utf8 when running the package?
Either way, would like to hear what @jertel thinks about this idea.
@JohnTrapp
Please hold off on responding for now, because the maintainers are still discussing how to handle this.
Thank you all for contributing to this discussion! I've run a few tests and posted the results below:
$ python3
Type "help", "copyright", "credits" or "license" for more information.
>>> import json; json.dumps("test=some⇢", ensure_ascii=False)
'"test=some⇢"'
>>> import json; json.dumps("test=some⇢", ensure_ascii=False).encode('utf-8')
b'"test=some\xe2\x87\xa2"'
>>> import sys; sys.flags.utf8_mode
0
>>> quit()
$ python3 -X utf8
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys; sys.flags.utf8_mode
1
>>> import json; json.dumps("test=some⇢", ensure_ascii=False)
'"test=some⇢"'
>>> import json; json.dumps("test=some⇢", ensure_ascii=False).encode('utf-8')
b'"test=some\xe2\x87\xa2"'
>>>
Based on these findings we can conclude that the global setting will not help us here. Each alerter will need to be modified to encode the request body to UTF-8.
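For anyone following along, here is a minimal sketch of why encoding the body matters. It assumes that a `str` request body eventually reaches Python's `http.client`, which encodes string bodies as ISO-8859-1 (Latin-1); I have not traced this through every alerter. `build_body` is a hypothetical helper name, not an existing ElastAlert 2 function.

```python
import json


def build_body(payload):
    """Serialize payload to JSON and encode it to UTF-8 bytes before sending."""
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


payload = {"description": "test=some⇢"}

# A str body containing U+21E2 cannot be represented in Latin-1,
# which is what the implicit encoding step would attempt.
text = json.dumps(payload, ensure_ascii=False)
try:
    text.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# Encoding explicitly yields bytes, so no implicit encoding is needed later.
body = build_body(payload)
```

Passing `body` (bytes) as the `data=` argument sidesteps the implicit encoding step. The same one-line change would apply to each alerter listed in the table.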
file | Modification place |
---|---|
alerta.py | data=alerta_payload, |
alertmanager.py | data=json.dumps([payload], cls=DateTimeEncoder), |
datadog.py | data=json.dumps(payload, cls=DateTimeEncoder), |
dingtalk.py | cls=DateTimeEncoder), |
discord.py | data=json.dumps(data), |
gitter.py | data=json.dumps(payload, cls=DateTimeEncoder), |
googlechat.py | data=json.dumps(message), |
httppost.py | data=json.dumps(payload, cls=DateTimeEncoder), |
httppost2.py | data=json.dumps(payload, cls=DateTimeEncoder), |
mattermost.py | data=json.dumps(payload, cls=DateTimeEncoder), |
pagerduty.py | data=json.dumps(payload, cls=DateTimeEncoder, ensure_ascii=False), |
pagertree.py | data=json.dumps(payload, cls=DateTimeEncoder), |
rocketchat.py | data=json.dumps(payload, cls=DateTimeEncoder), |
servicenow.py | data=json.dumps(payload, cls=DateTimeEncoder), |
slack.py | data=json.dumps(payload, cls=DateTimeEncoder), |
teams.py | data=json.dumps(payload, cls=DateTimeEncoder), |
telegram.py | data=json.dumps(payload, cls=DateTimeEncoder), |
thehive.py | data=alert_body, |
victorops.py | data=json.dumps(payload, cls=DateTimeEncoder), |
chatwork.py, line.py
body → body.encode('utf-8') ?
opsgenie.py
Since opsgenie.py posts with json=post and post is an array, should each value be encoded with .encode('utf-8') before it is set in the array?
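On the per-value question, a small sketch with the standard `json` module may help. The `post` contents here are made up for illustration, and this does not show what `requests` does internally with `json=`. It only shows that values encoded to bytes can no longer be serialized, whereas encoding the whole serialized document once works.

```python
import json

# Hypothetical payload, loosely shaped like an alert body.
post = {"message": "test=some⇢", "tags": ["reactor"]}

# Encoding an individual value turns it into bytes,
# which the json module refuses to serialize.
encoded_values = {"message": post["message"].encode("utf-8")}
try:
    json.dumps(encoded_values)
    per_value_ok = True
except TypeError:
    per_value_ok = False

# Serializing first and encoding the whole document once works.
body = json.dumps(post, ensure_ascii=False).encode("utf-8")
```

So if a change is needed there at all, it would be on the serialized body rather than on each value.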
gelf.py
gelf.py already encodes its message as UTF-8, so no modification is required. See line 60:
bytes_msg = json.dumps(gelf_msg).encode('utf-8') + b'\x00'
I plan to release the next version of ElastAlert 2 later this week. I'm mentioning it only in case this is important enough that you might want to submit the PR before then, to have it included.
A UnicodeEncodeError is raised when the request body is built by json.dumps() with the ensure_ascii=False argument and the data contains non-ASCII characters such as ⇢. Only pagerduty.py uses the ensure_ascii=False argument, so only pagerduty.py needs to be modified. If I have time this week, I'll fix it and open a pull request.
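To illustrate the difference, here is a short sketch using only the standard `json` module. With the default `ensure_ascii=True`, non-ASCII characters are written as `\uXXXX` escapes, so the resulting string is pure ASCII. With `ensure_ascii=False`, the character is kept as-is and the string must be encoded explicitly before sending.

```python
import json

message = "test=some⇢"

# Default behaviour: non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(message)

# ensure_ascii=False keeps the original character in the output string.
raw = json.dumps(message, ensure_ascii=False)

# Both forms decode back to the same value on the receiving side.
round_trip_equal = json.loads(escaped) == json.loads(raw) == message
```

This is why the alerters that call `json.dumps` without `ensure_ascii=False` should not hit the same error.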
Logs
Steps to reproduce
⇢ character and include that message in an alert to PagerDuty.
More info
This is happening with our Spring Java services that use Reactor, as their stack traces use that character. Encoding aside, ElastAlert 2 should catch and handle all exceptions when sending a POST, instead of only looking at the response code.
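A rough sketch of what that handling could look like, under stated assumptions: `AlertDeliveryError` and `send_alert` are hypothetical names, and `post` stands in for any callable with a `requests.post`-like shape (takes `url`, `data`, `headers` and returns an object with a `status_code`). It is not how the existing alerters are written.

```python
import json


class AlertDeliveryError(Exception):
    """Hypothetical error type for a failed alert delivery."""


def send_alert(post, url, payload):
    """Send payload via `post`, turning any failure into AlertDeliveryError."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    headers = {"content-type": "application/json; charset=utf-8"}
    try:
        response = post(url, data=body, headers=headers)
    except Exception as exc:
        # Covers encoding errors, connection errors, timeouts, and so on.
        raise AlertDeliveryError(f"Error posting alert: {exc}") from exc
    if response.status_code >= 400:
        raise AlertDeliveryError(f"Unexpected status code: {response.status_code}")
    return response


def _failing_post(url, data=None, headers=None):
    """Stand-in transport that fails the way the reported bug does."""
    raise UnicodeEncodeError("latin-1", "\u21e2", 0, 1, "ordinal not in range(256)")


try:
    send_alert(_failing_post, "https://example.invalid/alert", {"msg": "test"})
    outcome = "sent"
except AlertDeliveryError as exc:
    outcome = f"handled: {type(exc.__cause__).__name__}"
```

Catching the exception at the send step means an encoding failure is reported as a failed alert rather than escaping from the alerter.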