mbegan / Okta-Identity-Cloud-for-Splunk

Public REPO for splunkbase app
https://splunkbase.splunk.com/app/3682/

double-byte / non-ascii characters escaped by json.dumps #28

Open mbegan opened 3 years ago

mbegan commented 3 years ago

https://github.com/mbegan/Okta-Identity-Cloud-for-Splunk/blob/b68a785c0cdc49a0be1db4f940b92634f94cd60b/bin/input_module_okta_identity_cloud.py#L205

As referenced in a Splunk Community post, non-ASCII characters are being escaped prior to indexing, which causes search problems.

mbegan commented 3 years ago

The Python json documentation indicates that, by default, json.dumps escapes non-ASCII characters:

If ensure_ascii is true (the default), the output is guaranteed to have all incoming non-ASCII characters escaped. If ensure_ascii is false, these characters will be output as-is.
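To make the quoted behavior concrete, here is a minimal sketch using only the standard library. The payload and its field names are made up for illustration and are not taken from the add-on.

```python
import json

# Hypothetical event payload; the field names are illustrative only.
item = {"displayName": "山田 太郎", "city": "Zürich"}

# ensure_ascii=True is the default: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(item)

# ensure_ascii=False writes the characters as-is, so the text stays searchable.
readable = json.dumps(item, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings decode back to the same object; only the serialized text differs.
assert json.loads(escaped) == json.loads(readable) == item
```

The escaped form is still valid JSON, but a literal search for the original characters will not match the escaped text, which is consistent with the search problems described above.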

The proposed fix adds a configuration parameter that lets a Splunk administrator toggle this default escaping of non-ASCII characters.

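# Read the admin-configurable toggle. Note: bool() treats any non-empty string, including "false", as True.
json_ensure_ascii = bool(_getSetting('json_ensure_ascii'))
# With ensure_ascii=False, non-ASCII characters are written as-is instead of as \uXXXX escapes.
data = json.dumps(item, ensure_ascii=json_ensure_ascii)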
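One thing to watch with the snippet above: in Python, bool() of any non-empty string is True, so a setting stored as the text "false" or "0" would still enable escaping. Whether _getSetting returns a string or a real boolean is not stated in this thread, so the following is only a sketch under the assumption that it may return text. The names parse_bool_setting and serialize_event are hypothetical and not part of the add-on.

```python
import json

def parse_bool_setting(value, default=True):
    """Interpret a raw setting value as a boolean (hypothetical helper).

    Falls back to `default` (True, matching json.dumps) when the value is
    missing, empty, or not recognized.
    """
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in ("1", "true", "t", "yes", "y", "on"):
        return True
    if text in ("0", "false", "f", "no", "n", "off"):
        return False
    return default

def serialize_event(item, raw_setting):
    """Serialize an event, honoring the ensure_ascii toggle (illustrative only)."""
    return json.dumps(item, ensure_ascii=parse_bool_setting(raw_setting))

# Usage: "false" now disables escaping, which bool("false") would not.
print(serialize_event({"name": "é"}, "false"))
print(serialize_event({"name": "é"}, "true"))
```

Defaulting to True keeps the current behavior for existing installations unless an administrator explicitly turns escaping off.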