splunk / splunk-sdk-python

Splunk Software Development Kit for Python
http://dev.splunk.com
Apache License 2.0

xml parser error when streaming results via export function #370

Closed brummelbaer closed 3 years ago

brummelbaer commented 3 years ago

Describe the bug Hi everyone, I'm encountering an intermittent bug when streaming data from the risk index to a script via the Splunk SDK. The stream suddenly breaks and the script exits with an XML parsing error, as shown below. Unfortunately, the bug does not happen every time and I cannot reproduce it consistently, but the following code snippet fails in roughly 10% of executions.

Any help is appreciated :)

To Reproduce Steps to reproduce the behavior:

  1. Execute the following script snippet:
import splunklib.results as results
import splunklib.client as client

host='127.0.0.1'
port=8089
app='search'
username='user'
pw=''
scheme='https'
verify=False
autologin=True

service = client.Service(host=host, port=port, password=pw, app=app, username=username, scheme=scheme, verify=verify, autologin=autologin)
search_query = "search index=risk earliest=-24h@h latest=now | eval risk_time=_time | rename search_name as risk_search | rename orig_event_hash as risk_hash | table risk_object, risk_hash, risk_time, risk_score, risk_search, _raw"
search_args = {'enable_lookups': False, 'adhoc_search_level': 'fast'}
result_reader = results.ResultsReader(service.jobs.export(search_query, **search_args))
for result in result_reader:
    if isinstance(result, dict):   
        print(result)

Expected behavior I expect all results to be returned and the API not to crash when querying a simple index.

Logs or Screenshots

Traceback (most recent call last):
  File "demo.py", line 16, in <module>
    for result in result_reader:
  File "/opt/splunk-i-es/.local/lib/python3.6/site-packages/splunklib/results.py", line 210, in next
    return next(self._gen)
  File "/opt/splunk-i-es/.local/lib/python3.6/site-packages/splunklib/results.py", line 219, in _parse_results
    for event, elem in et.iterparse(stream, events=('start', 'end')):
  File "/usr/lib64/python3.6/xml/etree/ElementTree.py", line 1221, in iterator
    yield from pullparser.read_events()
  File "/usr/lib64/python3.6/xml/etree/ElementTree.py", line 1296, in read_events
    raise event
  File "/usr/lib64/python3.6/xml/etree/ElementTree.py", line 1268, in feed
    self._parser.feed(data)
xml.etree.ElementTree.ParseError: mismatched tag: line 7862, column 2
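
The traceback shows the error surfacing inside `et.iterparse` while the stream is still being consumed. The following is a minimal, standard-library-only sketch of that behavior, not the SDK's actual code: `read_results`, `collect`, and the `BROKEN` payload are hypothetical names and data made up for illustration. It assumes the export stream contained a closing tag that did not match its opening tag, and shows that the results parsed before that point are still delivered before `ParseError` is raised.

```python
import io
import xml.etree.ElementTree as et


def read_results(stream):
    """Yield the text of each <result> element as soon as it is closed."""
    for _event, elem in et.iterparse(stream, events=('end',)):
        if elem.tag == 'result':
            yield elem.text
            elem.clear()  # release the element's contents once it has been consumed


# Hypothetical payload: the second <result> is closed with the wrong tag.
BROKEN = b"<results>\n<result>a</result>\n<result>b</field>\n</results>"


def collect(stream):
    """Gather results until the stream ends or the XML turns out to be malformed."""
    rows, error = [], None
    try:
        for row in read_results(stream):
            rows.append(row)
    except et.ParseError as exc:
        # Rows parsed before the malformed section are kept in `rows`.
        error = exc
    return rows, error


rows, error = collect(io.BytesIO(BROKEN))
print("parsed before failure:", rows)
print("parser error:", error)
```

Catching `xml.etree.ElementTree.ParseError` around the `for result in result_reader:` loop in the same way would at least let a script keep the results received before the failure, although it does not explain why the stream was malformed in the first place.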

Splunk (please complete the following information):

SDK (please complete the following information):

akaila-splunk commented 3 years ago

Hi @brummelbaer, we are not able to reproduce this issue on our end. We used a similar setup with SDK versions 1.6.14, 1.6.15 (the version you mentioned), and 1.6.16 (latest). Please try the latest versions of the SDK (v1.6.16) and Splunk (v8.2.x), and let us know if you still encounter this issue.

ashah-splunk commented 3 years ago

@brummelbaer Closing this issue due to lack of response. Please reopen it if this is still a problem.