Closed ashurack closed 1 year ago
@ashurack we were unable to reproduce the issue and successfully uploaded the given CSV file. Below is a screenshot of the result.
Note: we are using a streaming custom search command (CSC) here. No modification is applied to the data read from the CSV file.
Please share your CSC if possible. Also, let us know if we missed anything in the steps to reproduce the issue.
Looks like the CSV file I uploaded is 100% valid UTF-8. I'll try to get a sample that will trigger the decoding issue this week. Message me on Splunk Slack in the meantime for more details.
Hi @ashah-splunk, same error here (Splunk 9.0.1, Debian GNU/Linux 11, Python 3.7.11). Loading events from a CSV doesn't work as expected (non-UTF-8 characters are parsed as UTF-8). I am using the botsv3 dataset and getting the same problem as @ashurack with the vt4splunk streaming command from VT4Splunk. The solution suggested by @ashurack seems to work properly.
Event example:
Error:
search.log:
ERROR ChunkedExternProcessor [1549763 ChunkedExternProcessorStderrLogger] - stderr: UnicodeDecodeError at "/opt/splunk/etc/apps/TA-virustotal-app/bin/ta_virustotal_app/aob_py3/splunklib/six.py", line 917 : 'utf-8' codec can't decode byte 0x8e in position 294568: invalid start byte
ERROR ChunkedExternProcessor [1549763 ChunkedExternProcessorStderrLogger] - stderr: Traceback:
ERROR ChunkedExternProcessor [1549763 ChunkedExternProcessorStderrLogger] - stderr: File "/opt/splunk/etc/apps/TA-virustotal-app/bin/ta_virustotal_app/aob_py3/splunklib/searchcommands/search_command.py", line 780, in _process_protocol_v2
ERROR ChunkedExternProcessor [1549763 ChunkedExternProcessorStderrLogger] - stderr: self._execute(ifile, None)
ERROR ChunkedExternProcessor [1549763 ChunkedExternProcessorStderrLogger] - stderr: File "/opt/splunk/etc/apps/TA-virustotal-app/bin/ta_virustotal_app/aob_py3/splunklib/searchcommands/streaming_command.py", line 55, in _execute
ERROR ChunkedExternProcessor [1549763 ChunkedExternProcessorStderrLogger] - stderr: SearchCommand._execute(self, ifile, self.stream)
ERROR ChunkedExternProcessor [1549763 ChunkedExternProcessorStderrLogger] - stderr: File "/opt/splunk/etc/apps/TA-virustotal-app/bin/ta_virustotal_app/aob_py3/splunklib/searchcommands/search_command.py", line 855, in _execute
ERROR ChunkedExternProcessor [1549763 ChunkedExternProcessorStderrLogger] - stderr: self._execute_v2(ifile, process)
ERROR ChunkedExternProcessor [1549763 ChunkedExternProcessorStderrLogger] - stderr: File "/opt/splunk/etc/apps/TA-virustotal-app/bin/ta_virustotal_app/aob_py3/splunklib/searchcommands/search_command.py", line 948, in _execute_v2
ERROR ChunkedExternProcessor [1549763 ChunkedExternProcessorStderrLogger] - stderr: result = self._read_chunk(istream)
ERROR ChunkedExternProcessor [1549763 ChunkedExternProcessorStderrLogger] - stderr: File "/opt/splunk/etc/apps/TA-virustotal-app/bin/ta_virustotal_app/aob_py3/splunklib/searchcommands/search_command.py", line 912, in _read_chunk
ERROR ChunkedExternProcessor [1549763 ChunkedExternProcessorStderrLogger] - stderr: return metadata, six.ensure_str(body)
ERROR ChunkedExternProcessor [1549763 ChunkedExternProcessorStderrLogger] - stderr: File "/opt/splunk/etc/apps/TA-virustotal-app/bin/ta_virustotal_app/aob_py3/splunklib/six.py", line 917, in ensure_str
ERROR ChunkedExternProcessor [1549763 ChunkedExternProcessorStderrLogger] - stderr: s = s.decode(encoding, errors)
ERROR ChunkedExternProcessor [1549758 localCollectorThread] - EOF while attempting to read transport header read_size=0
ERROR ChunkedExternProcessor [1549758 localCollectorThread] - Error in 'vt4splunk' command: External search command exited unexpectedly with non-zero error code 1.
ERROR LocalCollector [1549758 localCollectorThread] - SearchMessage orig_component=LocalCollector sid= message_key= message=Error in 'vt4splunk' command: External search command exited unexpectedly with non-zero error code 1.
@pabloperezj sorry for the delayed response. We were able to reproduce the issue using the botsv3 dataset. During our verification we also found that the issue occurs only for certain specific non-UTF-8 characters. We are validating the change suggested by @ashurack and will update the SDK accordingly.
We will let you know once a new SDK release with the change is available.
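For anyone trying to pin down which bytes in a sample trigger the failure, here is a minimal sketch that lists the undecodable positions in a byte string. It is not part of the SDK; the helper name `find_invalid_utf8` and the sample data are assumptions made for illustration only.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, byte value, reason) for each undecodable spot.

    Hypothetical helper for illustration; not part of splunklib.
    """
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # Strict decoding raises at the first invalid sequence.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as exc:
            # exc.start and exc.end are relative to the slice we decoded.
            offset = pos + exc.start
            problems.append((offset, data[offset], exc.reason))
            pos += exc.end
    return problems


# Made-up sample: valid text with two stray bytes (0x8e and 0xff) mixed in.
sample = b"ok," + b"\x8e" + "é,".encode("utf-8") + b"\xff"
problems = find_invalid_utf8(sample)
```

To check a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to the helper.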
@ashurack, @pabloperezj the fix is available in the latest Python SDK release, v1.7.4. Please pull the latest SDK release, and re-open the issue if the problem persists. Thanks!
Describe the bug Custom search commands raise an exception when non-UTF-8 event data is present in the search pipeline.
To Reproduce
Expected behavior splunk-sdk-python (and all other potentially impacted SDKs) should handle encoding/decoding in the same manner as Splunk Core.
Logs or Screenshots
Not working broken_search.log
After patching six.py
Splunk (please complete the following information):
SDK (please complete the following information):
Additional context My patch, to get my command working ASAP, was to change errors='strict' to errors='replace' here. I chose replace since it mimics the functionality of Splunk. I didn't touch any other instances of errors='strict' and only tested this against StreamingCommand. This bug is not limited to the inputlookup command, but that is the easiest way to reproduce it.
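The effect of that change can be sketched outside Splunk with plain `bytes.decode`, which is the call at the bottom of the traceback. This is only an illustration of the two error handlers, not the SDK's code: `decode_body` is a hypothetical stand-in and the sample bytes are made up, using the 0x8e byte reported in search.log.

```python
def decode_body(body: bytes, errors: str = "strict") -> str:
    """Decode a chunk body as UTF-8 with the given error handler.

    Hypothetical stand-in for illustration; not the SDK function.
    """
    return body.decode("utf-8", errors)


# 0x8e cannot start a UTF-8 sequence, so strict decoding fails on it.
body = b"field=caf\x8e,ok"

try:
    decode_body(body)
    strict_failed = False
    strict_reason = None
except UnicodeDecodeError as exc:
    strict_failed = True
    strict_reason = exc.reason

# With errors="replace" the bad byte becomes U+FFFD and decoding continues.
replaced = decode_body(body, errors="replace")
```

The trade-off is that `replace` is lossy: the original byte is not recoverable from the decoded string, so this suits display and search use but not round-tripping raw data.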