Catsofsuffering opened 6 months ago
@Catsofsuffering Can you please provide the steps to reproduce this issue?
Due to our company's data security policy, I am unable to directly provide screenshots or logs to you. However, I can briefly describe the background and cause of this bug:
As mentioned earlier, I developed an app that matches IOC threat intelligence. It sends a large amount of URL data to the corresponding API and then imports the returned results into Splunk. During this process, the error `Error: field larger than field limit (131072)` occurred, apparently because the returned data (close to 1 billion events) exceeded the field size limit of `csv.reader`.
Although only about 40,000 records remain after filtering on average, this issue still occurs. I'm not sure whether this is a bug or an area that needs optimization, so I would like to consult your team. Thank you.
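The error itself is easy to reproduce outside Splunk; this is a minimal sketch with made-up data (not the real payload), showing that any single CSV field longer than the default 131072-character limit makes `csv.reader` raise the same error seen in `search_command.py`:

```python
import csv
import io

# Hypothetical oversized field: one value longer than the
# csv module's default field size limit of 131072 characters.
big_field = "x" * 200_000
buffer = io.StringIO(big_field + ",other\n")

try:
    for row in csv.reader(buffer):
        pass
except csv.Error as e:
    print(e)  # field larger than field limit (131072)
```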
The bug I found and how to repair it
When developing a threat hunting application, I encountered a bug at line 948 of `splunklib\searchcommands\search_command.py`. The relevant code snippet is as follows:

The bug arises from the use of the Python `csv` package and its `reader` function, which leads to an error when processing large amounts of data:

After a Google search, I found a solution on Stack Overflow (https://stackoverflow.com/questions/15063936/csv-error-field-larger-than-field-limit-131072). Implementing the following code snippet resolved the issue:
Splunk:
SDK:
Additional context
Are there any risks or issues associated with my approach?