This MR implements batch mode for the MISP feed output bot to resolve slow performance when the queue holds many events. The existing code is prone to performance issues:
The following code runs for every event in the queue, making the bot increasingly slow as events arrive and the feed grows:
```python
feed_output = self.current_event.to_feed(with_meta=False)
with self.current_file.open('w') as f:  # file opened for every event
    json.dump(feed_output, f)
feed_meta_generator(self.output_dir)  # metadata regenerated for every event
```
Motivation
We are trying to create feeds based on AlienVault OTX pulses, which include thousands of IOCs per day. This is practically impossible with the current performance of the MISP feed output bot.
Fix
With this MR, batched feed creation is introduced. The user can now configure the batch size via the batch_size parameter. The batch functionality is based on the bot's internal queue.
Benchmark
On an average server, creating a feed of 8k events previously took several hours; with this change it takes less than 5 minutes (depending on the available resources).