mlsecproject / combine

Tool to gather Threat Intelligence indicators from publicly available sources
https://www.mlsecproject.org/
GNU General Public License v3.0

Memory error when running with enrichment? #146

Open juju4 opened 9 years ago

juju4 commented 9 years ago

While testing combine (pristine checkout) in a Vagrant box, running with enrichment ends in a MemoryError:

$ ./combine.py -e
[...]
2015-05-16 22:48:05,051 - combine.winnower - ERROR - Could not determine address type for ckaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa listed as None
2015-05-16 22:48:05,065 - combine.winnower - ERROR - Could not determine address type for vendor.almsyar.com:8080 listed as FQDN
2015-05-16 22:48:05,080 - combine.winnower - INFO - Dumping results
Traceback (most recent call last):
  File "./combine.py", line 44, in <module>
    winnow('crop.json', 'crop.json', 'enrich.json')
  File "/home/vagrant/combine/winnower.py", line 203, in winnow
    e_data = json.dumps(enriched, indent=2, ensure_ascii=False).encode('utf8')
  File "/usr/lib/python2.7/json/__init__.py", line 250, in dumps
    sort_keys=sort_keys, **kw).encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 210, in encode
    return ''.join(chunks)
MemoryError
$ free -m
             total       used       free     shared    buffers     cached
Mem:          2001        217       1784          1          1         77
-/+ buffers/cache:        139       1862
Swap:            0          0          0

alexcpsec commented 9 years ago

Hmm, that is a first. Do you mind sharing some details on the vagrant box you stood up (or even the config) so we could try replicating this?

juju4 commented 9 years ago

No problem, it's a pretty basic one:

# -*- mode: ruby -*-
# vi: set ft=ruby :

VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.box = "ubuntu/trusty64"

  config.vm.provider "virtualbox" do |v|
    v.memory = 2048
  end
  config.vm.network "public_network", bridge: 'eth0'
  config.vm.network "forwarded_port", guest: 80, host: 9880

  config.vm.synced_folder "/path/1", "/p"
end

After setting up a virtualenv for the requirements, running combine without enrichment is OK. With enrichment, it is also OK once I switch the box to 4GB.

Maybe add a warning?
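
To illustrate what such a warning could look like, here is a minimal sketch. It is not part of combine: the function names and the 3 GiB threshold are hypothetical (chosen only because the 2048 MB box above failed and the 4GB box worked), and `os.sysconf` with these keys is assumed to be available, which holds on Linux.

```python
import os

GIB = 1024 ** 3


def total_memory_bytes():
    """Return physical memory in bytes (assumes a Linux-like os.sysconf)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")


def enrichment_memory_warning(total_bytes, recommended_bytes=3 * GIB):
    """Return a warning string if memory looks too small, else None.

    The 3 GiB default is a hypothetical threshold, not a measured requirement.
    """
    if total_bytes < recommended_bytes:
        return (
            "Only %d MiB of RAM detected; enrichment may fail with a "
            "MemoryError" % (total_bytes // (1024 ** 2))
        )
    return None


if __name__ == "__main__":
    message = enrichment_memory_warning(total_memory_bytes())
    if message:
        print("WARNING: " + message)
```

A check like this could run before the enrichment step when `-e` is passed, so the user sees the warning before the job starts rather than a traceback at the end.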

krmaxwell commented 9 years ago

I have noticed this as well, and it's an artifact of how we store everything in memory before writing to disk at the end of the job.
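
One way to avoid holding the whole serialized output in memory is to write records to disk one at a time instead of building a single string with `json.dumps`. The sketch below is an illustration under assumptions, not combine's actual code: it assumes the enriched data is an iterable of JSON-serializable records, and `dump_records_streaming` is a hypothetical helper name. It is written for Python 3, whereas the traceback above comes from Python 2.7.

```python
import json


def dump_records_streaming(records, path):
    """Write an iterable of records to `path` as a JSON array.

    Each record is serialized and written on its own, so only one
    record's JSON text is held in memory at a time.
    """
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("[")
        first = True
        for record in records:
            # Separate records with commas; no comma before the first one.
            fh.write("\n" if first else ",\n")
            fh.write(json.dumps(record, ensure_ascii=False))
            first = False
        fh.write("\n]\n")


if __name__ == "__main__":
    sample = [
        {"indicator": "vendor.almsyar.com", "type": "FQDN"},
        {"indicator": "192.0.2.1", "type": "IPv4"},
    ]
    dump_records_streaming(sample, "enrich-sample.json")
```

This only removes the peak caused by the final `''.join(chunks)` seen in the traceback. If the records themselves are all collected in a list first, that list still has to fit in memory, so a fuller fix would also mean producing records lazily, for example from a generator.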