csirtgadgets / bearded-avenger

CIF v3 -- the fastest way to consume threat intelligence
https://csirtgadgets.com/collective-intelligence-framework
Mozilla Public License 2.0

cif-router at 100% on one core -- remains after restart - cif client delays. #263

Closed ventz closed 7 years ago

ventz commented 7 years ago

After installing CIFv3 (release: 3.0.0a16.tar.gz), I noticed that the CPU usage on one of the cores is 100% all the time (from the cif-router processes) and so far it has been for > 24 hours.

 1542 cif       20   0  232456  54220   7668 R 100.0  0.3 723:11.87 cif-router
 1292 cif       20   0  563632  42904  10104 S   0.3  0.3   0:09.22 cif-httpd
 1546 cif       20   0  346896  32972   6560 S   0.3  0.2   4:01.03 cif-router
    1 root      20   0  185488   6084   3948 S   0.0  0.0   0:02.18 systemd

Yesterday there were multiple cif-router processes; one was at 100% while the rest were below 1%. This persists even after stopping and restarting the services, and even after rebooting the system.
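
To pick the pegged process out of a top snapshot without eyeballing it, something like the following could work. This is only a sketch: it assumes the default top column layout seen in the output above (PID first, %CPU in the ninth column, COMMAND last), and the `hot_processes` name and the 90% cutoff are made up for illustration, not part of CIF.

```python
def hot_processes(top_text, min_cpu=90.0):
    """Return (pid, command, cpu) for rows at or above min_cpu percent.

    Assumes default top columns:
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    """
    hits = []
    for line in top_text.splitlines():
        fields = line.strip().split(None, 11)
        if len(fields) < 12:
            continue  # header, blank, or truncated row
        try:
            pid, cpu = int(fields[0]), float(fields[8])
        except ValueError:
            continue  # not a process row
        if cpu >= min_cpu:
            hits.append((pid, fields[11], cpu))
    return hits


# Rows copied from the top output in this report.
SAMPLE_TOP = """\
 1542 cif       20   0  232456  54220   7668 R 100.0  0.3 723:11.87 cif-router
 1292 cif       20   0  563632  42904  10104 S   0.3  0.3   0:09.22 cif-httpd
 1546 cif       20   0  346896  32972   6560 S   0.3  0.2   4:01.03 cif-router
    1 root      20   0  185488   6084   3948 S   0.0  0.0   0:02.18 systemd
"""

print(hot_processes(SAMPLE_TOP))
```

Feeding it periodic snapshots would show whether it is always the same cif-router PID that sits at 100%.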

For the stop+start, I am using this order (not sure if it matters?):

# service cif-httpd stop
# service cif-router stop
# service csirtg-smrt stop

# service cif-router start
# service cif-httpd start
# service csirtg-smrt start

After the stop and start, it's fine for a short while, and then the same thing happens.

2017-02-24 16:42:24,676 - INFO - cifsdk.client.client[119][MainThread] - running ping
roundtrip: 0.0105819702148 ms
roundtrip: 0.0045919418335 ms
roundtrip: 0.00449109077454 ms
roundtrip: 0.00470805168152 ms

After less than a minute, same thing:

4274 cif       20   0  227836  50136   7560 R  97.7  0.3   0:15.41 cif-router  

and response times start to hang again:

2017-02-24 16:44:19,802 - INFO - cifsdk.client.client[119][MainThread] - running ping
roundtrip: 13.75141716 ms
...hangs...

By hangs/delays, I mean this:

$ cif -p

2017-02-24 16:30:21,358 - INFO - cifsdk.client.client[119][MainThread] - running ping
...hangs...
roundtrip: 66.7338149548 ms
...hangs...
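
To put numbers on the delays, the roundtrip lines printed by `cif -p` can be pulled out of captured output and compared against a cutoff. A minimal sketch, assuming the output format shown above; the function names and the 1 ms threshold are arbitrary choices for illustration, not anything defined by cifsdk.

```python
import re

# Matches lines such as "roundtrip: 13.75141716 ms"
ROUNDTRIP_RE = re.compile(r"roundtrip:\s*([0-9.]+)\s*ms")


def parse_roundtrips(text):
    """Return every roundtrip value (in ms) found in the captured output."""
    return [float(m.group(1)) for m in ROUNDTRIP_RE.finditer(text)]


def slow_pings(text, threshold_ms=1.0):
    """Return the roundtrips above threshold_ms (threshold is an assumption)."""
    return [v for v in parse_roundtrips(text) if v > threshold_ms]


# Values copied from the ping output in this report.
SAMPLE_PING = """\
roundtrip: 0.0105819702148 ms
roundtrip: 0.0045919418335 ms
roundtrip: 13.75141716 ms
...hangs...
roundtrip: 66.7338149548 ms
"""

print(slow_pings(SAMPLE_PING))
```

Note that this only sees pings that eventually return; a ping that hangs forever produces no roundtrip line at all.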

Not sure what's causing this.

It's running on an AWS VM.

Possibly related: yesterday I was running tcpdump and saw massive amounts of DNS traffic, which I assume comes from the list ingestion. At this point I am not seeing any new DNS traffic, but the process is still at 100%.

One last bit: I noticed this before I killed the process the last time:

cif       1272  0.0  0.2  94212 37008 ?        Ss   04:17   0:00 /usr/bin/python /usr/local/bin/csirtg-smrt --remember --service --client cif --service --fireball --delay 2
cif       1753  0.5  1.0 235328 165020 ?       S    04:19   3:45 /usr/bin/python /usr/local/bin/csirtg-smrt --remember --service --client cif --service --fireball --delay 2
cif       3917  0.0  0.0      0     0 ?        Z    16:19   0:00 [csirtg-smrt] <defunct>

Not sure whether there should be multiple csirtg-smrt processes, with one of them defunct?
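
For keeping an eye on defunct children across restarts, the zombie rows can be filtered out of `ps aux` output. This is a rough sketch, assuming the standard `ps aux` columns (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND); `defunct_processes` is a hypothetical helper name, not part of csirtg-smrt.

```python
def defunct_processes(ps_text):
    """Return (pid, command) for rows whose STAT column starts with 'Z'."""
    zombies = []
    for line in ps_text.splitlines():
        fields = line.strip().split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header or malformed row
        if fields[7].startswith("Z"):
            zombies.append((int(fields[1]), fields[10]))
    return zombies


# Rows copied from the ps output in this report.
SAMPLE_PS = """\
cif       1272  0.0  0.2  94212 37008 ?        Ss   04:17   0:00 /usr/bin/python /usr/local/bin/csirtg-smrt --remember --service --client cif --service --fireball --delay 2
cif       1753  0.5  1.0 235328 165020 ?       S    04:19   3:45 /usr/bin/python /usr/local/bin/csirtg-smrt --remember --service --client cif --service --fireball --delay 2
cif       3917  0.0  0.0      0     0 ?        Z    16:19   0:00 [csirtg-smrt] <defunct>
"""

print(defunct_processes(SAMPLE_PS))
```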

Thanks.

wesyoung commented 7 years ago

did you see any of these errors:

https://github.com/csirtgadgets/csirtg-smrt-py/issues/157

?

$ sudo journalctl -u csirtg-smrt.service

(i think?)

ventz commented 7 years ago

Yea just replied there with my errors.

wesyoung commented 7 years ago

try this:

$ sudo pip install csirtg-smrt==0.0.0a29 csirtg-indicator==0.0.0b7
$ sudo systemctl restart csirtg-smrt.service

see if that doesn't clean a few things up (not saying that will fix cif-router, but may weed through a few things for us).

ventz commented 7 years ago

Weirdly enough, with these export flags set in cif-router.service:

Environment=CIF_HUNTER_TRACE=1
Environment=CIF_STORE_TRACE=1

the problem does not seem to happen: no crash, no delay. (This makes no sense.)

After removing them, reloading the module, stopping and starting, it happened again within 2 minutes:

 2014 cif       20   0  227284  47820   7624 R 100.3  0.3   0:08.96 cif-router
 2015 cif       20   0  249692  40496  16500 S   5.6  0.2   0:01.44 cif-router
 2017 cif       20   0  346876  30528   6448 S   4.7  0.2   0:00.93 cif-router
 2016 cif       20   0  247980  39680  17200 S   4.3  0.2   0:01.31 cif-router

I'll try the pip install and let you know.

ventz commented 7 years ago

Ok - so did the pip install and restart. (You are right, it did not fix the cif-router.)

Then did a stop of everything and started it again.

Worked for a bit again, and after a while same thing:

 2628 cif       20   0  230568  51092   7624 R  96.7  0.3   0:24.83 cif-router
 2621 cif       20   0  194516  32096  10128 S   0.3  0.2   0:00.77 cif-router
 2683 cif       20   0  414628  36972  10196 S   0.3  0.2   0:00.38 cif-httpd

Logs:

Feb 24 20:09:47 cif cif-httpd[2683]: 2017-02-24 20:09:47,379 - INFO - werkzeug[87][Thread-44] - #033[32m127.0.0.1 - - [24/Feb/2017 20:09:47] "GET /ping?write=1 HTTP/1.1" 200 -#033[0m
Feb 24 20:09:48 cif csirtg-smrt[2720]: 2017-02-24 20:09:48,256 - INFO - csirtg_smrt.archiver[105] - #033[32mCaching archived indicators for provider alexa.com#033[0m
Feb 24 20:09:48 cif csirtg-smrt[2720]: 2017-02-24 20:09:48,274 - INFO - csirtg_smrt.archiver[115] - #033[32mCached provider alexa.com in memory, 0 objects#033[0m
Feb 24 20:09:48 cif cif-httpd[2683]: 2017-02-24 20:09:48,332 - DEBUG - cif-httpd[64][Thread-46] - #033[35mcontent-length: 34994#033[0m
Feb 24 20:09:48 cif cif-httpd[2683]: 2017-02-24 20:09:48,333 - INFO - cif-httpd[66][Thread-46] - #033[32mfireball mode#033[0m
Feb 24 20:09:49 cif cif-router[2621]: 2017-02-24 20:09:49,055 - INFO - cif.router[164][MainThread] - #033[32mprocessing 0.81 msgs per 123.9 sec#033[0m
Feb 24 20:09:50 cif cif-router[2621]: 2017-02-24 20:09:50,427 - INFO - cif.router[164][MainThread] - #033[32mprocessing 72.93 msgs per 1.37 sec#033[0m
Feb 24 20:09:51 cif cif-router[2621]: 2017-02-24 20:09:51,576 - INFO - cif.router[164][MainThread] - #033[32mprocessing 87.01 msgs per 1.15 sec#033[0m
Feb 24 20:09:52 cif cif-router[2621]: 2017-02-24 20:09:52,387 - INFO - cif.router[164][MainThread] - #033[32mprocessing 123.44 msgs per 0.81 sec#033[0m
Feb 24 20:09:53 cif cif-router[2621]: 2017-02-24 20:09:53,841 - INFO - cif.router[164][MainThread] - #033[32mprocessing 68.76 msgs per 1.45 sec#033[0m
Feb 24 20:09:54 cif cif-router[2621]: 2017-02-24 20:09:54,448 - INFO - cif.router[164][MainThread] - #033[32mprocessing 164.83 msgs per 0.61 sec#033[0m
Feb 24 20:09:54 cif cif-router[2621]: 2017-02-24 20:09:54,945 - INFO - cif.router[164][MainThread] - #033[32mprocessing 201.45 msgs per 0.5 sec#033[0m
Feb 24 20:09:56 cif cif-router[2621]: 2017-02-24 20:09:56,698 - INFO - cif.router[164][MainThread] - #033[32mprocessing 57.05 msgs per 1.75 sec#033[0m
Feb 24 20:09:59 cif cif-router[2621]: 2017-02-24 20:09:59,498 - INFO - cif.router[164][MainThread] - #033[32mprocessing 35.72 msgs per 2.8 sec#033[0m
Feb 24 20:10:00 cif cif-router[2621]: 2017-02-24 20:10:00,476 - INFO - cif.router[164][MainThread] - #033[32mprocessing 102.29 msgs per 0.98 sec#033[0m
Feb 24 20:10:07 cif cif-httpd[2683]: 2017-02-24 20:10:07,129 - INFO - werkzeug[87][Thread-46] - #033[32m127.0.0.1 - - [24/Feb/2017 20:10:07] "POST /indicators HTTP/1.1" 201 -#033[0m
Feb 24 20:10:07 cif named[1267]: validating osint.bambenekconsulting.com/A: no valid signature found
Feb 24 20:10:07 cif named[1267]: validating osint.bambenekconsulting.com/AAAA: no valid signature found
Feb 24 20:10:07 cif csirtg-smrt[2720]: 2017-02-24 20:10:07,343 - INFO - csirtg_smrt.archiver[105] - #033[32mCaching archived indicators for provider osint.bambenekconsulting.com#033[0m
Feb 24 20:10:07 cif csirtg-smrt[2720]: 2017-02-24 20:10:07,345 - INFO - csirtg_smrt.archiver[115] - #033[32mCached provider osint.bambenekconsulting.com in memory, 0 objects#033[0m
Feb 24 20:10:07 cif cif-httpd[2683]: 2017-02-24 20:10:07,368 - DEBUG - cif-httpd[64][Thread-48] - #033[35mcontent-length: 24839#033[0m
Feb 24 20:10:07 cif cif-httpd[2683]: 2017-02-24 20:10:07,369 - INFO - cif-httpd[66][Thread-48] - #033[32mfireball mode#033[0m
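
The router's "processing ... msgs per ... sec" lines are the easiest thing in these logs to track over time. A small sketch for extracting them from journal output follows; it assumes only the message format visible in the lines above and makes no claim about how cif-router computes the two numbers. `router_samples` and `slowest_interval` are names invented here.

```python
import re

# Matches e.g. "processing 72.93 msgs per 1.37 sec", with or without
# the surrounding #033[...m colour codes.
RATE_RE = re.compile(r"processing ([0-9.]+) msgs per ([0-9.]+) sec")


def router_samples(log_text):
    """Return (msgs, seconds) pairs in the order they appear in the log."""
    return [(float(m.group(1)), float(m.group(2)))
            for m in RATE_RE.finditer(log_text)]


def slowest_interval(log_text):
    """Return the sample with the longest interval, or None if there are none."""
    samples = router_samples(log_text)
    return max(samples, key=lambda s: s[1]) if samples else None


# Three lines copied from the journal output in this report.
SAMPLE_LOG = """\
Feb 24 20:09:49 cif cif-router[2621]: 2017-02-24 20:09:49,055 - INFO - cif.router[164][MainThread] - #033[32mprocessing 0.81 msgs per 123.9 sec#033[0m
Feb 24 20:09:50 cif cif-router[2621]: 2017-02-24 20:09:50,427 - INFO - cif.router[164][MainThread] - #033[32mprocessing 72.93 msgs per 1.37 sec#033[0m
Feb 24 20:09:59 cif cif-router[2621]: 2017-02-24 20:09:59,498 - INFO - cif.router[164][MainThread] - #033[32mprocessing 35.72 msgs per 2.8 sec#033[0m
"""

print(slowest_interval(SAMPLE_LOG))
```

In the excerpt above, the 123.9 sec sample stands out against the rest, which all sit between 0.5 and 2.8 sec.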

wesyoung commented 7 years ago

released a17, should solve most of these issues. gonna close this for now, re-open new issues as they come up... (ty for the feedback!)