hammackj / risu

Risu is a Nessus parser that converts generated reports into an ActiveRecord database, allowing for easy report generation and vulnerability verification.
http://hammackj.github.io/risu
MIT License

VERY slow load speed #70

Closed: ebdavison closed this 9 years ago

ebdavison commented 10 years ago

I have been working with the updated version of risu lately and found that the speed is VERY slow. On a file that used to take 40 minutes to load (which seemed like a long time), the load is now taking over 3 hours.

My current file being loaded is much bigger than the one just mentioned, and I expect it will make me wait overnight until the load is complete. This puts a big damper on my workflow, as I use this regularly during my scan process to turn the results around to the customer and execs for feedback and remediation.

What can I do to speed the load? What do you need from me to help diagnose the load time?

ebdavison commented 10 years ago

Sorry, here is my version info: risu 1.6.3, Ruby 1.9.3, RubyGems 2.2.2. I am using Kali Linux as my OS, with MySQL as the database.

hammackj commented 10 years ago

What kind of database?


ebdavison commented 10 years ago

mysql Ver 14.14 Distrib 5.5.35, for debian-linux-gnu (i686) using readline 6.2

ebdavison commented 10 years ago

Seems to be averaging 1.5 minutes/host or about 200 items per minute with the rough figures I have been checking.

hammackj commented 10 years ago

How large is the .nessus XML? Is the database on the same host as risu?


ebdavison commented 10 years ago

The .nessus file is 257MB. The MySQL DB is local. Would SQLite be any faster? I normally use the DB for other work, but if SQLite is faster I might try it out. I started the load at 12:32 and it is now 4:26, and the load is still processing. I think there are 800+ hosts, and the DB has 426 hosts right now. Is there any way to get progress as the file is processed? Debug does not seem to provide that type of status update.
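
In the meantime, a rough progress readout can be had from outside risu: count the ReportHost elements in the file once, then poll the hosts table while the load runs. A minimal sketch, assuming the nokogiri and mysql2 gems and risu's hosts table; the connection details are placeholders:

```ruby
#!/usr/bin/env ruby
# Rough progress readout for a long risu load (a sketch, not part of risu).
require 'nokogiri'
require 'mysql2'

nessus_file = ARGV.fetch(0)

# Count the hosts in the .nessus file once; a streaming reader avoids
# pulling a 257MB document into memory.
total = 0
Nokogiri::XML::Reader(File.open(nessus_file)).each do |node|
  total += 1 if node.name == 'ReportHost' &&
                node.node_type == Nokogiri::XML::Reader::TYPE_ELEMENT
end

# Poll the hosts table risu fills in as it parses (placeholder credentials).
db = Mysql2::Client.new(host: 'localhost', username: 'risu',
                        password: 'secret', database: 'risu')
loop do
  done = db.query('SELECT COUNT(*) AS c FROM hosts').first['c']
  puts "#{done}/#{total} hosts loaded"
  break if done >= total
  sleep 60
end
```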

hammackj commented 10 years ago

One thing you might try is to disable the indexes that Risu creates during the --create-table process. This will slow down report generation later unless you add them back. Let me know how that works for you.

Not much else I can think of to speed the process up; it is processing a lot of data. One thing on my todo list is to import the database from Nessus directly, but I haven't had any time to spend on that yet.
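
For reference, the drop/re-add dance can also be scripted through ActiveRecord rather than raw SQL. A sketch under the assumption that the indexes are the ActiveRecord defaults on items(host_id), items(plugin_id), and references(plugin_id); the connection details are placeholders, and index names should be verified with SHOW INDEX first:

```ruby
# Sketch: drop risu's indexes before a big load and re-add them afterwards.
# Index and table names are assumed ActiveRecord defaults; verify with
# `SHOW INDEX FROM items;` before relying on them.
require 'active_record'

ActiveRecord::Base.establish_connection(
  adapter:  'mysql2',
  database: 'risu',      # placeholder connection details
  username: 'risu',
  password: 'secret'
)
conn = ActiveRecord::Base.connection

# Assumed indexes: items(host_id), items(plugin_id), references(plugin_id).
INDEXES = { items: [:host_id, :plugin_id], references: [:plugin_id] }

# Before the load:
INDEXES.each { |table, cols| cols.each { |c| conn.remove_index(table, c) } }

# ... run the risu parse here ...

# After the load, rebuild them so report generation stays fast:
INDEXES.each { |table, cols| cols.each { |c| conn.add_index(table, c) } }
```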

ebdavison commented 10 years ago

The total run took 26803 seconds to load 883 hosts. I will try it without the indexes next time.

Another idea that might speed things up is to not load all of the plugin data for every item. Does the load cache the plugin data somewhere, so as not to have to check the DB each time a new finding item is parsed?

ebdavison commented 10 years ago

Where are the table and index creation statements?
Is there a way to dump the SQL for the table creations so I can edit it prior to run?

hammackj commented 10 years ago

DROP INDEX `index_items_on_host_id` ON items;
DROP INDEX `index_items_on_plugin_id` ON items;
DROP INDEX `index_references_on_plugin_id` ON `references`;

That should drop those indexes; the names might be different, so you might have to double-check.

As for the parsing, everything is handled in the order it appears in the .nessus file, so it will create or look up each plugin_id during parsing to keep the references.
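
In other words, every finding costs at least one round trip to the plugins table. Roughly this pattern (a sketch of the flow described, with assumed model and attribute names, not Risu's actual code):

```ruby
# Sketch of the per-finding flow described above (not Risu's actual code;
# model and attribute names are assumed). Each ReportItem costs at least
# one SELECT against plugins, plus the INSERT for the item itself.
def load_items(host, report_items)
  report_items.each do |ri|
    plugin = Plugin.where(plugin_id: ri.plugin_id).first_or_create
    Item.create(host_id: host.id, plugin_id: plugin.id, severity: ri.severity)
  end
end
```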

I haven't tested against a .nessus file that large yet. How much RAM does your machine have allocated for MySQL?

ebdavison commented 10 years ago

The computer has a total of 8GB RAM, but it is not specifically dedicated to MySQL; MySQL should use what it needs.

I have dropped the indexes for a different data load I need to run into a different target database. This .nessus file is 35MB. I will let it run, check the speed compared to the big load, and see if the hosts/minute rate is any better without those indexes.

On the parsing: I was thinking that if you were to cache the plugin_ids in RAM, you could check the cache each time you hit one and save a trip to the DB to check for the plugin_id in the database. That would also help performance some, and maybe this could be an execution flag. In the long run I would like to see a separate database of plugins that could be updated from time to time, so multiple scans could share it as a repository rather than having to load it on every run.

Or maybe an option to skip the plugins entirely with the above in mind.
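
A cache like that would be small: Nessus ships tens of thousands of plugins, so a Hash keyed by plugin_id fits comfortably in RAM. A minimal sketch of the idea, with assumed model names (not part of risu):

```ruby
# Sketch of the in-memory plugin cache proposed above (not part of risu).
# Each distinct plugin_id hits the database at most once per run instead
# of once per finding.
class PluginCache
  def initialize
    @cache = {}
  end

  # Return the Plugin record for a Nessus plugin id, consulting the
  # database only the first time that id is seen.
  def fetch(nessus_plugin_id)
    @cache[nessus_plugin_id] ||=
      Plugin.where(plugin_id: nessus_plugin_id).first_or_create
  end
end

# Usage during parsing:
#   cache = PluginCache.new
#   plugin = cache.fetch(ri.plugin_id)
```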

ebdavison commented 10 years ago

Looking at this latest load, the file is 35MB with 215 hosts. The load without indexes took 4024 seconds, which is about 3 hosts/minute, compared to about 1.5 hosts/minute with indexes on the large file. I don't know if that scales up to the large file, but it still seems like a long time to load 215 hosts: 67 minutes.

Hmmm.....

ebdavison commented 10 years ago

Any thoughts?

hammackj commented 10 years ago

I can do some testing to see how long it takes to add the indexes after the fact. I am pretty sure it will take the same amount of time.
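
For what it's worth, the rebuild can be timed in isolation. A sketch using Ruby's Benchmark module, with the same assumed index columns as the SQL above and placeholder connection details:

```ruby
# Sketch: time how long MySQL takes to build risu's indexes after a load.
require 'active_record'
require 'benchmark'

ActiveRecord::Base.establish_connection(
  adapter: 'mysql2', database: 'risu',   # placeholder connection details
  username: 'risu', password: 'secret'
)
conn = ActiveRecord::Base.connection

seconds = Benchmark.realtime do
  conn.add_index :items, :host_id
  conn.add_index :items, :plugin_id
  conn.add_index :references, :plugin_id
end
puts format('index build took %.1fs', seconds)
```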

ebdavison commented 10 years ago

Ran another load. This one was 122MB, took 3.5 hours (12600 seconds, with indexes), and covered 797 hosts.

hammackj commented 9 years ago

Closing this for now. Added speed/performance updates for 1.7.1+ to my todo list.