BUG: shadowserver parser and harmonisation are too complex

certtools / intelmq

IntelMQ is a solution for IT security teams for collecting and processing security feeds using a message queuing protocol.

https://docs.intelmq.org/latest/

GNU Affero General Public License v3.0

973 stars 295 forks source link

BUG: shadowserver parser and harmonisation are too complex #591

Open aaronkaplan opened 8 years ago

aaronkaplan commented 8 years ago

subject says it all. This is not what we intended to have. Re-do in version 2.

ghost commented 7 years ago

Can you explain? The current solution allows easy upgrading and deployment. Changes at shadowserver can be easily addressed by a change in our code and deployed at the installations, that's quiet easy for easy for users I think.

aaronkaplan commented 4 years ago

yes, I can explain.

So I looked at this code some years ago, and back then it was definitely too complex. Looked at it again and I still think it's too complex. I got better for sure (when compared to the first versions) but I still think it has too much code in one place to deal with the complexity of parsing shadowserver data.

Here is an experiment you can do: I think the best to experience this is to try to teach intelmq and to explain to someone else what this particular parser does. And see their eyes get glazed... If they did not get it in the first 5 mins, it's too complex. You can counter that by breaking up the functionality into clear , easy and simple code blocks and write a small README.md file with a picture of how the code gets connected together. For example.

Assume an average beginning python coder. Even though python allows for really cools stuff (metaprogramming etc) I still believe IntelMQ needs to be totally readable for a non-daily python coder. That's hard to see as a daily python coder....

aaronkaplan commented 4 years ago

or... we can have a good README.md on how to use the SS parsers. With examples. In that case, people don't care how complex the code is, as long as they a) can use it and understood the examples and b) the code does not throw exceptions and behaves as they would expect it to.

That's the other option I see to counter the complexity.

ghost commented 4 years ago

or... we can have a good README.md on how to use the SS parsers.

The usage documentation can be found here: https://github.com/certtools/intelmq/blob/develop/docs/Bots.md#shadowserver

ghost commented 3 years ago

Hi,

given that I have just implemented #1700 and had base that upon the existing Shadowserver parser, I have to agree that there is potential for improvement regarding the readability and the structure of the code. Especially now, that there are two parsers that in parts use the same functionality.

It might be a good idea to move some of the (now duplicate) code from the parsers to the config.py file (i.e. everythin from line L116 to L200 in https://github.com/certtools/intelmq/blob/8e0142cc876c8298a0cb113c3b92b4c1671f8893/intelmq/bots/parsers/shadowserver/parser.py).

I would change the config.py to be a real Python class Config and have the conversion functionality be part of that class. Depending on how much we want to change, it might also make sense to rename it to Format or Converter, because its not really a configuration;