intentionet / netconan

netconan - a Network Configuration Anonymizer
Apache License 2.0
145 stars 12 forks source link

Add option to remove "comments" from configs #134 #137

Closed tandreadi closed 3 years ago

tandreadi commented 4 years ago

Implementing this by using a list of keywords given by the user, separated by commas, and removing any line that contains any of these words. The removal is preserved if the line contains any of the words defined in default_reserved_words.py file.

Here is an example:

Running the following command:

netconan -i cisco.cfg -o result.cfg -k neighbor,keyword

this warning will appear:

WARNING Specified keywords overlap with reserved words. The following reserved words will be preserved: {'neighbor-down', 'neighbor-advertisement', 'log-neighbor-warnings', 'auto-shutdown-new-neighbors', 'log-neighbor-changes', 'neighbor-filter', 'no-neighbor-learn', 'neighbor-group', 'neighbor-solicit', 'neighbor', 'remote-neighbors', 'neighbor-discovery', 'no-neighbor-down-notification'}

The cisco.cfg file is:

initial_file

And the result.cfg file will be:

result

Could you give me some feedback about my work so far? @dhalperi


This change is Reviewable

sfraint commented 4 years ago

Hi @tandreadi thank you for your contribution! This has been a busy week, but I will try to make some time to review this next week.

tandreadi commented 4 years ago

Hey @sfraint! Is there any update?

tandreadi commented 4 years ago

netconan/sensitive_item_removal.py, line 216 at r1 (raw file):

Previously, sfraint (Spencer Fraint) wrote…
> ``` > for w in words: > if w in self.reserved_words: > return line > ``` I think this will ultimately need to be removed. Right now, the default set of `reserved` words is made up of keywords supported in various network config syntaxes (e.g. contains `set`, `interface`, `description`, `ip`, `acl`, `neighbor`...) This means a user will not be able to remove lines like those posed in the [original issue](https://github.com/intentionet/netconan/issues/134), e.g. ``` set interfaces description "My sensitive text here" ``` since this line contains multiple `reserved` keywords

I was also concerned about this and I had in mind that the current implementation might not meet the issue's requirements. So, if the user types the option "remove-lines", each line that consists of the given words will be removed, whether or not, they belong to the reserved words. Did I understand it correctly?

tandreadi commented 4 years ago

netconan/sensitive_item_removal.py, line 204 at r1 (raw file):

Previously, sfraint (Spencer Fraint) wrote…
> ``` > def _generate_conflicting_reserved_word_list(self, keywords): > """Return a set of keywords that may conflict with the specified default reserved words.""" > conflicting_words = set() > for keyword in keywords: > conflicting_words.update(set([w for w in self.reserved_words if keyword in w])) > if conflicting_words: > logging.warning('Specified keywords overlap with reserved words. ' > 'The following reserved words will be preserved: %s', conflicting_words) > return conflicting_words > ``` This function looks similar to `_generate_conflicting_reserved_word_list` in another anonymizer; it would be good to just create a private helper function available to all anonymizers.

After the discussion about the meaning of the reserved words in this issue, I was thinking that this method in LineRemover class might be unnecessary.

However, there is the case where the user selects both options --reserved-words and --remove-lines and the given lists might have words in common.

Do you think that we have to consider this case and modify properly this method? Meaning that each line that consists of the given reserved words won't be removed.

tandreadi commented 4 years ago

netconan/sensitive_item_removal.py, line 216 at r1 (raw file):

Previously, sfraint (Spencer Fraint) wrote…
Correct

Also, I was thinking that removing this block might arise a problem in case the user gives a list of reserved words and a list of keywords. A line might contain words from both of the lists.

Assuming that in this case, we would like to prevent the removal of this line, the meaning of the instance variable reserved_words could be re-defined and refer only to the given reserved words. What's your opinion on this assumption?

dhalperi commented 4 years ago

Please reopen if you wish to continue this PR.

tandreadi commented 3 years ago

Hello @dhalperi, I'm sorry for the delay. I would like to reopen this PR but the reopen option is not available. Could you please reopen it?

dhalperi commented 3 years ago

Hey folks! Looks like this has gone idle again.

If you would like to resume, please push a commit responding to the review, and then comment here. Thanks!