blocklistproject / Lists

Primary Block Lists
The Unlicense

Implement whitelist automation #354

Closed fishcharlie closed 3 years ago

fishcharlie commented 3 years ago

In https://github.com/blocklistproject/Lists/issues/350#issuecomment-870478034 it came up that we should implement an automation to prevent domains that have been removed from reappearing on the list.

This issue is to track that work.

blocklistproject commented 3 years ago

Does this sound like a good way to implement this:

We create an automation that goes through each list, finds the commented-out domains, takes them out, and puts them in a delisted file per block list. The file structure would look like this: root\lists\delisted\abuse-dl.txt.

We then modify the list-creation automation to check the delisted files for removed domains. If a removed domain has been re-added, it will be removed from the main lists so that it doesn't make it onto the other lists.
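A minimal Python sketch of that idea, working on in-memory lines rather than real files. The function names are hypothetical, the `0.0.0.0 domain` entry format is taken from later in this thread, and Python is used only for illustration (the project's own tooling may be written in something else).

```python
from typing import Iterable, List, Optional, Set

ACTIVE_PREFIX = "0.0.0.0 "  # assumed entry format: "0.0.0.0 example.com"


def commented_domain(line: str) -> Optional[str]:
    """Return the domain if the line is a commented-out entry, else None."""
    stripped = line.strip()
    if not stripped.startswith("#"):
        return None
    body = stripped[1:].strip()
    if body.startswith(ACTIVE_PREFIX):
        return body[len(ACTIVE_PREFIX):].strip() or None
    return None  # an ordinary comment such as "# Title: Abuse"


def extract_delisted(lines: Iterable[str]) -> Set[str]:
    """Collect the domains that would be written to e.g. abuse-dl.txt."""
    return {d for d in map(commented_domain, lines) if d}


def drop_readded(lines: Iterable[str], delisted: Set[str]) -> List[str]:
    """Drop active entries whose domain appears in the delisted set."""
    kept = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith(ACTIVE_PREFIX):
            domain = stripped[len(ACTIVE_PREFIX):].strip()
            if domain in delisted:
                continue  # a removed domain was re-added: skip it
        kept.append(line)
    return kept
```

In a real run, `extract_delisted` would be fed the lines of one block list and its result written to that list's delisted file. This sketch leaves the commented-out lines in place rather than stripping them from the list.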

This may not be the best explanation but I have not had my caffeine yet.

fishcharlie commented 3 years ago

@blocklistproject I'm not sure that is necessary. I like to avoid creating more files if we don't have to. It just adds another layer of complexity that is normally unnecessary.

Here is what I was thinking:

  1. Modify the linter to ensure that all commented domains start with # 0.0.0.0. I think some currently start with #0.0.0.0. (This will be a PR after #370).
  2. Take every domain that starts with # 0.0.0.0, and add it to an array.
  3. For each domain that starts with 0.0.0.0, check to see if it exists within that array created in step 2.

If it does, we take ACTION.
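Steps 1 to 3 could look roughly like the Python sketch below. It is only an illustration under assumptions: the function name `find_readded` and the exact regular expressions are made up here, a set stands in for the "array" from step 2, and ACTION is left to the caller.

```python
import re
from typing import Iterable, List, Tuple

# Canonical commented entry: "# 0.0.0.0 example.com" (exactly one space after "#").
COMMENTED = re.compile(r"^# 0\.0\.0\.0 (\S+)$")
# Any commented entry, including sloppy forms such as "#0.0.0.0 example.com".
LOOSE_COMMENTED = re.compile(r"^#\s*0\.0\.0\.0\s+\S")
# Active entry: "0.0.0.0 example.com".
ACTIVE = re.compile(r"^0\.0\.0\.0 (\S+)$")


def find_readded(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (badly formatted comment lines, re-added domains) for one list."""
    lines = [line.rstrip("\n") for line in lines]

    # Step 1: commented entries must use the canonical "# 0.0.0.0 " prefix.
    bad_format = [
        line for line in lines
        if LOOSE_COMMENTED.match(line) and not COMMENTED.match(line)
    ]

    # Step 2: collect every commented-out (removed) domain.
    removed = set()
    for line in lines:
        match = COMMENTED.match(line)
        if match:
            removed.add(match.group(1))

    # Step 3: any active domain that is also in the removed set is a hit.
    readded = []
    for line in lines:
        match = ACTIVE.match(line)
        if match and match.group(1) in removed:
            readded.append(match.group(1))

    return bad_format, readded
```

Whatever ACTION turns out to be, it would operate on the two returned lists.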

The outstanding questions:

  1. What should ACTION be? I personally think in this case it should only throw an error or warning. I do not think it should modify the lists in this case. I think there might be cases where it's a mistake or something and having a person verify seems like the best course of action.
  2. Should step 3 check the array for every list (i.e. a single global comment check) or should it be contained to the list we are checking (i.e. each list has a separate commented domain check)? For example, if google.com is commented out in ads.txt but actively blocked in tracker.txt, should that produce an error or warning? I think it should be a separate commented domain check for each list, meaning the answer to the previous question would be no, it wouldn't produce an error or warning.

blocklistproject commented 3 years ago

  • What should ACTION be? I personally think in this case it should only throw an error or warning. I do not think it should modify the lists in this case. I think there might be cases where it's a mistake or something and having a person verify seems like the best course of action.

For this I think if it could create an issue that we can look into that would be great. This way it is not automatically modifying the lists.

  • Should step 3 check the array for every list (i.e. a single global comment check) or should it be contained to the list we are checking (i.e. each list has a separate commented domain check)? For example, if google.com is commented out in ads.txt but actively blocked in tracker.txt, should that produce an error or warning? I think it should be a separate commented domain check for each list, meaning the answer to the previous question would be no, it wouldn't produce an error or warning.

Each array should be independent of the other lists: a domain being removed from one list does not mean it doesn't belong on another list.
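To make the per-list behaviour concrete, here is a small Python sketch with hypothetical names. Each list gets its own removed set, so a domain commented out in one list and active in another is not reported.

```python
from typing import Dict, Iterable, List, Set, Tuple

COMMENT_PREFIX = "# 0.0.0.0 "
ACTIVE_PREFIX = "0.0.0.0 "


def removed_and_active(lines: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Split one list into its commented-out and active domains."""
    removed, active = set(), set()
    for line in lines:
        line = line.strip()
        if line.startswith(COMMENT_PREFIX):
            removed.add(line[len(COMMENT_PREFIX):])
        elif line.startswith(ACTIVE_PREFIX):
            active.add(line[len(ACTIVE_PREFIX):])
    return removed, active


def check_per_list(lists: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each list name to domains both removed and active in that same list."""
    conflicts = {}
    for name, lines in lists.items():
        removed, active = removed_and_active(lines)
        # Only this list's own removed set is consulted, never another list's.
        conflicts[name] = sorted(removed & active)
    return conflicts


lists = {
    "ads.txt": ["# 0.0.0.0 google.com"],
    "tracker.txt": ["0.0.0.0 google.com"],
}
print(check_per_list(lists))  # no conflicts: the two lists are independent
```

A global check would instead union every list's removed set before comparing, which is the behaviour being ruled out here.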

fishcharlie commented 3 years ago

@blocklistproject Awesome. Creating an issue might be unnecessary. We currently have a lint system that fails the build if any check fails. I feel like we should just use that same system for this logic. That way we can catch things in PRs before they even get merged in, as opposed to creating issues for things after the fact.
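As a rough Python illustration of hooking this into a lint step (not the project's actual linter; the function names and message format are assumptions), the check can collect error messages and return a non-zero status, which is what CI systems conventionally treat as a failed build:

```python
import sys
from typing import Dict, List

COMMENT_PREFIX = "# 0.0.0.0 "
ACTIVE_PREFIX = "0.0.0.0 "


def lint_list(name: str, lines: List[str]) -> List[str]:
    """Return one error message per active entry that was previously removed."""
    removed = {
        line[len(COMMENT_PREFIX):].strip()
        for line in lines
        if line.startswith(COMMENT_PREFIX)
    }
    errors = []
    for number, line in enumerate(lines, start=1):
        if line.startswith(ACTIVE_PREFIX):
            domain = line[len(ACTIVE_PREFIX):].strip()
            if domain in removed:
                errors.append(
                    f"{name}:{number}: {domain} was removed from this list "
                    "but is active again"
                )
    return errors


def run_lint(lists: Dict[str, List[str]]) -> int:
    """Print all errors to stderr and return a process exit status."""
    errors = [e for name, lines in lists.items() for e in lint_list(name, lines)]
    for error in errors:
        print(error, file=sys.stderr)
    return 1 if errors else 0  # non-zero fails the build; nothing is modified
```

A CI script would end with something like `sys.exit(run_lint(lists))`, so a PR that re-adds a removed domain fails before it is merged and a person decides what to do.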

I can try to find time to work on this sometime this weekend.

blocklistproject commented 3 years ago

Sounds good. I can offer my opinion but when it comes to the automation and such I am going to defer to your expertise.