iam-py-test / my_filters_001

My filter lists - feel free to add these lists to uBlock Origin
https://iam-py-test.github.io/my_filters_001/
Creative Commons Zero v1.0 Universal

[bug] Pihole reports some entries are not domains #116

Closed: pallebone closed this issue 1 year ago

pallebone commented 1 year ago

What extension/adblocker/firewall do you have?

Pihole

What list(s) are you using?

The malicious website blocklist

What is the issue?

Pihole reports this when using the list:

[i] Target: https://raw.githubusercontent.com/iam-py-test/my_filters_001/main/Alternative%20list%20formats/antimalware_abp.txt
[✓] Status: Retrieval successful
[✓] Parsed 0 exact domains and 9784 ABP-style domains (ignored 509 non-domain entries)
Sample of non-domain entries:

509 non-domain entries exist. Is this a bug, or is it correct (a feature)?

Kind regards Peter

github-actions[bot] commented 1 year ago

Hi, @pallebone! Thank you for opening an issue!
@iam-py-test will get to this as soon as possible

iam-py-test commented 1 year ago

The ABP version is designed for AdBlock Plus, which supports $ modifiers (e.g. 3p) and IP addresses. PiHole does not seem to know what to do with them, hence the error. There is a pure domains version I originally intended for PiHole (and other similar software), but given that PiHole supports ABP/uBo syntax minus modifiers (the $ stuff), I could create an ABP version for it with the $ modifiers and IPs stripped out. All of this is complicated by the fact that I lack an install of PiHole. I am going to do a little research to confirm my hypothesis, and then work on creating a PiHole-specific ABP-style list (which should be easy once I know how to do it properly).
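As a rough illustration of the stripping step described above, here is a minimal Python sketch. It is not the script used for this list. The function names (`is_ip_address`, `to_domains_only`) and the sample rules are hypothetical, and it assumes that a rule carrying a `$` modifier is dropped entirely rather than rewritten.

```python
import ipaddress


def is_ip_address(host: str) -> bool:
    """Return True if `host` parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def to_domains_only(lines):
    """Keep comments and plain ||domain^ rules; drop everything else.

    Assumption: rules with a $ modifier are dropped, not rewritten.
    """
    kept = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("!"):
            kept.append(line)  # blank lines and ABP comments pass through
            continue
        if "$" in line:
            continue  # rule carries a modifier such as $3p
        if not (line.startswith("||") and line.endswith("^")):
            continue  # not a simple domain-anchored rule
        host = line[2:-1]
        if is_ip_address(host):
            continue  # an IP address, not a domain
        kept.append(line)
    return kept


# Hypothetical sample rules, for illustration only.
sample = [
    "! Title: example list",
    "||malware.example^",
    "||tracker.example^$3p",
    "||192.0.2.10^",
]
print(to_domains_only(sample))
```

Only the comment line and the plain `||malware.example^` rule survive; the modifier rule and the IP rule are filtered out.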

pallebone commented 1 year ago

Interesting, thank you for this information. I did not know ABP supports IPs. That is interesting information.

Kind regards Peter

iam-py-test commented 1 year ago

Well, it doesn't support IPs, it just treats them as domains. But PiHole clearly can't do that. Thank you for reporting this.

pallebone commented 1 year ago

Interesting, what is the use of treating an IP like it was a DNS entry? I find this a confusing feature. If no DNS lookup is made, what is the point?

iam-py-test commented 1 year ago

I'm sorry, but I'm confused. If you are talking about ABP, it filters at the browser level, not at the DNS level.

pallebone commented 1 year ago

I see what you mean. Makes sense.

iam-py-test commented 1 year ago

Can you delete the old list and try https://raw.githubusercontent.com/iam-py-test/my_filters_001/main/Alternative%20list%20formats/antimalware_abp_domainsonly.txt? Please tell me if there are any errors so I can investigate further. Thanks

pallebone commented 1 year ago

Thank you for this new list. It is appreciated. There are still some errors... can you check?

[i] Target: https://raw.githubusercontent.com/iam-py-test/my_filters_001/main/Alternative%20list%20formats/antimalware_abp_domainsonly.txt
[✓] Status: Retrieval successful
[✓] Parsed 0 exact domains and 8950 ABP-style domains (ignored 6 non-domain entries)
Sample of non-domain entries:

Kind regards Peter

iam-py-test commented 1 year ago

Thanks. That's an easy fix, luckily. Edit: Turns out fixing this wasn't as easy as it looked. Stand by, I'm investigating further

pallebone commented 1 year ago

Thank you for investing the time to investigate.

iam-py-test commented 1 year ago

Sadly, the library I use (idna) can't handle encoding these domains, so my code "fails softly" and just returns the encoded versions. I could:

pallebone commented 1 year ago

Are these real domains? I.e., is keentürkiye.com a real domain and Pihole is at fault (in which case I should log a bug with Pihole), or is the domain fake and thus invalid (Pihole is correct)?

Kind regards Peter

iam-py-test commented 1 year ago

I'm not sure about those specific domains (they don't resolve), but there are valid domains with those characters in them. As to whether PiHole is working as it should, I'm afraid I lack the knowledge to answer that. I think I have found a solution, but I'm not 100% comfortable with it.
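For context, domains with non-ASCII characters are normally converted to an ASCII ("xn--") form before they reach DNS. The sketch below shows one "fail soft" way to do that conversion in Python. It is an assumption-laden illustration, not the code behind this list: it uses Python's built-in `idna` codec (IDNA 2003) instead of the third-party `idna` package mentioned earlier (IDNA 2008), so the two may disagree on some names, and the helper name `encode_domain` is hypothetical.

```python
def encode_domain(domain: str):
    """Return the ASCII (Punycode) form of `domain`, or None if encoding fails.

    Uses the standard-library "idna" codec, which is a stand-in here
    for whatever encoder the real script relies on.
    """
    try:
        return domain.encode("idna").decode("ascii")
    except UnicodeError:
        # Fail softly: signal "could not encode" so the caller can skip the entry.
        return None


# The last entry has a 64-character label, which exceeds the 63-character limit.
for name in ["example.com", "keentürkiye.com", "a" * 64 + ".com"]:
    print(name, "->", encode_domain(name))
```

Returning `None` instead of the original string makes it harder for an unencoded name to slip into the output list by accident.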

iam-py-test commented 1 year ago

Can you update the list and check if there are any errors? Thanks

pallebone commented 1 year ago

There was an improvement, so I think you fixed something, but some other domains got flagged up, for a different reason I think:

[i] Target: https://raw.githubusercontent.com/iam-py-test/my_filters_001/main/Alternative%20list%20formats/antimalware_abp_domainsonly.txt
[✓] Status: Retrieval successful
[✓] Parsed 0 exact domains and 8957 ABP-style domains (ignored 2 non-domain entries)
Sample of non-domain entries:

pallebone commented 1 year ago

I think they are supposed to have ||

iam-py-test commented 1 year ago

Those lines are complete filters; they just have spaces in them. It seems PiHole is ignoring everything before the space, and thus making the entry invalid. I will have to get PiHole set up and test. Thanks

iam-py-test commented 1 year ago

Well, this seems to be an issue with the function I'm using to encode these; it adds spaces into the domains, and spaces aren't valid in domains. 🤦‍♀️ Well, back to square one. It's too late today for me to look further into this, so I'm just going to have the script ignore domains it can't encode until I find a better solution.
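One way to catch this kind of problem before a list is published is to validate every encoded host and skip the ones that fail. The following Python sketch assumes the conventional letters-digits-hyphen rule for hostname labels; the names `is_valid_hostname` and `build_rules` and the sample hosts are hypothetical, and this is not the script that generates the list.

```python
import re

# One hostname label: letters, digits, hyphens; no leading or trailing hyphen; 1-63 chars.
LABEL_RE = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")


def is_valid_hostname(host: str) -> bool:
    """Check `host` against the conventional letters-digits-hyphen rule."""
    if not host or len(host) > 253:
        return False
    return all(LABEL_RE.fullmatch(label) for label in host.lower().split("."))


def build_rules(hosts):
    """Turn hosts into ||host^ rules, skipping any that fail validation."""
    rules, skipped = [], []
    for host in hosts:
        if is_valid_hostname(host):
            rules.append(f"||{host}^")
        else:
            skipped.append(host)  # e.g. a host containing a space
    return rules, skipped


# Hypothetical sample hosts, for illustration only.
rules, skipped = build_rules(
    ["malware.example", "bad host.example", "xn--bcher-kva.example"]
)
print(rules)
print(skipped)
```

Keeping the skipped hosts in a separate list makes it easy to log them and investigate later, rather than dropping them silently.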

pallebone commented 1 year ago

That's weird; there were only 2 causing an issue. It looked like you almost fixed it.

pallebone commented 1 year ago

Sometimes it's good just to have a rest ❤️ Have a good sleep :)

pallebone commented 1 year ago

I'm closing this issue; apologies for wasting your time. You can see my comment on the other ticket.

Kind regards Peter

pallebone commented 1 year ago

Sorry just want to ask, will I still be able to use this list: https://raw.githubusercontent.com/iam-py-test/my_filters_001/main/Alternative%20list%20formats/antimalware_abp_domainsonly.txt

Or do you not intend to keep it?

Kind regards Peter

iam-py-test commented 1 year ago

I intend to keep it. It still has value to PiHole users (other than those few domains I couldn't get PiHole to be happy with, the rest of the list is still there), and there's pretty much 0 cost in terms of maintenance as it's just a script which automatically runs.

pallebone commented 1 year ago

Ok that is great, the list is currently working without any errors:

[i] Target: https://raw.githubusercontent.com/iam-py-test/my_filters_001/main/Alternative%20list%20formats/antimalware_abp_domainsonly.txt
[✓] Status: Retrieval successful
[✓] Parsed 0 exact domains and 8962 ABP-style domains (ignored 0 non-domain entries)
[i] List stayed unchanged

Thank you.

P