luxysiv / Cloudflare-Gateway-Pihole

Make ad blocking dns using Cloudflare Gateway Zero Trust

URL scraping error #43

Closed LexterS999 closed 3 months ago

LexterS999 commented 4 months ago

2024-06-20 05:47:53.141 | INFO | src:info:89 - Number of whitelisted domains: 5
2024-06-20 05:47:58.193 | INFO | src:info:89 - Number of blocked domains: 253825
2024-06-20 05:47:58.289 | INFO | src:info:89 - Number of final domains: 253822
2024-06-20 05:47:58.650 | INFO | src:info:89 - Total lists on Cloudflare: 118
2024-06-20 05:47:58.650 | INFO | src:info:89 - Total domains on Cloudflare: 117274
2024-06-20 05:47:58.822 | INFO | src:info:89 - Total chunked lists generated: 254
2024-06-20 05:47:59.018 | INFO | src:info:89 - Updating list [AdBlock-DNS-Filters] - 001
2024-06-20 05:47:59.757 | INFO | src:info:89 - Updating list [AdBlock-DNS-Filters] - 002
2024-06-20 05:48:00.767 | INFO | src:info:89 - Updating list [AdBlock-DNS-Filters] - 003
2024-06-20 05:48:01.835 | INFO | src:info:89 - Updating list [AdBlock-DNS-Filters] - 004
2024-06-20 05:48:02.890 | INFO | src:info:89 - Updating list [AdBlock-DNS-Filters] - 005
2024-06-20 05:48:03.408 | INFO | src:info:89 - Retrying (1): 400 Client Error: Bad Request for url: https://api.cloudflare.com/client/v4/accounts/***/gateway/lists/90cfd341-8f9a-40b3-aa8b-efba858bc178
2024-06-20 05:48:03.408 | INFO | src:info:89 - Sleeping before next retry (1)
2024-06-20 05:48:04.884 | INFO | src:info:89 - Retrying (2): 400 Client Error: Bad Request for url: https://api.cloudflare.com/client/v4/accounts/***/gateway/lists/90cfd341-8f9a-40b3-aa8b-efba858bc178
2024-06-20 05:48:04.884 | INFO | src:info:89 - Sleeping before next retry (2)
2024-06-20 05:48:06.595 | INFO | src:info:89 - Retrying (3): 400 Client Error: Bad Request for url: https://api.cloudflare.com/client/v4/accounts/***/gateway/lists/90cfd341-8f9a-40b3-aa8b-efba858bc178
2024-06-20 05:48:06.596 | INFO | src:info:89 - Sleeping before next retry (3)
2024-06-20 05:48:08.272 | INFO | src:info:89 - Retrying (4): 400 Client Error: Bad Request for url: https://api.cloudflare.com/client/v4/accounts/***/gateway/lists/90cfd341-8f9a-40b3-aa8b-efba858bc178
2024-06-20 05:48:08.272 | INFO | src:info:89 - Sleeping before next retry (4)
2024-06-20 05:48:11.663 | INFO | src:info:89 - Retrying (5): 400 Client Error: Bad Request for url: https://api.cloudflare.com/client/v4/accounts/***/gateway/lists/90cfd341-8f9a-40b3-aa8b-efba858bc178
2024-06-20 05:48:11.663 | INFO | src:info:89 - Sleeping before next retry (5)
2024-06-20 05:48:22.166 | INFO | src:info:89 - Retrying (6): 400 Client Error: Bad Request for url: https://api.cloudflare.com/client/v4/accounts/***/gateway/lists/90cfd341-8f9a-40b3-aa8b-efba858bc178
2024-06-20 05:48:22.166 | INFO | src:info:89 - Sleeping before next retry (6)
2024-06-20 05:48:32.694 | INFO | src:info:89 - Retrying (7): 400 Client Error: Bad Request for url: https://api.cloudflare.com/client/v4/accounts/***/gateway/lists/90cfd341-8f9a-40b3-aa8b-efba858bc178
2024-06-20 05:48:32.694 | INFO | src:info:89 - Sleeping before next retry (7)
2024-06-20 05:48:43.140 | INFO | src:info:89 - Retrying (8): 400 Client Error: Bad Request for url: https://api.cloudflare.com/client/v4/accounts/***/gateway/lists/90cfd341-8f9a-40b3-aa8b-efba858bc178
2024-06-20 05:48:43.141 | INFO | src:info:89 - Sleeping before next retry (8)
2024-06-20 05:48:44.977 | INFO | src:info:89 - Retrying (9): 400 Client Error: Bad Request for url: https://api.cloudflare.com/client/v4/accounts/***/gateway/lists/90cfd341-8f9a-40b3-aa8b-efba858bc178
2024-06-20 05:48:44.978 | INFO | src:info:89 - Sleeping before next retry (9)
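For readers following the numbers in this log: 253822 final domains and 254 chunked lists are consistent with chunks of 1000 entries each. A minimal sketch of that chunking step follows. The chunk size is inferred from the log, and the names `chunk_domains` and `list_name` are hypothetical, not taken from the project's source.

```python
from typing import Iterator, List

# Assumption: inferred from the log (253822 domains -> 254 lists), not from the project's code.
CHUNK_SIZE = 1000


def chunk_domains(domains: List[str], size: int = CHUNK_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` domains."""
    for start in range(0, len(domains), size):
        yield domains[start:start + size]


def list_name(prefix: str, index: int) -> str:
    """Build a name such as '[AdBlock-DNS-Filters] - 001' (format mirrored from the log)."""
    return f"[{prefix}] - {index:03d}"


# Placeholder domains, only to reproduce the counts reported in the log.
domains = [f"host{i}.example.com" for i in range(253822)]
chunks = list(chunk_domains(domains))
names = [list_name("AdBlock-DNS-Filters", i) for i in range(1, len(chunks) + 1)]
```

With this split, a single rejected entry makes the whole list update fail, which would match the repeated 400 responses for one list ID.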

luxysiv commented 4 months ago

Script works fine. See https://github.com/luxysiv/Cloudflare-Gateway-Pihole/actions/runs/9592784250/job/26451963354

LexterS999 commented 4 months ago

No, it's not. You can check those lists yourself.

[Ad-Urls]
1 = https://git.herrbischoff.com/trackers/plain/trackers.txt
2 = https://git.herrbischoff.com/trackers/plain/telemetry.txt
3 = https://raw.githubusercontent.com/Retold3202/BadBlock/main/wildcards-no-*/badblock_plus.txt
4 = https://raw.githubusercontent.com/TG-Twilight/AWAvenue-Ads-Rule/main/AWAvenue-Ads-Rule.txt
5 = https://ente.dev/api/blocklist/advertising-hosts
6 = https://ente.dev/api/blocklist/tracking-hosts
7 = https://divested.dev/blocklists/Mobile.txt
8 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_malware
9 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_malware_typo
10 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_0-9-
11 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing__subdomains
12 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_a-d-
13 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_ad-
14 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_appnexus
15 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_click-
16 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_doobleclick
17 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_e-g-
18 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_h-k-
19 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_l-n-
20 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_mailru
21 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_marketgid
22 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_o-q-
23 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_r-t-
24 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_u-w-
25 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_x-z-
26 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_yahoo
27 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_marketing_yandex
28 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_telemetry
29 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_telemetry_microsoft
30 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_trash_clouds_cloudfront
31 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_trash_cn
32 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_trash_dyndns
33 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_trash_random_0-9
34 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_trash_random_a-b
35 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_trash_random_c-d
36 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_trash_random_e-g
37 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_trash_random_h-k
38 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_trash_random_l-n
39 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_trash_random_o-q
40 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_trash_random_r-s
41 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_trash_random_t-w
42 = https://raw.githubusercontent.com/mtxadmin/ublock/master/hosts/_trash_random_x-z
43 = https://raw.githubusercontent.com/arcestia/blocklists/main/data/easylist-rus/list.txt
48 = https://raw.githubusercontent.com/masterinspire/filter-lists/main/filter-lists.txt
53 = https://raw.githubusercontent.com/ph00lt0/blocklists/master/blocklist.txt
54 = https://adguardteam.github.io/HostlistsRegistry/assets/filter_1.txt
55 = https://adguardteam.github.io/HostlistsRegistry/assets/filter_24.txt
56 = https://adguardteam.github.io/HostlistsRegistry/assets/filter_2.txt
59 = https://adguardteam.github.io/HostlistsRegistry/assets/filter_3.txt
60 = https://adguardteam.github.io/HostlistsRegistry/assets/filter_6.txt
61 = https://adguardteam.github.io/HostlistsRegistry/assets/filter_7.txt
64 = https://adguardteam.github.io/HostlistsRegistry/assets/filter_30.txt
65 = https://adguardteam.github.io/HostlistsRegistry/assets/filter_12.txt
66 = https://adguardteam.github.io/HostlistsRegistry/assets/filter_11.txt
67 = https://easylist-downloads.adblockplus.org/easylist.txt
68 = https://easylist-downloads.adblockplus.org/easyprivacy.txt
69 = https://filters.adtidy.org/extension/ublock/filters/224.txt
70 = https://raw.githubusercontent.com/AdguardTeam/cname-trackers/master/data/combined_disguised_ads.txt
71 = https://raw.githubusercontent.com/AdguardTeam/cname-trackers/master/data/combined_disguised_clickthroughs.txt
72 = https://raw.githubusercontent.com/AdguardTeam/cname-trackers/master/data/combined_disguised_microsites.txt
73 = https://raw.githubusercontent.com/AdguardTeam/cname-trackers/master/data/combined_disguised_trackers.txt
74 = https://raw.githubusercontent.com/AdguardTeam/cname-trackers/master/data/combined_disguised_mail_trackers.txt

luxysiv commented 4 months ago

The problem is that one of the above lists contains one or more domains that Cloudflare does not accept.

LexterS999 commented 4 months ago

Hm, that's weird, because most of them are in domain format and some are in adblock format.

luxysiv commented 4 months ago

> Hm, that's weird, because most of them are in domain format and some are in adblock format.

The problem is on Cloudflare's side. I don't know the cause.
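Since the lists mix plain-domain, hosts, and adblock syntax, here is a rough sketch of how one line from such a list could be reduced to a bare domain. It is an illustration only: the function name `extract_domain` and the exact rules are assumptions, not the script's actual converter, and real adblock syntax has many more cases than this handles.

```python
import re
from typing import Optional

# Assumption: these are the common sink addresses used in hosts-style lists.
HOSTS_PREFIXES = ("0.0.0.0", "127.0.0.1", "::1")
# Matches simple adblock domain rules such as "||ads.example.com^" (no options, no paths).
ADBLOCK_RULE = re.compile(r"^\|\|([^\^/$*|]+)\^$")


def extract_domain(line: str) -> Optional[str]:
    """Return the domain named by one list line, or None if the line is not a simple domain rule."""
    line = line.strip()
    if not line or line.startswith(("#", "!")):
        return None  # blank line or comment
    # Drop a trailing hosts-file comment, but leave cosmetic rules like "a.com##.ad" intact.
    line = re.split(r"\s+#", line, maxsplit=1)[0].strip()
    match = ADBLOCK_RULE.match(line)
    if match:
        return match.group(1).lower()
    parts = line.split()
    if len(parts) == 2 and parts[0] in HOSTS_PREFIXES:
        return parts[1].lower()
    if len(parts) == 1 and "." in parts[0] and not any(c in parts[0] for c in "|^/$*@#"):
        return parts[0].lower()
    return None  # exception rules, cosmetic rules, and anything more complex are skipped
```

Note that a sketch like this only normalizes syntax. It does not check that the result is a name Cloudflare will accept, so a malformed entry in plain-domain format would still pass through.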

luxysiv commented 4 months ago

Issue

2024-06-27 04:40:12.926 | ERROR    | src:error:65 - Request failed with 400 Bad Request: {
  "result": null,
  "success": false,
  "errors": [
    {
      "message": "item: '84.54.0.51.78' error: 84.54.0.51.78 is an invalid domain name"
    }
  ],
  "messages": []
}

Please check the lists so I can improve the conversion to domains.
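One possible way to keep entries like `84.54.0.51.78` out of the upload is to validate each candidate before chunking. The sketch below is an assumption about what a reasonable filter could look like. Cloudflare's exact acceptance rules are not stated in this thread, and `is_valid_domain` is a hypothetical helper, not part of the script.

```python
import re

# One DNS label: 1-63 characters of letters, digits, or hyphens, not starting or ending with a hyphen.
LABEL = re.compile(r"^(?!-)[a-z0-9-]{1,63}(?<!-)$")


def is_valid_domain(name: str) -> bool:
    """Conservative hostname check; rejects IP-like strings such as '84.54.0.51.78'."""
    name = name.strip().lower().rstrip(".")
    if not name or len(name) > 253:
        return False
    labels = name.split(".")
    if len(labels) < 2:
        return False  # a bare word such as 'localhost' is not a blockable domain
    if not all(LABEL.match(label) for label in labels):
        return False
    # Assumption: an all-numeric last label means an IP address or a malformed entry.
    return not labels[-1].isdigit()


candidates = ["ads.example.com", "84.54.0.51.78", "1.2.3.4", "-bad.example.com"]
accepted = [d for d in candidates if is_valid_domain(d)]
```

This check is deliberately strict: it also rejects names with underscores, so it may drop a few entries that a more permissive validator would keep.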

luxysiv commented 3 months ago

Sorry, that issue is caused by a faulty pattern. I will fix it soon @LexterS999

LexterS999 commented 3 months ago

@luxysiv Take a look at my fork https://github.com/LexterS999/Cloudflare-Gateway-Pihole. There are a lot of changes.

luxysiv commented 3 months ago

You can add a README in your native language under docs, and I will merge it.

LexterS999 commented 3 months ago

@luxysiv Just look at the code changes.

luxysiv commented 3 months ago

> @luxysiv Just look at the code changes.

Where?

LexterS999 commented 3 months ago

Inside the workflow file and inside the src folder.


luxysiv commented 3 months ago

OK, I will look at it.

luxysiv commented 3 months ago

First: you can use Cloudflare Workers or Pages to control the GitHub Action.
Second: the script does not need requests or colorlog.
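To illustrate the second point, a download helper can be written with only the Python standard library, using `urllib.request` in place of requests and `logging` in place of colorlog. This is a minimal sketch, not the script's code: `fetch_text`, the logger name, and the User-Agent string are hypothetical.

```python
import logging
import urllib.request

# Plain stdlib logging in a pipe-separated layout similar to the log in this issue (no colors).
logging.basicConfig(level=logging.INFO, format="%(asctime)s | %(levelname)s | %(message)s")
log = logging.getLogger("blocklist")


def fetch_text(url: str, timeout: float = 30.0) -> str:
    """Download a blocklist and return its body as text, using only the standard library."""
    # Hypothetical User-Agent value; some hosts reject requests that send none.
    request = urllib.request.Request(url, headers={"User-Agent": "blocklist-fetcher"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
    log.info("Fetched %d characters from %s", len(body), url)
    return body
```

Calling `fetch_text` with one of the list URLs above would return the raw list text. HTTP errors surface as `urllib.error.HTTPError`, so any retry logic would need to catch that instead of the requests exceptions.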