Closed ppfeufer closed 1 year ago
Verbose log:
2023/07/12 20:19:55.609026 4179874#71 [debug] started POST 138.201.77.133:8100 /control/filtering/set_url
2023/07/12 20:19:55.609319 4179874#71 [debug] filtering: set name to "[GitHub] ppfeufer/adguard-filter-list", url to https://raw.githubusercontent.com/ppfeufer/adguard-filter-list/master/blocklist, enabled to true for filter https://github.com/ppfeufer/adguard-filter-list/blob/master/blocklist?raw=true
2023/07/12 20:19:55.609540 4179874#71 [debug] filtering: downloading update for filter 1642338271 from "https://raw.githubusercontent.com/ppfeufer/adguard-filter-list/master/blocklist"
2023/07/12 20:19:55.609749 4179874#64 [debug] home: customdial: dialing addr "raw.githubusercontent.com:443" for network tcp
2023/07/12 20:19:55.609877 4179874#82 [debug] dnsproxy: cache: serving cached response
2023/07/12 20:19:55.609980 4179874#81 [debug] dnsproxy: cache: serving cached response
2023/07/12 20:19:55.610162 4179874#64 [debug] dnsServer.Resolve: "raw.githubusercontent.com": [{185.199.108.133 } {185.199.109.133 } {185.199.110.133 } {185.199.111.133 } {2606:50c0:8000::154 } {2606:50c0:8001::154 } {2606:50c0:8002::154 } {2606:50c0:8003::154 }]
2023/07/12 20:19:55.640904 4179874#71 [debug] filtering: filter 1642338271 from url "https://raw.githubusercontent.com/ppfeufer/adguard-filter-list/master/blocklist" has no changes, skipping
2023/07/12 20:19:55.641207 4179874#71 [error] POST 138.201.77.133:8100 /control/filtering/set_url: scanning filter contents: bufio.Scanner: token too long
2023/07/12 20:19:55.641338 4179874#71 [debug] finished POST 138.201.77.133:8100 /control/filtering/set_url in 32.290566ms
And when trying to add it as a new blocklist:
2023/07/12 20:22:53.763555 4179874#132 [debug] started POST 138.201.77.133:8100 /control/filtering/add_url
2023/07/12 20:22:53.763801 4179874#132 [debug] filtering: downloading update for filter 1689185952 from "https://raw.githubusercontent.com/ppfeufer/adguard-filter-list/master/blocklist"
2023/07/12 20:22:53.764040 4179874#134 [debug] home: customdial: dialing addr "raw.githubusercontent.com:443" for network tcp
2023/07/12 20:22:53.764157 4179874#135 [debug] dnsproxy: cache: serving cached response
2023/07/12 20:22:53.764252 4179874#136 [debug] dnsproxy: cache: serving cached response
2023/07/12 20:22:53.764315 4179874#134 [debug] dnsServer.Resolve: "raw.githubusercontent.com": [{185.199.108.133 } {185.199.109.133 } {185.199.110.133 } {185.199.111.133 } {2606:50c0:8000::154 } {2606:50c0:8001::154 } {2606:50c0:8002::154 } {2606:50c0:8003::154 }]
2023/07/12 20:22:53.796294 4179874#132 [debug] filtering: filter 1689185952 from url "https://raw.githubusercontent.com/ppfeufer/adguard-filter-list/master/blocklist" has no changes, skipping
2023/07/12 20:22:53.796391 4179874#132 [error] filtering: os.Chtimes(): chtimes /opt/AdGuardHome/data/filters/1689185952.txt: no such file or directory
2023/07/12 20:22:53.796585 4179874#132 [error] POST 138.201.77.133:8100 /control/filtering/add_url: Couldn't fetch filter from URL "https://raw.githubusercontent.com/ppfeufer/adguard-filter-list/master/blocklist": scanning filter contents: bufio.Scanner: token too long
2023/07/12 20:22:53.796629 4179874#132 [debug] finished POST 138.201.77.133:8100 /control/filtering/add_url in 33.089971ms
Thanks for the report. We've introduced an optimization that limits the RAM consumed by the update check by limiting the length of a single rule to 1024 bytes, and it seems like your list has 66 rules longer than that:
grep -e '^.\{1024,\}' -- ./blocklist | wc
Moreover, none of these rules seem to be DNS rules; most of them are content-blocking rules. You can filter them out with a script like:
sed '/^.\{1024,\}/d' ./blocklist > ./blocklist_dns
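For anyone who would rather do the same filtering outside the shell, here is a minimal Python sketch of the two commands above. It measures each rule in UTF-8 bytes, because the limit is stated in bytes, whereas `.\{1024,\}` counts characters in most locales. The function name `split_by_length` is made up for illustration, and treating exactly 1024 bytes as too long simply mirrors the regex; the exact boundary used by AdGuard Home is an assumption here.

```python
LIMIT = 1024  # bytes; the figure quoted above, the exact boundary is an assumption


def split_by_length(lines, limit=LIMIT):
    """Return (kept, dropped), splitting rules by their UTF-8 encoded length."""
    kept, dropped = [], []
    for line in lines:
        rule = line.rstrip("\r\n")
        # Measure bytes rather than characters, since the limit is given in bytes.
        if len(rule.encode("utf-8")) >= limit:
            dropped.append(rule)
        else:
            kept.append(rule)
    return kept, dropped


if __name__ == "__main__":
    # Tiny in-memory sample: one DNS rule and one oversized cosmetic rule.
    sample = ["||example.org^", "example.com##" + "a" * 2000]
    kept, dropped = split_by_length(sample)
    print(len(kept), "kept,", len(dropped), "dropped")
```

To use it on a real list, read `./blocklist` line by line, pass the lines to `split_by_length`, and write the `kept` rules to `./blocklist_dns`.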
Ah, I see. I'll try that.
Success!
After tweaking the transformation option of my hostlist-compiler settings it's all working again. Thanks for the quick answer and the hint!
Since updating to v0.107.34, I have been getting this error. The list I subscribe to is maintained by someone else, so what should I do?
Error: control/filtering/add_url | Couldn't fetch filter from URL "https://raw.gitmirror.com/monsm/XXKiller/main/x.txt": line at index 44290: character at index 91: non-printable character | 400
@ainar-g, what should I do?
Ask the maintainer of that list to use HostlistCompiler and apply the Validate transformation. That's the easiest way to generate compatible lists, and it is what fixed my issue.
Example: https://github.com/ppfeufer/adguard-filter-list/blob/master/hostlist-compiler-config.json
Since quite a number of filter lists are used with both AdGuard Home and browser ad-blocker extensions (µBlock, AdGuard, etc.), I expect this issue to pop up for a number of these lists.
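To make the Validate suggestion a bit more concrete, here is a rough Python sketch that writes out a small HostlistCompiler configuration. The field names (`name`, `sources`, `source`, `type`, `transformations`) and the transformation names are as I remember them from the HostlistCompiler README, so treat them as assumptions and check that repository for the authoritative format. The list name and source URL are placeholders, not real lists.

```python
import json

# Hypothetical configuration; list name and source URL are placeholders.
config = {
    "name": "Example DNS blocklist",
    "sources": [
        {
            "source": "https://example.org/blocklist.txt",
            "type": "adblock",
            # Validate is the transformation recommended in this thread.
            "transformations": ["RemoveComments", "Validate"],
        }
    ],
    # Applied to the merged result of all sources.
    "transformations": ["Deduplicate"],
}


def render(cfg):
    """Serialize the configuration as pretty-printed JSON."""
    return json.dumps(cfg, indent=2)


if __name__ == "__main__":
    print(render(config))
```

The printed JSON can be saved as a configuration file and passed to the compiler; the example config linked above shows a real, working setup.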
Thank you. It seems that only the list maintainer can make the changes.
@ppfeufer Could you help me see how to implement it with HostlistCompiler? https://github.com/monsm/XXKiller/blob/mae/RMaker/make.cmd
All can be found here » https://github.com/ppfeufer/adguard-filter-list
@ppfeufer Please check my revision to see if there are any mistakes, thanks: https://raw.githubusercontent.com/monsm/XXKiller/mae/.github/workflows/xxkiller.yml https://raw.githubusercontent.com/monsm/XXKiller/mae/RMaker/make.cmd
This is beyond the scope and topic of this issue.
How to use the HostListCompiler is well explained in their repository (https://github.com/AdguardTeam/HostlistCompiler). Please have a look there.
Upon reinspecting the code, I think we can actually allow larger lines without losing the optimization for the most common case. We can also improve the error message. I'm going to reopen the issue now and commit a fix soon.
The line-length limit has been relaxed, and the error message now includes the character in question:
line 66499: character 92: non-printable character '\u200c'
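List maintainers who want to find such characters before publishing could use something like the following Python sketch. It uses `str.isprintable()` as a stand-in for the check, which is an assumption: the exact definition of "non-printable" used by AdGuard Home is not stated in this thread. Tabs are allowed explicitly, positions are reported 1-based (also an assumption), and the function name `find_non_printable` is made up for illustration.

```python
def find_non_printable(lines, allowed="\t"):
    """Yield (line_no, char_no, char) for every character str.isprintable() rejects."""
    for line_no, line in enumerate(lines, start=1):
        # Ignore the trailing newline so it is not reported as non-printable.
        for char_no, ch in enumerate(line.rstrip("\r\n"), start=1):
            if ch not in allowed and not ch.isprintable():
                yield line_no, char_no, ch


if __name__ == "__main__":
    # The second rule hides a zero-width non-joiner (U+200C) inside the domain.
    sample = ["||example.org^", "||exam\u200cple.com^"]
    for line_no, char_no, ch in find_non_printable(sample):
        print(f"line {line_no}: character {char_no}: non-printable character {ch!a}")
```

Running it over a downloaded copy of a list gives the same kind of line and character positions as the error message, which makes the offending rules easy to locate.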
Could AdGuard Home fix the error automatically, for example by deleting the offending line?
@monsm, from what I understand, the error is there to prevent users from putting e.g. binary files instead of text ones. There is a similar check against HTML text too. What kind of error are you getting? Perhaps the check could be relaxed.
The errors come from ZWNJ and ZWSP characters in the rules, but I don't know how to remove the unsupported lines from the rules.
@ainar-g Does this commit relax the check for ZWNJ, ZWSP, and other special characters? And what has the 1024-byte length limit been changed to? https://github.com/AdguardTeam/AdGuardHome/commit/2adc8624c0bd589a9efab564297bab77dde17ac8
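For lists that cannot be fixed upstream, a local cleanup step is one option. The Python sketch below shows two approaches, assuming the only offenders are the two characters named here, ZWSP (U+200B) and ZWNJ (U+200C): either strip the characters and keep the rule, or drop every rule that contains them. Both function names are made up for illustration.

```python
ZERO_WIDTH = {"\u200b", "\u200c"}  # ZWSP and ZWNJ, the two characters named above


def strip_zero_width(lines):
    """Remove ZWSP/ZWNJ from every rule while keeping the rule itself."""
    table = {ord(ch): None for ch in ZERO_WIDTH}
    return [line.translate(table) for line in lines]


def drop_affected(lines):
    """Alternative: discard any rule that contains ZWSP/ZWNJ."""
    return [line for line in lines if not any(ch in line for ch in ZERO_WIDTH)]


if __name__ == "__main__":
    sample = ["||exam\u200cple.com^", "||example.org^"]
    print(strip_zero_width(sample))
    print(drop_affected(sample))
```

Stripping keeps more rules but may change what a rule matches, so dropping the affected lines is the more conservative choice.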
@monsm, yes, and we have added test cases for that to make sure that they keep working. The hard line-length limit has been returned to 64 KiB.
Hello. Will there be a fix for this issue? I am receiving "Error: control/filtering/set_url | scanning filter contents: bufio.Scanner: token too long | 400" when trying to access the following filter: https://raw.githubusercontent.com/AdguardTeam/AdguardFilters/master/MobileFilter/sections/specific_app.txt
@fbaijnauth, please read above. The fix is already on the Edge channel. The README has instructions on testing the Edge and Beta versions. (Do not forget to backup your configuration.)
thank you
@ainar-g May I ask, when will the stable version of v0.107.35 be released?
@Jefffish09, about 15 minutes ago, heh.
Prerequisites
[X] I have checked the Wiki and Discussions and found no answer
[X] I have searched other issues and found no duplicates
[X] I want to report a bug and not ask a question or ask for help
[X] I have set up AdGuard Home correctly and configured clients to use it. (Use the Discussions for help with installing and configuring clients.)
Platform (OS and CPU architecture)
Linux/ARM64
Installation
GitHub releases or script from README
Setup
On one machine
AdGuard Home version
v0.107.34
Action
Trying to update my blocklist via the UI.
Expected result
Blocklist updating successfully.
Actual result
Additional information and/or screenshots
This is a blocklist I have been using for a long time, and after today's update, I noticed that it is listed with 0 entries.
So I tried to update it manually by editing and saving, which resulted in this error message.
Blocklist URL: https://raw.githubusercontent.com/ppfeufer/adguard-filter-list/master/blocklist