Closed thomasmerz closed 2 years ago
Checks won't fail any more after PR #37 has beeen finished/doneβ¦
but it doesn't say command not found anymore - problem/issue solved wink
I'm not convinced if this is really a working solution. If it takes hours this is nothing I'd like to ship to users
Does it really work?
I did not test it, but expect it to fail also here: https://github.com/yubiuser/pihole_adlist_tool/blob/890f454c781c32d984c73f0bd8494aa2d9be2770/pihole_adlist_tool#L1006
Does it really work?
I did not test it, but expect it to fail also here:
I reviewed and a fix will comeβ¦
I'm not convinced if this is really a working solution. If it takes hours this is nothing I'd like to ship to users
You are right, do not merge this until we have an optimized version! I have some ideas but have to test and checkβ¦
β οΈ Review and diff should be done when PR #37 has been merged because this PR depends on it! β οΈ
π‘ This is working with limit of 2000 (distinct) domains, but limited due to "Argument list too long". This array should be splitted by for example 500 domains and regex-tested in packagesβ¦
mapfile -t all_domains < <(sqlite $TEMP_DB "SELECT domain FROM all_domains")
- for CURRENT_DOMAIN in "${all_domains[@]}"; do
if [ "$PIHOLE_DOCKER" = 0 ];
then
- pihole-FTL regex-test $CURRENT_DOMAIN |grep -E -o "DB ID [0-9]*"|awk '{print $3}' | while read REGEX_ID; do
- sqlite $TEMP_DB "INSERT INTO domain_by_regex(domain, regex_id) VALUES ('$CURRENT_DOMAIN',$REGEX_ID);"
+ for CURRENT_DOMAIN in "${all_domains[@]}"; do
+ pihole-FTL regex-test "$CURRENT_DOMAIN" |grep -E -o "DB ID [0-9]*"|awk '{print $3}' | while read -r REGEX_ID; do
+ sqlite $TEMP_DB "INSERT INTO domain_by_regex(domain, regex_id) VALUES ('$CURRENT_DOMAIN',$REGEX_ID);"
+ done
done
else
- docker exec $CONTAINER_ID pihole-FTL regex-test $CURRENT_DOMAIN |grep -E -o "DB ID [0-9]*"|awk '{print $3}' | while read REGEX_ID; do
+ ### TODO: remove limit!!!! ####### TODO ####### TODO #######
+ docker exec "$CONTAINER_ID" bash -c \
+ "$(for CURRENT_DOMAINS in $(sqlite $TEMP_DB 'SELECT domain FROM all_domains limit 2000'); do
+ echo pihole-FTL regex-test "$CURRENT_DOMAINS" ; done)" \
+ |grep -E -o "DB ID [0-9]*"|awk '{print $3}' | while read -r REGEX_ID; do
sqlite $TEMP_DB "INSERT INTO domain_by_regex(domain, regex_id) VALUES ('$CURRENT_DOMAIN',$REGEX_ID);"
done
+ #docker exec "$CONTAINER_ID" pihole-FTL regex-test "$CURRENT_DOMAIN" |grep -E -o "DB ID [0-9]*"|awk '{print $3}' | while read -r REGEX_ID; do
+ # sqlite $TEMP_DB "INSERT INTO domain_by_regex(domain, regex_id) VALUES ('$CURRENT_DOMAIN',$REGEX_ID);"
+ #done
+ ### TODO: this really needs some performance optimization to avoid 1000s of docker-execs
+ :
fi
- done
# count for each RegEx_id how many domains are in domain_by_regex and store it in table regex_black
@@ -1014,9 +1023,17 @@ EOF
mapfile -t all_exact_domains < <(sqlite $GRAVITY "SELECT domain FROM domainlist WHERE type in (0,1);")
for CURRENT_DOMAIN in "${all_exact_domains[@]}"; do
- pihole-FTL regex-test "$CURRENT_DOMAIN" |grep -E -o "DB ID [0-9]*"|awk '{print $3}' | while read -r REGEX_ID; do
- sqlite $TEMP_DB "INSERT INTO domainlist_regex(domain, regex_id) VALUES ('$CURRENT_DOMAIN',$REGEX_ID);"
- done
+ if [ "$PIHOLE_DOCKER" = 0 ];
+ then
+ pihole-FTL regex-test "$CURRENT_DOMAIN" |grep -E -o "DB ID [0-9]*"|awk '{print $3}' | while read -r REGEX_ID; do
+ sqlite $TEMP_DB "INSERT INTO domainlist_regex(domain, regex_id) VALUES ('$CURRENT_DOMAIN',$REGEX_ID);"
+ done
+ else
+ docker exec "$CONTAINER_ID" pihole-FTL regex-test "$CURRENT_DOMAIN" |grep -E -o "DB ID [0-9]*"|awk '{print $3}' | while read -r REGEX_ID; do
+ sqlite $TEMP_DB "INSERT INTO domainlist_regex(domain, regex_id) VALUES ('$CURRENT_DOMAIN',$REGEX_ID);"
+ done
+ ### TODO: may need some performance optimization to avoid so much docker-execs
+ fi
[i] Since 01.02.2022 0:58:13 522 different domains from your adlists have been blocked 15680 times in total
(blocked directly by gravity or during deep CNAME inspection)
[i] Using you current adlist configuration 546 domains would have been blocked 16606 times
[i] Adlist coverage (sum of "hits covered" ca. 47.000)
[i] In total your adlists contain 189 visited (covered) unique domains - meaning those domains are contained only in a single adlist.
β¦
[i] Since 01.02.2022 0:58:13 you have been visiting 2980 different domains.
You have 22 blacklist RegEx configured (20 enabled)
With your enabled blacklist RegEx you would have covered 1 different domains.
[i] Please note: the internal Pi-hole RegEx test used here only checks domains against enabled RegEx.
Therefor, currently disabled RegEx will always have 0 domains covered.
Which counts are relevant for the regex-mode?
All regex (number not shown) are checked against the 2980 distinct domains
@yubiuser , I changed my commit / PR:
This PR will disable regex-analysis when Pi-hole is running in a docker container and tells user about this.
So Regex analysis is still not supported on docker, but script isn't doing anything worthless/useless stuff anymore. Do you agree?
I also added a .stickler.yml π
The difficulty is that we have an unknown and possibly very big list of domains that we have to pass to the docker container for regex-checking. I really have no clue how to solve this efficientlyβ¦ π€·π»ββοΈ
I agree that we should probably disable regex-checking if Pi-hole is running on docker as long as we do not have a better solution.
Please remove the stickler stuff from this PR. I'm happy to include it, but don't mix unrelated changes in one PR
I couldn't find a .stickler.yml
file in this repository. I can make one for you, or you can create one by following the documentation.
Is this PR now ready for a merge? Or do you want a single commit rebased to commit 890f454c ?
I guess best would be a rebase on current development. Please also target development. Do not modify .github/workflows/shellcheck.yml
in this PR, draft a new one for this. One PR for one issue.
.github/workflows/shellcheck.yml
(and .stickler.yml
as well)$ act
[Shellcheck Lint/Shellcheck Lint] π Start image=ghcr.io/catthehacker/ubuntu:act-latest
[Shellcheck Lint/Shellcheck Lint] π³ docker pull image=ghcr.io/catthehacker/ubuntu:act-latest platform= username= forcePull=false
[Shellcheck Lint/Shellcheck Lint] π³ docker create image=ghcr.io/catthehacker/ubuntu:act-latest platform= entrypoint=["/usr/bin/tail" "-f" "/dev/null"] cmd=[]
[Shellcheck Lint/Shellcheck Lint] π³ docker run image=ghcr.io/catthehacker/ubuntu:act-latest platform= entrypoint=["/usr/bin/tail" "-f" "/dev/null"] cmd=[]
[Shellcheck Lint/Shellcheck Lint] π³ docker exec cmd=[mkdir -m 0777 -p /var/run/act] user=root workdir=
[Shellcheck Lint/Shellcheck Lint] π³ docker cp src=/home/thomas/temp/PRs/pihole_adlist_tool/. dst=/home/thomas/temp/PRs/pihole_adlist_tool
[Shellcheck Lint/Shellcheck Lint] π³ docker exec cmd=[mkdir -p /home/thomas/temp/PRs/pihole_adlist_tool] user= workdir=
[Shellcheck Lint/Shellcheck Lint] β Run actions/checkout@v2
[Shellcheck Lint/Shellcheck Lint] β
Success - actions/checkout@v2
[Shellcheck Lint/Shellcheck Lint] β Run Download Shellcheck
[Shellcheck Lint/Shellcheck Lint] π³ docker exec cmd=[bash --noprofile --norc -e -o pipefail /home/thomas/temp/PRs/pihole_adlist_tool/workflow/1] user= workdir=
| shellcheck-stable/LICENSE.txt
| shellcheck-stable/README.txt
| shellcheck-stable/shellcheck
[Shellcheck Lint/Shellcheck Lint] β
Success - Download Shellcheck
[Shellcheck Lint/Shellcheck Lint] β Run Check Shellcheck Version
[Shellcheck Lint/Shellcheck Lint] π³ docker exec cmd=[bash --noprofile --norc -e -o pipefail /home/thomas/temp/PRs/pihole_adlist_tool/workflow/2] user= workdir=
| ShellCheck - shell script analysis tool
| version: 0.8.0
| license: GNU General Public License, version 3
| website: https://www.shellcheck.net
[Shellcheck Lint/Shellcheck Lint] β
Success - Check Shellcheck Version
[Shellcheck Lint/Shellcheck Lint] β Run Run Shellcheck
[Shellcheck Lint/Shellcheck Lint] π³ docker exec cmd=[bash --noprofile --norc -e -o pipefail /home/thomas/temp/PRs/pihole_adlist_tool/workflow/3] user= workdir=
[Shellcheck Lint/Shellcheck Lint] β
Success - Run Shellcheck
-r
π
π¦π₯ β ~/temp/PRs/pihole_adlist_tool [fix_issue_34|β¦4]
23:05 $ ./pihole_adlist_tool-fix_issue_34 -d 2 -t 100 -r -a > new.txt
π¦π₯ β ~/temp/PRs/pihole_adlist_tool [fix_issue_34|β¦4]
23:06 $ ./pihole_adlist_tool-development -d 2 -t 100 -r -a > old.txt
./pihole_adlist_tool-development: line 991: pihole-FTL: command not found
β¦
./pihole_adlist_tool-development: line 991: pihole-FTL: command not found
^C./pihole_adlist_tool-development: line 991: pihole-FTL: command not found
π¦π₯ ^C
π¦π₯ β ~/temp/PRs/pihole_adlist_tool [fix_issue_34|β¦4]
23:06 $ !sd
β ~/temp/PRs/pihole_adlist_tool [fix_issue_34|β¦4]
23:06 $ sdiff -s old.txt new.txt
*** Pihole Adlist Tool 2.6.1 *** | *** Pihole Adlist Tool 2.6.X ***
[i] REGEX_MODE: Enabled | [i] REGEX_MODE: Disabled due to running Pi-hole in Docker
[i] Since 18.02.2022 23:06:35 893 different domains | [i] Since 18.02.2022 23:06:06 893 different domains
Those would have been the 100 top blocked adlist do | Those would have been the 100 top blocked adlist do
<
<
[i] Analysing RegEx ..... <
[i] This might take some time (minutes!) - please be patie <
<
<
<
<
[β] User-abort detected! <
π¦π₯ β ~/temp/PRs/pihole_adlist_tool [fix_issue_34|β¦4]
23:06 $
Whats's about the changes in line 859 and above?
What changes do you mean? π€·π»ββοΈ git diff development
shows me these changes:
Is there anything "wrong" or are you missing anything? If this is the case, it might be my fault while rebasing and resolving some merge conflictsβ¦
Something went wrong during rebase I guess. These changes were introduced in your other PR but later reverted.
I fixed the rebase now. git diff
now shows only the "do not RegEx when running in docker container" parts for issue #34 - do you want another squash of all commits to a single one before merging?
I'll add a final test later (in/after the lunch-break possibly)β¦
I just decided that it looks Ok ;-)
But will wait for your feedback before merging to master
Short coffee-break before next meeting π
π¦π₯ β ~/temp/PRs/pihole_adlist_tool [fix_issue_34|β¦2]
11:26 $ ./pihole_adlist_tool -d 2 -t 100 -r -a > new_with_r.txt
π¦π₯ β ~/temp/PRs/pihole_adlist_tool [fix_issue_34|β¦3]
11:30 $ ./pihole_adlist_tool -d 2 -t 100 -a > new_without_r.txt
sdiff -s new_with_r.txt new_without_r.txtπ¦π₯ β ~/temp/PRs/pihole_adlist_tool [fix_issue_34|β¦4]
11:30 $ sdiff -s new_with_r.txt new_without_r.txt
[i] REGEX_MODE: Disabled due to running Pi-hole in Docker | [i] REGEX_MODE: Disabled
[i] Since 26.02.2022 11:30:20 982 different domains | [i] Since 26.02.2022 11:30:45 982 different domains
[i] Using you current adlist configuration 998 domains| [i] Using you current adlist configuration 998 domains
Those would have been the 100 top blocked adlist do | Those would have been the 100 top blocked adlist do
stats.gc.fe.apple-dns.net 2225 | stats.gc.fe.apple-dns.net 2223
stats.gc-apple.com.akadns.net 727 | stats.gc-apple.com.akadns.net 725
63 1 407420 697 38098 11 | 63 1 407420 697 38094 11
64 1 99381 28 1533 16 | 64 1 99381 28 1531 16
π¦π₯ β ~/temp/PRs/pihole_adlist_tool [fix_issue_34|β¦4]
11:31 $
This PR will fix issue #34 by disabling regex-check when running Pi-hole in Docker Container π To be honest to the user, we will print this info in options and when running ππ»