easylist / EasyListHebrew

EasyList Hebrew is a subscription that removes adverts from Hebrew webpages

Check before creating Checksum #419

Closed: tzagim closed this 1 month ago

tzagim commented 1 month ago

@Hebrew-uBO-User

The current method checks the entire file, including the checksum line itself, so every check reports a wrong result. The new method is designed to solve this.

Changes:

  1. Corrected the test method: it now tests only from the line "! Adservers !" downward.

  2. The script checks whether the existing checksum is correct before creating a new one.

Fix for https://github.com/easylist/EasyListHebrew/pull/417.

PS. You cannot keep using the old files on your local computer, because the checksum will not match and a new checksum will be generated. Download the new versions of validateChecksum.py and addChecksum.py.

Hebrew-uBO-User commented 1 month ago

@tzagim Thank you for your efforts!

It should check the entire file (excluding the previous checksum). We use it for Adblock Plus (ABP) users: ABP calculates the checksum locally in the browser and compares it to the checksum we add. However, since we recommend that all our users use uBlock Origin (uBO) as their one and only adblocker (uBO ignores the checksum), and since ABP does not require it, we can remove it from the code.

We use these scripts mainly to update the timestamp and to sort the list. validateChecksum.py and addChecksum.py work completely fine locally. The problem is not related to the date either. You can play with create.bat/create.sh to verify this.

When you edit the list on the website, I think it changes the end-of-file character. See, for example: https://github.com/easylist/EasyListHebrew/commit/1d9b6d6e66bf756372ac8e2a7eb1877f8f1904df?diff=split&w=0. Maybe that's the reason. I always update the list by uploading files.
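To make "the entire file, excluding the previous checksum" concrete, here is a minimal Python sketch of the conventional ABP-style checksum as I understand it: strip carriage returns, collapse blank lines, drop the checksum line, then take the MD5 of the UTF-8 text and base64-encode it without padding. The function names are illustrative and are not taken from this repository's validateChecksum.py or addChecksum.py, which may differ in detail.

```python
import base64
import hashlib
import re

# Matches a "! Checksum: ..." comment line, including its trailing newline.
CHECKSUM_LINE = re.compile(r"^\s*!\s*checksum[\s\-:]+([\w+/=]+).*\n", re.I | re.M)


def normalize(text: str) -> str:
    """Drop carriage returns, collapse blank lines, remove the checksum line."""
    text = text.replace("\r", "")
    text = re.sub(r"\n+", "\n", text)
    return CHECKSUM_LINE.sub("", text)


def calculate_checksum(text: str) -> str:
    """MD5 of the normalized UTF-8 text, base64-encoded without '=' padding."""
    digest = hashlib.md5(normalize(text).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


def is_valid(text: str) -> bool:
    """True if the checksum stored in the list matches the computed one."""
    match = CHECKSUM_LINE.search(text)
    return match is not None and match.group(1) == calculate_checksum(text)
```

Under this scheme CRLF versus LF line endings do not affect the result, but a missing or added newline at the very end of the file does, which would be consistent with the end-of-file suspicion above.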

Hebrew-uBO-User commented 1 month ago

@tzagim

I made run.bat

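@echo off
rem Run from the directory that contains this script (drive included).
cd /d "%~dp0"
python .github\scripts\validateChecksum.py EasyListHebrew.txt
rem A non-zero exit code means the checksum is wrong: regenerate the list.
IF ERRORLEVEL 1 GOTO errorHandling
exit /b 0
:errorHandling
call create.bat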

and run.sh

#!/bin/bash
tmpList=$(mktemp)
# Regenerate the list only if validation fails (non-zero exit status).
if ! python3 .github/scripts/validateChecksum.py EasyListHebrew.txt; then
    source create.sh
fi

These scripts should update EasyListHebrew.txt only if the checksum is wrong.
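The same "validate first, regenerate only on failure" logic can be sketched once in Python so that it behaves identically on Windows and Linux. This is only an illustration: `regenerate_if_invalid` is a hypothetical helper, and it assumes, as the two scripts above do, that validateChecksum.py exits with a non-zero status when the checksum is wrong.

```python
import subprocess


def regenerate_if_invalid(validate_cmd, create_cmd):
    """Run create_cmd only when validate_cmd exits with a non-zero status.

    Returns True if the list was regenerated, False if it was left alone.
    """
    if subprocess.run(validate_cmd).returncode == 0:
        return False
    # check=True raises CalledProcessError if regeneration itself fails.
    subprocess.run(create_cmd, check=True)
    return True
```

A call might look like `regenerate_if_invalid([sys.executable, ".github/scripts/validateChecksum.py", "EasyListHebrew.txt"], ["bash", "create.sh"])`. Note that this runs create.sh as a separate process instead of sourcing it, so shell variables set in run.sh would not be visible to it.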

validateChecksum.py and addChecksum.py work fine locally. But when I download EasyListHebrew.txt after a commit by the "action-user" bot, it gives me the wrong checksum; if I download an older version of the file that I committed myself, the checksum is correct.

So I went to https://easylist.to/pages/other-supplementary-filter-lists-and-easylist-variants.html and downloaded some regional filter lists that include a checksum. Almost every checksum is wrong on my machine. The only correct one is that of the Bulgarian list...

It happens even if I copy and paste the content into a new text file or change the end-of-line format (Unix/Windows/Mac) in Notepad++.

I currently use Windows 10 and Python 3.12.7.
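One way to narrow this down is to compare the files at the byte level, since editors and web interfaces can silently change details that a hash is sensitive to. The sketch below is a hypothetical diagnostic helper, not part of the repository's scripts; it reports a UTF-8 byte-order mark, the mix of line endings, and whether the file ends with a newline.

```python
import codecs


def describe_bytes(data: bytes) -> dict:
    """Summarize byte-level details that a text editor may silently change."""
    crlf = data.count(b"\r\n")
    return {
        "utf8_bom": data.startswith(codecs.BOM_UTF8),
        "crlf": crlf,                           # Windows line endings
        "lone_lf": data.count(b"\n") - crlf,    # Unix line endings
        "lone_cr": data.count(b"\r") - crlf,    # classic Mac line endings
        "ends_with_newline": data.endswith((b"\n", b"\r")),
    }
```

Running it on `Path("EasyListHebrew.txt").read_bytes()` for a version that validates and for one that does not should show whether the two differ in any of these respects. It does not by itself prove which difference causes the mismatch.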

tzagim commented 1 month ago

@Hebrew-uBO-User

According to what is written here, it checks the whole file without the specific line of "checksum".

It should work fine now.

It seems to me that we are investing too much time in supporting an application when, as you mentioned, it is better to use a competitor that does not require this checksum.

Hebrew-uBO-User commented 1 month ago

@tzagim Manually merged. Thank you.

Hebrew-uBO-User commented 3 weeks ago

@tzagim

  1. Now that you have fixed the checksum problem, can you adjust the bot's code so that it does not make a new commit if the checksum is correct?

  2. Sometimes we forget to sync the hosts files (hosts.txt and adguard_hosts.txt). Can you make an additional bot that syncs the hosts files (using the genhosts.py script we already have)? It should run after the checksum bot, even if the checksum is correct. It should not make a new commit if the hosts files are up to date.
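Both requests come down to the same idea: touch a file only when its content would actually change, so that the bot has nothing to commit otherwise. A minimal sketch, assuming the regenerated content is available in memory as bytes; `write_if_changed` is a hypothetical helper and is not part of genhosts.py or the existing bot.

```python
from pathlib import Path


def write_if_changed(path, new_content: bytes) -> bool:
    """Write new_content to path only if it differs from what is on disk.

    Returns True when the file was (re)written and False when it was
    already up to date, so the caller can skip the commit step.
    """
    target = Path(path)
    if target.exists() and target.read_bytes() == new_content:
        return False
    target.write_bytes(new_content)
    return True
```

For the main list this comparison only helps if the checksum is validated before the timestamp is refreshed, because a new timestamp always changes the content. For hosts.txt and adguard_hosts.txt it should work as long as the generator's output is deterministic for an unchanged list, which I have not verified for genhosts.py.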