Closed funilrys closed 3 years ago
Mhh, it's true. It was confusing to introduce the --aggressive argument. But at the time it was first introduced, I wasn't even sure that the thing I engineered would actually be used.
My objective was to be as accurate as possible and to reduce false positives when people use the output of PyFunceble directly in their workflow ...
The AdBlock decoder itself is self-engineered. So the more input I get, the better it will be. I have literally no way to imagine everything.
I'm currently working on v4.0.0bN, and there is, in my opinion and after analysis, no real reason to split everything anymore. This tool should decode as much as possible.
Therefore, I'm willing to change the direction: what about PyFunceble trying to decode as much as possible, if not everything?
Inputs from users are highly welcome because I'm not actively writing blocklists.
@keczuppp thank you for your table which I will use for the tests.
Is this new direction fair enough (for everyone)?
cc @kulfoon @spirillen @dnmTX @
Moved my answer to https://github.com/funilrys/PyFunceble/issues/227, as it's off-topic to the OP's post, and I hope there will be more activity in replies to this topic.
Therefore, I'm willing to change the direction:
(my reply is also a reply to https://github.com/funilrys/PyFunceble/issues/227#issuecomment-797239112 at the same time):
Please take my commit and the underlying tests as the response. Is it still too much @keczuppp ?
Let's discuss the future of that specific decoder. I'll inject any future report about missing decoding into the tests. So the more reports, the better that decoder will be π
As I wrote, I'm not one of those who write a filter list... So help or directions are welcome!
Hello, I was already trying to test the new version of Adblock Decoder (4.0.0b35) but:
Finished processing dependencies for PyFunceble-dev==4.0.0b35
D:\download_big_temp\_koding\PyFunceble-dev>pyfunceble
Traceback (most recent call last):
  File "D:\download_big_temp\_koding\Python37\Scripts\pyfunceble-script.py", line 33, in <module>
    sys.exit(load_entry_point('PyFunceble-dev==4.0.0b35', 'console_scripts', 'pyfunceble')())
  File "D:\download_big_temp\_koding\Python37\lib\site-packages\pyfunceble_dev-4.0.0b35-py3.7.egg\PyFunceble\cli\entry_points\pyfunceble\cli.py", line 1022, in tool
  File "D:\download_big_temp\_koding\Python37\lib\site-packages\pyfunceble_dev-4.0.0b35-py3.7.egg\PyFunceble\config\loader.py", line 370, in start
  File "D:\download_big_temp\_koding\Python37\lib\site-packages\pyfunceble_dev-4.0.0b35-py3.7.egg\PyFunceble\config\loader.py", line 331, in get_config_file_content
  File "D:\download_big_temp\_koding\Python37\lib\site-packages\pyfunceble_dev-4.0.0b35-py3.7.egg\PyFunceble\helpers\dict.py", line 290, in from_yaml_file
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\user\\AppData\\Local\\Temp\\tmp_oox9mr2'
@keczuppp Thanks for the notice. I'll update the AdBlock decoder project as soon as possible.
The simple way is the pyfunceble --syntax --adblock --aggressive -f [file] arguments. π
Note to self: Cleanup documentation.
So I've tried the newest version, v4.0.0b36, and:
the previous error (FileNotFoundError: [Errno 2] No such file or directory:) no longer appears, so at least I was able to run the whole PyFunceble;
but when I run the Adblock Decoder with the command you provided in your comment on a few lists (EasyList and the two main Polish ad filter lists), it breaks and throws other errors:
Errors 1 log
spoiler - it analysed nothing, crashed at the beginning
Errors 2 log
spoiler - it analysed some domains and then crashed
Errors 3 log
spoiler - it analysed nothing, crashed at the beginning
@keczuppp, b37 is available and it should fix the error you reported.
Thanks again for testing !
@keczuppp, the adblock-decoder is also upgraded to use the 4.0.0bX of PyFunceble.
yep, good work:
more tests later
And don't laugh at me, fvcktard.
D:\download_big_temp\_koding>adblock2plain --aggressive -o output2.txt easylistpolish.txt
Traceback (most recent call last):
  File "d:\download_big_temp\_koding\python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\download_big_temp\_koding\python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\download_big_temp\_koding\Python37\Scripts\adblock2plain.exe\__main__.py", line 7, in <module>
  File "d:\download_big_temp\_koding\python37\lib\site-packages\adblock_decoder\cli.py", line 104, in adblock2plain
    args.input_file, args.aggressive, output=args.output
  File "d:\download_big_temp\_koding\python37\lib\site-packages\adblock_decoder\core\adblock2plain.py", line 80, in process_conversion
    for line in self.input:
  File "d:\download_big_temp\_koding\python37\lib\encodings\cp1250.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 1350: character maps to <undefined>
D:\download_big_temp\_koding>adblock2plain --aggressive -o output2.txt easylist.txt
Traceback (most recent call last):
  File "d:\download_big_temp\_koding\python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\download_big_temp\_koding\python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\download_big_temp\_koding\Python37\Scripts\adblock2plain.exe\__main__.py", line 7, in <module>
  File "d:\download_big_temp\_koding\python37\lib\site-packages\adblock_decoder\cli.py", line 104, in adblock2plain
    args.input_file, args.aggressive, output=args.output
  File "d:\download_big_temp\_koding\python37\lib\site-packages\adblock_decoder\core\adblock2plain.py", line 80, in process_conversion
    for line in self.input:
  File "d:\download_big_temp\_koding\python37\lib\encodings\cp1250.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x83 in position 5977: character maps to <undefined>
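Both tracebacks above end in encodings\cp1250.py, which suggests the input file is being decoded with the Windows locale default (cp1250) instead of UTF-8. Below is a minimal sketch of the usual remedy, assuming the filter lists are UTF-8 encoded. `iter_filter_lines` is a hypothetical helper written for illustration; it is not part of adblock-decoder or PyFunceble.

```python
from pathlib import Path
from typing import Iterator


def iter_filter_lines(path: str, encoding: str = "utf-8") -> Iterator[str]:
    """Yield the lines of a filter list, decoded with an explicit encoding.

    Hypothetical helper: passing ``encoding`` explicitly avoids falling back
    to the platform default (cp1250 on the machine in the tracebacks).
    """
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising,
    # so a single stray byte does not abort the whole conversion.
    with Path(path).open("r", encoding=encoding, errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")
```

Whether replacing undecodable bytes is acceptable depends on the use case; dropping `errors="replace"` makes the failure loud again, but with the correct codec it should only trigger on genuinely corrupt input.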
And don't laugh at me, fvcktard.
Who is laughing at you?
We are all here for some constructive work, enhancement, and discussion in our free time. I personally take any input I can get regarding the decoder. I don't have time to laugh at someone when they are giving constructive inputs.
If it is because of my emoji, sorry if it offended you. It wasn't meant to harm.
Your last 3 cases are now in the source code that's going to be deployed next.
I'm going to look into the issue of the standalone decoder later.
@keczuppp Please update and test the adblock-decoder
funilrys : I'm going to look into the issue of the standalone decoder later. @keczuppp Please update and test the adblock-decoder
1.2.0
funilrys : Your last 3 cases are now in the source code that's going to be deployed next.
Adblock Decoder 1.2.0
but in the embedded Adblock Decoder PyFunceble 4.0.0b39 it's fixed at 66%, because site21.com is still not being extracted.
Also:
--clean could be added.
==================================================
As for the domain= filters (for ex. domain=page1.com|page2.com):
domain= failures: the decoder should extract all domains regardless of what is on the left or right side of domain=; this includes @@ filters (because in --aggressive mode we should extract all domains).
Currently the decoder extracts about a half of domains; to prove it, I copy-pasted all domain= lines from several popular / big adblock filter lists + two Polish lists: EasyList, EasyPrivacy, AdGuard Base, AdGuard Tracking Protection, Official Polish Filters for AdBlock, uBlock Origin & AdGuard, EasyList Polish, and put them into a single file / list (see domain=.zip), which contains about 11425 domains, but Adblock Decoder extracts only about a half (5426 domains).
&Type=Event.CPT&
-300x250.
-Background-1280x10241.
-spotify-com.akamaized.net
-tag.js
.
.cdn.digitaloceanspaces.com
.cdnjquery.com
.ch
.com
.criteo.com
.criteo.net
.digitaloceanspaces.com
.engageya.com
.filma24.
.gif
.html
.html|
.imagetwist.com
.impact-ad.jp
.jpg
.js
.js|
.m3u8
.min.js
.mp3
.mp4
.mp4.kakaoad.
.mp4|
.netdna-ssl.com
.php
.pl
.pornhub.com
.r.msn.com
.roofandfloor.com
.smithsonian.museum
.ssl-images-amazon.com
.ts
.xml
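To make the domain= request above concrete, here is a rough sketch of pulling every domain out of a rule's domain= option, whatever sits to its left or right, including on @@ exception rules. It is illustrative only: `extract_domain_option` and its `include_negated` parameter are hypothetical names, and this is not how PyFunceble or adblock-decoder actually implement it. It also ignores edge cases such as regex rules that contain a literal `$`.

```python
from typing import List


def extract_domain_option(rule: str, include_negated: bool = True) -> List[str]:
    """Return the domains listed in the ``domain=`` option of one filter rule.

    Illustrative only; not the logic used by PyFunceble or adblock-decoder.
    """
    # Filter options follow the last "$" of a network rule.
    _, separator, options = rule.rpartition("$")
    if not separator:
        return []

    domains: List[str] = []
    for option in options.split(","):
        key, _, value = option.partition("=")
        if key.strip() != "domain":
            continue
        # Multiple domains are separated by "|"; "~" marks a negated entry.
        for entry in value.split("|"):
            entry = entry.strip()
            if entry.startswith("~") and not include_negated:
                continue
            entry = entry.lstrip("~")
            if entry:
                domains.append(entry)
    return domains
```

For example, `extract_domain_option("||ads.example^$script,domain=page1.com|page2.com")` yields both `page1.com` and `page2.com`. Keeping negated (`~`) entries by default mirrors the "extract all domains" expectation stated for --aggressive mode; a stricter mode could pass `include_negated=False`.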
Hey @keczuppp
.. the fact it should be put in the conjunction with other parameters) in the (big) documentation, which might be not so obvious.
What are you missing in the docs? Could you elaborate on it in an issue at https://github.com/spirillen/PyFunceble/issues (the repo I make the docs from)?
Was it more funny to you than your friend being unable to view the history of my comment https://github.com/funilrys/PyFunceble/issues/13#issuecomment-797071985 ? Why didn't you laugh at him the same way as at me? Oh, because he is your friend, so you can't laugh at your friends, just like you did at me
I'm sorry you have taken it this way; it was not meant in any evil way. Trust me, we do laugh at each other's "Error 40" moments, we just happen to mostly do this on Keybase :smirk:. Besides this, I can promise you that @funilrys never laughs (evilly) at anyone; it is always with the best intention, from a good heart.
From re-reading https://github.com/funilrys/PyFunceble/issues/13#issuecomment-802307597 I can assure you it was a happy laugh because something went the right way, and you promised to do more tests.
Hi,
spirillen : What are you missing in the docs?
Nothing, really. I now think I just wasn't familiar enough with PyFunceble, so I overlooked it.
...maybe you should be careful where you put emotes.
Best regards.
Well, from my own point of view, I can only say: I feel some emojis are missing for the "fast feedback", meaning some have to cover for others.
but in the embedded Adblock Decoder PyFunceble 4.0.0b39 it's fixed at 66% because site21.com is still not being extracted
I will look at it when I find time to get back to that module.
Yes, it was about "simple way" + the emoji, your comment looks like you wanted to show how stupid I am just because I missed something which you describe as "simple" (a parameter + the fact it should be put in the conjunction with other parameters) in the (big) documentation, which might be not so obvious.
Emojis are not necessarily meant to harm. My emoji was really not meant to harm, nor was it used in an evil manner.
The usage of "The simple way" was just meant to introduce something that is not supposed to be hard, something that is supposed to be easy to use and to work as easily as possible. I do not expect everyone to know everything. Even I have to look into my own source code to find something or provide a better answer to a question.
It was not meant to "humiliate", "shame" or "offend" you. If you understood it that way, I'm sorry; it was not my intention.
You have to understand that most of the time, the solutions to the problems around here are extremely hard and need a bit of thinking or hacking from me. So I'm extremely happy with myself when I can present, find, or have a simple solution to a problem. I was just happy to have a simple solution. I'm sorry that it was misunderstood.
Was it more funny to you than your friend being unable to view the history of my comment https://github.com/funilrys/PyFunceble/issues/13#issuecomment-797071985 ? Why didn't you laugh at him the same way as at me? Oh, because he is your friend, so you can't laugh at your friends, just like you did at me.
I don't laugh at others. Not in that (discreet) way. I'm not like that and it's not my intention. And even when I do, it's not in a harmful way.
About the lack of reaction to other answers: my time is limited, and I don't have time to answer everything. In fact, at the time I'm writing this, I still have around 200 GitHub notifications (email excluded) to read, answer, sort, and/or put into my open source backlog. So most of the time, I try to focus on information that tells me more about what needs to be done or what is actually being asked (technically). Side discussions in a feed that are not relevant (at a time X) to my goal or the goal of the issue are not always my priority. In fact, you may have seen me jumping around multiple comments in the past, because what was said became relevant after implementation or a few weeks of changes.
I could laugh (by putting a laugh emoji) at him just like you at me; furthermore, I could laugh at you (by putting a laugh emoji) every time the Adblock Decoder or PyFunceble crashes, making you look like a fool (but you are a good developer, and bugs are a normal thing in programming, unavoidable by a human being).
Don't take it too personally, but I'm doing this in my free time. So a good laugh after a long day is not always bad. And sometimes a laughing emoji can be the beginning of a good joke or friendship.
About crashes: PyFunceble 4.0.0 is in beta for a good reason, and you are invited to submit all the fatal errors that PyFunceble produces. PyFunceble does not do only one thing, and most of the time its behavior depends on the input dataset. As I can't test every imaginable dataset, especially when writing a decoder, I'm somehow "bound" to error reports. That's what you indirectly did, and that's what led to a significant change in the source code.
In fact, there was so much feedback on the 4.0.0 version that it is the most tested version of PyFunceble ever, which probably makes it one of the least error-prone versions. Nothing is perfect, but there is hope that this new version brings fewer errors and more stability.
Really? Then why didn't you explain what the purpose of the emoji in your comment was?
Why should someone lose time to explain all single nontechnical decisions when it's not necessary? I didn't judge it necessary. Now I still took the time to explain...
But the fact is: I do have private, professional, family, and public (through here) lives. So each minute I lose explaining an emoji or the choice of an X or Y word in a sentence is time I could use to answer more technical questions or simply help move things forward.
An emoji shouldn't be given that much time and energy. It happened; I understand it offended you or made you uncomfortable. I can't promise that you won't be offended again, as I can't speak for others who use this platform. But be sure that your message and feelings came through "loud and clear".
Do you want to just tell me you put the emoji for no reason, or just because you were in a happy mood and just by accident it looked like you were laughing at me...
I wasn't laughing at you nor at the situation. If I would, I would have used:
Or is it not appropriate enough?
I don't believe your cheap explanation; lie to yourself. I spent much time analysing whether your intention was to laugh at me or not, and something told me very clearly that you were.
Believe it or not, when I say such a thing, I mean it. My intention was and is not to laugh at you.
abused by trolls, they abuse emojis to troll other people at every occasion
Why would I literally troll others on "every occasion" (or not) on GitHub when I have other things to do? I just want to move forward, help, and code if I get the chance and the time for it. It's all in my free time. Trolls do "their thing" all day long, probably in their free time too, but I believe that they are not as busy as some of us.
I consider you a positive person and great developer overall, but just don't do it again.
Thank you for the compliment. I'm indeed a positive person. I'm not looking to harm anyone. I was harmed enough in my life to know. I didn't think that that sentence with an emoji would harm anyone. People who know me know that I'm not a troll or someone who constantly "humiliates", "shames", and/or "offends" someone because of a lack of knowledge. I know that not everyone has the same knowledge. That's why I'm always happy to provide some of my expertise, knowledge, and help across multiple projects.
By the way, the word "fvcktard" was not necessary. I'm polite enough, and even if an emoji offended you, it's not a reason to use such a word. That's something that offended me but I chose to ignore it at the time I read it. Please avoid such language in the future.
Sorry for the misunderstanding. Cheers.
Please, can we park that emoji, agreeing that you do not agree on the explanation and usage of it in this specific situation?
And let's stick to the Code of conduct and get back to the actual topic, the error produced.
I hope so.
funilrys, can we get some cleaning in this thread? Could you put your OFF-TOPIC https://github.com/funilrys/PyFunceble/issues/13#issuecomment-806238322 in a spoiler, just like I did with my OFF-TOPICs? Thx
OK, so I've just tested the newest PyFunceble dev right now and I've noticed that the reported issues mentioned in : https://github.com/funilrys/PyFunceble/issues/13#issuecomment-749607193 and https://github.com/funilrys/PyFunceble/issues/13#issuecomment-803537932 have been fixed.
The summary:
keczuppp: As for the last 3 failures, many of such failures can be found in https://easylist-downloads.adblockplus.org/easylistpolish.txt The list contains about 2961 domains, but only 2459 are found by Adblock Decoder (with --aggressive option), which gives 83% efficiency.
keczuppp: currently the decoder extracts about a half of domains, to prove it I copy-pasted all domain= lines from several popular / big adblock filter lists + two polish lists: EasyList, EasyPrivacy, AdGuard Base, AdGuard Tracking Protection, Official Polish Filters for AdBlock, uBlock Origin & AdGuard, EasyList Polish and put into a single file / list (see domain=.zip), which contain about 11425 domains, but Adblock Decoder extracts only about a half (5426 domains)
Good improvement.
As reported by @dnmTX at https://github.com/Ultimate-Hosts-Blacklist/dev-center/issues/9:
are ignored.