Closed 0xD4 closed 10 months ago
It seems they block the user-agent.
To bypass this block you might use this rule.
- domain: tagesspiegel.de
  headers:
    user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
    content-security-policy: "script-src 'self';"
But it seems they have no backdoor for search engines.
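For anyone wondering what a rule like this amounts to, here is a rough Python sketch, not ladder's actual code. It assumes the user-agent entry is applied to the outgoing request and the content-security-policy entry overwrites the header on the response sent back to the browser. The names RULE, build_request and override_response_headers are made up for illustration.

```python
from urllib.request import Request

# The rule from the comment above, written as a plain dict (illustrative only).
RULE = {
    "domain": "tagesspiegel.de",
    "headers": {
        "user-agent": (
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
            "AppleWebKit/537.36 (KHTML, like Gecko) "
            "Chrome/119.0.0.0 Safari/537.36"
        ),
        "content-security-policy": "script-src 'self';",
    },
}


def build_request(url, rule):
    """Build (but do not send) a request carrying the rule's user agent."""
    req = Request(url)
    req.add_header("User-Agent", rule["headers"]["user-agent"])
    return req


def override_response_headers(headers, rule):
    """Return a copy of the response headers with the CSP replaced.

    Assumption: the proxy overwrites the upstream CSP rather than merging it.
    """
    csp = rule["headers"].get("content-security-policy")
    if csp is None:
        return dict(headers)
    # Drop any existing CSP header regardless of its capitalisation.
    out = {k: v for k, v in headers.items()
           if k.lower() != "content-security-policy"}
    out["Content-Security-Policy"] = csp
    return out
```

Nothing here touches the network, so it only shows which headers would change, not whether the site accepts them.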
@mms-gianni
Could you support a ruleset.yaml feature to block and modify URL paths?
I think this would make contributing easier for the layperson, as figuring out a bypass can be done with just the browser devtools without writing code.
For example, to bypass tagesspiegel.de, you would append ?amp=1 to the end of the URL to get the Google AMP link, then block the following requests:
https://cdn.privacy-mgmt.com/wrapperMessagingWithoutDetection.js
https://widgets.opinary.com/a/tagesspiegel.js
I think this would be easier than writing JavaScript to wait for the blocker modal and then remove it, when this is a feasible bypass method.
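To make the proposal concrete, here is a minimal Python sketch of the two steps (modify the URL, then block requests). It is not ladder code and not a proposed ruleset syntax. The helper names with_query and is_blocked, and the article path in the usage note, are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# The two requests named above that should never be fetched.
BLOCKED = {
    "https://cdn.privacy-mgmt.com/wrapperMessagingWithoutDetection.js",
    "https://widgets.opinary.com/a/tagesspiegel.js",
}


def with_query(url, key, value):
    """Return url with key=value set in its query string."""
    parts = urlsplit(url)
    # Keep existing parameters, but replace any earlier value for this key.
    query = [(k, v)
             for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != key]
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


def is_blocked(url):
    """True if url, ignoring query string and fragment, is on the block list."""
    parts = urlsplit(url)
    bare = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return bare in BLOCKED
```

With a made-up article path, with_query("https://www.tagesspiegel.de/politik/beispiel.html", "amp", "1") gives "https://www.tagesspiegel.de/politik/beispiel.html?amp=1". Matching on the URL without its query string is a design choice of this sketch, so cache-busting parameters do not defeat the block.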
Sounds good. Or an AMP flag, similar to the googleCache flag.
Seems like the way to request the AMP version of a site is not standard. Some sites do it via a special path like /amp/, others via a URL query like ?amp=1, and others still by a subdomain like amp.example.com.
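A small Python sketch of the three conventions, just to show why a single AMP flag would have to guess. The function name amp_candidates is made up, and it assumes the /amp/ segment goes at the end of the path and that a leading www. is dropped for the subdomain form. Real sites may differ on both points.

```python
from urllib.parse import urlsplit, urlunsplit


def amp_candidates(url):
    """Return the three AMP-style variants of url: path, query, subdomain."""
    parts = urlsplit(url)

    # 1. Special path: append /amp/ (assumed to go at the end of the path).
    by_path = parts._replace(path=parts.path.rstrip("/") + "/amp/")

    # 2. URL query: add amp=1 after any existing parameters.
    sep = "&" if parts.query else ""
    by_query = parts._replace(query=parts.query + sep + "amp=1")

    # 3. Subdomain: swap a leading "www." for "amp." (assumption).
    host = parts.netloc
    if host.startswith("www."):
        host = host[len("www."):]
    by_subdomain = parts._replace(netloc="amp." + host)

    return [urlunsplit(p) for p in (by_path, by_query, by_subdomain)]
```

For "https://www.example.com/news/story" this returns "https://www.example.com/news/story/amp/", "https://www.example.com/news/story?amp=1" and "https://amp.example.com/news/story". Which of the three a given site actually serves still has to be checked by hand.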
The googleCache flag abstraction is nice, but I think it hides the implementation in a way that makes it harder for a new contributor to modify the ruleset by understanding other people's rules. It's easier to learn by example if you can see exactly what is going on. Without a ruleset contribution guide, it's monkey-see, monkey-do.
Anyway, I've submitted a PR that should allow you to modify the URL.
Cloudflare is blocking again. When I change the user agent, I get the paywall. With the default settings, I get blocked.
ladder: 0.0.17, Test-URL: tagesspiegel.de
Could someone explain how the problem can be solved with the ruleset? Then I can try to find a solution myself.