Protocol-less URLs - Githubissues

wilkowy commented 7 years ago

It seems that RP does not support protocol-less URLs. Example:

<div style="background-image: url('//site.host.tld/image.png');"> or <img src="//site.host.tld/image.png">

Host blocked yet displayed.

FF: 48.0.2 RP: 1.0.beta12.4.1544.r5dc9768.pre

myrdd commented 7 years ago

Could you please check the request log, and maybe post a screenshot?

RP should get the full URI; it should work with protocol-relative uris.

wilkowy commented 7 years ago

Website: http://www.onet.pl

Rules (there are also others for those domains, but not relevant): Allow: .onet.pl -> .onet.pl Allow: .onet.pl -> .ocdn.eu Block: csr.onet.pl Block: mastt.ocdn.eu

Passed requests (allowed): https://s29.postimg.org/u0ap23mvr/Clipboard_1.jpg

Policy: block by default (w/o subdomains - unchecked; no subscriptions)

Hmm, what is the order? allow, block or otherwise?

myrdd commented 7 years ago

In your case you've got conflicting rules for these requests:

.onet.pl -> csr.onet.pl .onet.pl -> mastt.ocdn.eu

RP tries to guess which rule the user wants to be in effect, in case of such a conflict. RP gives "origin-to-dest" rules precedence; rules with origin or dest only will come after them. [1] You probably want rule priorities [2]; best if you upvote that issue.

To currently work around this problem, add these rules:

Block: -> csr.onet.pl Block: -> mastt.ocdn.eu

The asterisk will make the rules to "origin-to-dest" rules. This will make RP fall back to the default policy.

[1] https://github.com/RequestPolicyContinued/requestpolicy/commit/220a897b333a4b7b4c5deaeee64cba3ec2c6ea2a [2] https://github.com/RequestPolicyContinued/requestpolicy/issues/717

Hmm, what is the order? allow, block or otherwise?

So the order is:

1) "origin-to-dest" rules 2) origin-only and dest-only rules

If in one of those groups there are still conflicting rules, RP will fall back to the default policy.

myrdd commented 7 years ago

You probably want rule priorities

… or, even better, the blacklist. See my comment there: https://github.com/RequestPolicyContinued/requestpolicy/issues/443#issuecomment-268204543

wilkowy commented 7 years ago

Indeed, now it does seem to work. Didn't know about this policy change (still assumed empty host to be „*”) - must have skipped this change in the update's changelog. So now I have to update all (few) subdomain rules… ;) Yes, the top priority blacklist might be a good idea in this situation.

wilkowy commented 7 years ago

Anyway, what was the point of such change? Block is block, it shouldn't be overridden by allow. When blocking separate domains from the icon interface the added rules are currently always without asterisk (simple block „sub.host.tld”, not „* -> sub.host.tld”).

myrdd commented 7 years ago

(still assumed empty host to be „*”)

Without rule conflicts, this is true. However, “host-not-specified” indeed should be equal to “*”. So the workaround I mentioned above isn't guaranteed to work in future.

what was the point of such change?

Consider the following case. It's from https://github.com/RequestPolicyContinued/requestpolicy/issues/491, and it's the reason why this "policy change" has been introduced:

Block *.ajax.googleapis.com Allow *.mysite.com -> *.ajax.googleapis.com

In this case, the allow-rule should take precedence. The main question is: What should happen on a rule conflict? How and when should a conflict be resolved? In your case, the conflicting rules are:

Allow *.onet.pl -> *.onet.pl Block csr.onet.pl

The difference to the "googleapis" example is: The "block" rule is a full host. Or: The destinations of the two rules do not match. I can imagine two possible solutions to this issue: (a) Do not resolve this conflict, so fall back to the default policy; or (b) resolve the conflict to "block" because csr.onet.pl is more specific than *.onet.pl. – I'd prefer solution "a". It's easy to implement and easy to understand (IMO). I think we should keep the policy system simple to avoid confusion. What do you think?

wilkowy commented 7 years ago

OK, I see the point. Indeed this system clearly lacks a third state. I think the "block by default" global policy should be split onto "block,allow + blacklist" and "allow,block + whitelist" (only one mode active at a time) and then rules should have an "important" checkbox marking them as belonging to either of the lists. Of course to avoid confusion while in blacklist - checkbox for allow rules should be greyed/disabled and the same for whitelist with block rules. Unfortunately this involves more complications with easy-to-use maintaining the rules… Or if devs prefer just stay with current solution which IMHO is ok now, but add an additional checkbox for block rules to mark them important (blacklist) or make a clear statement that block rules with asterisk origin are blacklisted (making blacklist is easier than whitelist, because allow rules almost always have an origin field filled so there would be a need for a separate checkbox).

myrdd commented 7 years ago

Wait, you can have both blacklist and whitelist, independently of your default policy. To be accurate, the policy is defined by: default policy; subscriptions; temporary user rules; "normal" user rules (allow,deny); user rules with priority, including blacklist/whitelist (priority=max).

What we could do though is providing a "fallback policy" setting: “Which policy—allow or deny—should apply on conflicting rules?” Is this what you want? I think most people want fallback to the default policy.

Still, I like your idea of the "important" keyword. Aliasing “rule with maximum priority” to “important rule” will be sufficient for most use cases and users. The fact that blacklist/whitelist is implemented using priority=max is an implementation detail; the "priority" can be shown in an "advanced rules view", and hidden by default.

wilkowy commented 7 years ago

No, no, through white/black list I perceive those important rules. The difference between "block,allow" vs. "allow,block" I understand like order of applying rules in global "block by default" policy:

block,allow = (current behavior, post 220a897)
1. block if blocked by rule having an origin
2. allow if found conflicting wildcard allow rule with block rule
3. block if blocked (match found)
4. allow if allowed (match found)
5. default policy
allow,block = (pre 220a897)
1. block if blocked (match found)
2. allow if allowed (match found)
3. default policy

Adding "important" marker would add a top level check. Blacklist is actually already implemented as an asterisk-origin block rule.

Is this what you want? I think most people want fallback to the default policy.

Well, I'm not implying that any change is required. I was just confused that connections started to leak, because I missed the conflict resolving update. I have only few such blocking rules for "safe" domains which block some tracking/annoyances inside the "safe zone". It's like ".twitter.com -> .twitter.com" is ok, but I prefer to block explicitly "analytics.twitter.com" (or some "pixel." trackers) which now just started slipping through and RP have not informed about it in any way since the change. Failing back to default (or preferably chosen) policy is a good idea as a global decision, as well as a more grainy b/w list.

RequestPolicyContinued / requestpolicy

Protocol-less URLs #824