gorhill / uMatrix

uMatrix: Point and click matrix to filter net requests according to source, destination and type
GNU General Public License v3.0

Unwanted domain(s) listed! #221

Closed (Alpengreis closed 8 years ago)

Alpengreis commented 9 years ago

Hello gorhill/all

Sometimes, when a URL (site) is NOT loaded from a blank tab, unwanted domains are listed in uMatrix (and in uBlock, by the way).

Steps to reproduce:

  1. Load a site that generates a lot of traffic across many domains, such as a big Google Plus page.
  2. Load another site that has no relation to google.com, WITHOUT creating a new tab first. Instead, type or paste the new URL directly into the omnibox.
  3. Now, if you look at uMatrix, it is possible that google.com (in this example) is listed anyway, although the new site has nothing to do with this domain. This can happen at least while the site from step 1 is still active and its traffic has not finished.

A refresh of the new site has no effect either (note: in uBlock a refresh HAS an effect, and google.com is gone afterwards).

Of course, if I load the new URL in a new (blank) tab, this is never a problem.

Here are two examples:

This is from a loaded Google Plus site and NOT ok ... internetbox_notok

This is from a blank tab and OK ... internetbox_ok

The new URL is a local site, but this is only an example (I also saw the behaviour with two external sites).

Possibly this is not directly a uMatrix problem. Nevertheless, it would be good if uMatrix (and uBlock) could handle this ...

Is this "fixable", or is it by design?

By the way: I have this problem in Google Chrome with the combination uMatrix/uBlock (both gorhill), in uBlock AND uMatrix (host lists removed in uBlock and no dynamic filtering active in uBlock), AND also in Firefox with the combination NoScript/uBlock (chrisaljoudi), in uBlock.

In any case, many thanks in advance for any answers!

Kind regards Alpengreis

PS: Thank you VERY much for SUCH a great tool and work!

PPS: Sorry for my English, it's not "my" language ...

gorhill commented 9 years ago

Can you see in the logger the sequence of events? The logger will report in the exact order the network events were received. I would need an exact scenario to reproduce -- actual URLs with which you can reproduce the issue all the time.

I suspect this could be caused by the browser setting "Use a prediction service to help complete searches and URLs...".

Alpengreis commented 9 years ago

Here is an example that always works, at least if I do not wait a (very) long time after loading the G+ site:

A) Load URL https://plus.google.com/107545467275966756564/posts?hl=de

B) Then URL http://www.swissvpn.net/

The screenshot from uMatrix ... umatrix_ex1

NOTE: Even a hard reload has no effect ...

The screenshot from uBlock ... ublock_ex1a

Here a normal (not hard) reload has an effect ... ublock_ex1b

And here the log (truncated) ...

03:08:31 cookie http://www.swissvpn.net/{persistent-cookie:lang_cookie}
03:08:31 cookie http://google.com/{persistent-cookie:HSID}
... more such or related links snipped ...
03:08:31 cookie https://plus.google.com/{persistent-cookie:OTZ}
03:08:30 other https://talkgadget.google.com/u/0/_/diagnostics/?diagno
... more such or related links snipped ...
03:08:30 other https://csi.gstatic.com/csi?v=3&s=hangouts&action=&it=w
03:08:30 script http://www.swissvpn.net/{inline_script}
03:08:30 image http://www.swissvpn.net/images/gr_bg.gif
... more such or related links snipped ...
03:08:30 xhr https://play.google.com/log?format=json&u=0
03:08:30 other https://plus.google.com/_/diagnostics/?diagnostics=%5B%
03:08:30 other https://plus.google.com/_/diagnostics/?diagnostics=%5B%
03:08:30 other https://plus.google.com/_/diagnostics/?diagnostics=%5B%
03:08:30 css http://www.swissvpn.net/svpntxt.css
03:08:30 other https://plus.google.com/_/diagnostics/?diagnostics=%5B%
... more such or related links snipped ...
03:08:30 other https://csi.gstatic.com/csi?v=3&s=oz&action=profload_st
03:08:30 xhr https://plus.google.com/_/stream/markitemread/?hl=de&oz
03:08:30 xhr https://play.google.com/log?format=json
03:08:30 doc http://www.swissvpn.net/ http://www.swissvpn.net/
03:08:29 xhr https://8.client-channel.google.com/client-channel/chan
03:08:29 xhr https://play.google.com/log?format=json&u=0
03:08:29 xhr https://play.google.com/log?format=json
03:08:28 other https://ssl.gstatic.com/chat/sounds/incoming_video_shor
03:08:28 other https://ssl.gstatic.com/chat/sounds/incoming_video_long
03:08:28 other https://ssl.gstatic.com/chat/sounds/incoming_message_eb
03:08:27 xhr https://talkgadget.google.com/_/scs/talk-static/_/js/k=
03:08:26-- image https://ad.doubleclick.net/activity;src=2542116;type=so
03:08:26 cookie https://plus.google.com/{localStorage}
03:08:26 frame https://plus.google.com/_/blank
03:08:25 script https://ssl.gstatic.com/accounts/o/3655170095-postmessa
03:08:25 script https://oauth.googleusercontent.com/gadgets/js/core:rpc
03:08:25 script https://plus.google.com/u/0/_/notifications/frame?sourc
03:08:25 cookie https://plus.google.com/{localStorage}
03:08:25 css https://fonts.gstatic.com/s/roboto/v15/oMMgfZMQthOryQo9
03:08:25 image https://ssl.gstatic.com/s2/oz/images/notifications/spin
03:08:25 frame https://accounts.google.com/o/oauth2/postmessageRelay?p
03:08:25 script https://apis.google.com/_/scs/apps-static/_/js/k=oz.gap
03:08:25 css https://plus.google.com/_/scs/apps-static/_/ss/k=oz.sbw
03:08:25 xhr https://8.client-channel.google.com/client-channel/chan
03:08:25 script https://apis.google.com/_/scs/apps-static/_/js/k=oz.gap
03:08:25 frame https://plus.google.com/u/0/_/notifications/frame?sourci
03:08:24 script https://plus.google.com/hangouts/_/hscv?pvt=AMP3uWbBIKT
03:08:24 cookie https://plus.google.com/{localStorage}
03:08:24 xhr https://plus.google.com/_/scs/talk-static/_/js/k=wcs.ha
03:08:24 script https://apis.google.com/js/client.js
03:08:24 xhr https://8.client-channel.google.com/client-channel/chan
03:08:24 xhr https://8.client-channel.google.com/client-channel/chan
03:08:24 xhr https://8.client-channel.google.com/client-channel/chan
03:08:24 frame https://plus.google.com/hangouts/_/hscv?pvt=AMP3uWbBIKT
03:08:24 image https://lh3.googleusercontent.com/-Sbi02TE9dLg/U97nH3FD
03:08:24 image https://lh3.googleusercontent.com/proxy/RcgZvDoRzsSPg98
03:08:24 xhr https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
03:08:24 xhr https://8.client-channel.google.com/client-channel/chan
03:08:24 script https://talkgadget.google.com/u/0/talkgadget/_/frame?v=
03:08:24 image https://ssl.gstatic.com/ui/v1/activityindicator/offline
03:08:24 image https://ssl.gstatic.com/chat/babble/sprites/common-301c
03:08:23 xhr https://8.client-channel.google.com/client-channel/chan
03:08:23 image https://ssl.gstatic.com/ui/v1/activityindicator/loading
03:08:23 xhr https://plus.google.com/_/stream/getactivities/?hl=de&o
... more such or related links snipped ...
03:08:23 xhr https://8.client-channel.google.com/client-channel/gsid
03:08:23 script https://talkgadget.google.com/u/0/talkgadget/_/frame?v=
03:08:23 xhr https://8.client-channel.google.com/client-channel/gsid
03:08:23 frame https://talkgadget.google.com/u/0/talkgadget/_/frame?v=
03:08:23 script https://8.client-channel.google.com/client-channel/clie%
03:08:23 frame https://talkgadget.google.com/u/0/talkgadget/_/frame?v=
03:08:23 script https://apis.google.com/js/api.js
03:08:23 script https://8.client-channel.google.com/client-channel/js/1
03:08:23 script https://clients4.google.com/invalidation/lcs/client?xpc
03:08:23 frame https://8.client-channel.google.com/client-channel/clie%
03:08:22 xhr https://plus.google.com/_/profiles/getprofilepagephotos
03:08:22 script https://talkgadget.google.com/u/0/talkgadget/_/chat?cli0
03:08:22 xhr https://talkgadget.google.com/_/scs/talk-static/_/ss/k=
03:08:22 cookie http://talkgadget.google.com/{session-cookie:llbcs}
03:08:22 script https://talkgadget.google.com/_/scs/talk-static/_/js/k=
03:08:22 frame https://talkgadget.google.com/u/0/talkgadget/_/chat?cli0
03:08:22 script https://talkgadget.google.com/_/scs/talk-static/_/js/k=
03:08:22 xhr https://plus.google.com/_/socialgraph/lookup/people/?if
03:08:22 image https://ssl.gstatic.com/s2/oz/images/circles/cpw-7de38e
03:08:21 cookie https://clients5.google.com/{localStorage}
03:08:21 xhr https://plus.google.com/_/profiles/getfollowercount/101
03:08:21 xhr https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
03:08:21 xhr https://plus.google.com/_/people/notify?soc-app=1&cid=0
03:08:21 script https://talkgadget.google.com/u/0/talkgadget/_/host-js?
03:08:21 frame https://talkgadget.google.com/u/0/talkgadget/_/blank
03:08:21 xhr https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
03:08:21 image https://lh3.googleusercontent.com/-HDk4PX0tPv8/AAAAAAAA
03:08:21 xhr https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
... more such or related links snipped ...
03:08:21 xhr https://play.google.com/log?format=json
03:08:21 xhr https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
... more such or related links snipped ...
03:08:20 image https://ssl.gstatic.com/s2/oz/images/sprites/profiles_s
03:08:20 xhr https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
... more such or related links snipped ...
03:08:19 script https://apis.google.com/_/scs/apps-static/_/js/k=oz.gap
03:08:19 script https://plus.google.com/107545467275966756564/posts?hl=
03:08:19 cookie https://plus.google.com/{localStorage}
03:08:19 image https://ssl.gstatic.com/s2/oz/images/sprites/collection
... more such or related links snipped ...
03:08:19 css https://fonts.gstatic.com/s/roboto/v15/El-bgsteBznJNL5p
03:08:19 image https://ssl.gstatic.com/s2/oz/images/sprites/stream_sho
... more such or related links snipped ...
03:08:19 xhr https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
03:08:19 script https://www.gstatic.com/og/_/js/k=og.og.en_US.fC5iiuc_6
03:08:19 css https://fonts.gstatic.com/s/roboto/v15/tZdhd9Zzj0I2MwoD
03:08:19 css https://fonts.gstatic.com/s/roboto/v15/N5Lbe1fynPA1KT8B
03:08:19 image https://lh5.googleusercontent.com/-HDk4PX0tPv8/AAAAAAAA
03:08:19 image https://ssl.gstatic.com/gb/images/v1_376447c3.png
03:08:18 image https://images-pos-opensocial.googleusercontent.com/gad
... more such or related links snipped ...
03:08:18 image https://maps-api-ssl.google.com/maps/api/staticmap?size
... more such or related links snipped ...
03:08:18 image https://s2.googleusercontent.com/s2/favicons?alt=p&doma
03:08:18 image https://lh3.googleusercontent.com/proxy/6UPOhVeV7V5x3zA
... more such or related links snipped ...
03:08:18 image https://ssl.gstatic.com/s2/oz/images/logo/2x/googleplus
03:08:18 script https://plus.google.com/_/scs/apps-static/_/js/k=oz.hom
03:08:18 css https://plus.google.com/_/scs/apps-static/_/ss/k=oz.hom
03:08:18 cookie http://google.com/{persistent-cookie:HSID}
... more such or related links snipped ...
03:08:18 cookie https://plus.google.com/{persistent-cookie:OTZ}
03:08:17 doc https://plus.google.com/107545467275966756564/posts?hl= https://plus.google.com/107545467275966756564/posts?hl=de
03:08:09 cookie http://google.ch/verify{persistent-cookie:SNID}
... more such or related links snipped ...
03:08:08 other https://www.google.ch/_/chrome/newtab/manifest?espv=2&i
... more such or related links snipped ...
03:08:08 image https://www.google.ch/images/srpr/logo9w.png
03:08:08 script https://www.google.ch/xjs/_/js/k=xjs.ntp.en_US.BFZSFxB-
... more such or related links snipped ...
03:08:08 cookie http://google.ch/{persistent-cookie:PREF}
03:08:08 cookie http://google.ch/{persistent-cookie:PREF}
03:08:08 other https://plus.google.com/_/diagnostics/?diagnostics=%5B%
... more such or related links snipped ...
03:08:08 doc https://www.google.ch/_/chrome/newtab?espv=2&ie=UTF-8 https://www.google.ch/_/chrome/newtab?espv=2&ie=UTF-8
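
For anyone who wants to check a log like the one above mechanically rather than by eye, here is a rough sketch in TypeScript. It assumes each row has the shape "time type URL", that the log is ordered newest first, and that a "--" suffix on the time marks a blocked request; these are read off the excerpt, not taken from uMatrix documentation. The helper names (parseLogLine, hostnameOf, hostsSeenAfterDoc) are made up for this illustration. The result only lists hostnames logged after the page's "doc" row that are not the page's own hostname. It cannot tell leftovers from a previous page apart from genuine third parties.

```typescript
// One logger row: time, blocked flag, request type, URL (possibly truncated).
interface LogEntry {
  time: string;
  blocked: boolean;
  type: string;
  url: string;
}

// Parse a row such as "03:08:30 xhr https://play.google.com/log?format=json".
// Assumption: a "--" right after the time marks a blocked request.
function parseLogLine(line: string): LogEntry | null {
  const m = /^(\d{2}:\d{2}:\d{2})(--)?\s+(\S+)\s+(\S+)/.exec(line.trim());
  if (m === null) {
    return null; // e.g. the "... snipped ..." marker rows
  }
  return { time: m[1], blocked: m[2] === "--", type: m[3], url: m[4] };
}

// Extract the hostname without relying on a URL parser, because logged URLs
// may be truncated or carry pseudo paths such as "{localStorage}".
function hostnameOf(url: string): string {
  const m = /^[a-z][a-z0-9+.-]*:\/\/([^\/?#:{]+)/i.exec(url);
  return m === null ? "" : m[1].toLowerCase();
}

// Walk a newest-first log down to the "doc" row of the page and collect every
// other hostname that was logged after that navigation.
function hostsSeenAfterDoc(logNewestFirst: string[], pageHost: string): string[] {
  const hosts: string[] = [];
  for (const line of logNewestFirst) {
    const entry = parseLogLine(line);
    if (entry === null) {
      continue;
    }
    const host = hostnameOf(entry.url);
    if (entry.type === "doc" && host === pageHost) {
      break; // reached the navigation itself; older rows belong to earlier pages
    }
    if (host !== "" && host !== pageHost && hosts.indexOf(host) === -1) {
      hosts.push(host);
    }
  }
  return hosts;
}

// A few rows copied from the excerpt above, newest first.
const sample: string[] = [
  "03:08:31 cookie http://www.swissvpn.net/{persistent-cookie:lang_cookie}",
  "03:08:30 other https://csi.gstatic.com/csi?v=3&s=hangouts&action=&it=w",
  "03:08:30 script http://www.swissvpn.net/{inline_script}",
  "03:08:30 xhr https://play.google.com/log?format=json&u=0",
  "03:08:30 doc http://www.swissvpn.net/ http://www.swissvpn.net/",
  "03:08:29 xhr https://play.google.com/log?format=json",
];
const foreignHosts = hostsSeenAfterDoc(sample, "www.swissvpn.net");
```

With the six sample rows, foreignHosts holds the two Google-owned hostnames that were logged after the swissvpn.net document started loading.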

NOTE: BOTH were loaded through bookmarks, NOT via the omnibox ...

About your suspicion, "Use a prediction service to help complete searches and URLs...": after a test it SEEMS not to be the reason, but I will run further tests (emptying the cache first, restarting the browser, and some other things). I'll report here IF it turns out to be responsible, of course!

Thank you for your help!

Alpengreis

Alpengreis commented 9 years ago

I now have the same combination of uBlock & uMatrix in Fx (plus NoScript for some things), and I was able to reproduce this behaviour there too ...

Alpengreis commented 9 years ago

Gorhill, could you please check this behaviour? It's such an annoying thing and can even be dangerous! It happens in Chrome and Firefox. And it is even the case on a completely new (clean) installation on Win 10 ...

Thank you!

adrienbeau commented 9 years ago

I have seen this too, especially using the Google search engine (but maybe this is because I use it a lot).

Here's how I can reproduce it fairly well:

Here is a screenshot of the page matrix: umatrix-adrien-beau-free-fr

And here's a big screenshot of the uMatrix log resulting from this test (most relevant lines at the top): screen shot 2015-08-04 at 13 30 36

gorhill commented 9 years ago

@Alpengreis

Your original bug seems to be a side effect of how the browser API works: network requests are associated with a tab, not with a web page. Because of this, network requests from a previous page can be seen by a new page, and there is no way for an extension to determine from which web page a specific network request originates; it can only tell from which tab. The fact that refreshing the page does not change the matrix content is by design in uMatrix: uMatrix will cache and reuse the data until the page has not been visited for a few minutes. The reason for this is that if a web page makes infrequent requests to some specific 3rd-party hostnames, you still want to keep that information around for a while so that the user is properly informed about this.
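
To make the two points above concrete, here is a small model in TypeScript. It is a sketch of the idea only, not uMatrix's actual code: the TabTracker class, its method names and the 3-minute CACHE_TTL_MS value are all hypothetical (the comment only says "a few minutes"). Requests are filed under whatever page the tab currently shows, and a page's collected hostnames are reused on reload until the cached data is considered stale.

```typescript
// Data collected for one page URL.
interface PageStore {
  pageUrl: string;
  hostnames: string[];
  lastSeen: number; // timestamp in milliseconds
}

// Hypothetical value standing in for "a few minutes".
const CACHE_TTL_MS = 3 * 60 * 1000;

class TabTracker {
  // tab id -> URL of the page currently shown in that tab
  private currentPage: Record<number, string> = {};
  // page URL -> cached data for that page
  private cache: Record<string, PageStore> = {};

  // The tab starts loading a top-level document (new page or reload).
  navigate(tabId: number, pageUrl: string, now: number): void {
    const cached = this.cache[pageUrl];
    if (cached === undefined || now - cached.lastSeen > CACHE_TTL_MS) {
      this.cache[pageUrl] = { pageUrl, hostnames: [], lastSeen: now };
    } else {
      cached.lastSeen = now; // recent enough: keep and reuse the old data
    }
    this.currentPage[tabId] = pageUrl;
  }

  // A network request is observed. Only the tab id is known, so the request
  // is attributed to whatever page the tab shows right now.
  recordRequest(tabId: number, hostname: string, now: number): void {
    const pageUrl = this.currentPage[tabId];
    if (pageUrl === undefined) {
      return;
    }
    const store = this.cache[pageUrl];
    if (store === undefined) {
      return;
    }
    store.lastSeen = now;
    if (store.hostnames.indexOf(hostname) === -1) {
      store.hostnames.push(hostname);
    }
  }

  hostnamesFor(pageUrl: string): string[] {
    const store = this.cache[pageUrl];
    return store === undefined ? [] : store.hostnames.slice();
  }
}

// Scenario from this issue, with made-up timestamps.
const tracker = new TabTracker();
tracker.navigate(1, "https://plus.google.com/", 0);
tracker.recordRequest(1, "plus.google.com", 100);
tracker.navigate(1, "http://www.swissvpn.net/", 1000);
tracker.recordRequest(1, "plus.google.com", 1100); // late request from the old page
tracker.recordRequest(1, "www.swissvpn.net", 1200);
tracker.navigate(1, "http://www.swissvpn.net/", 5000); // reload shortly afterwards
const afterReload = tracker.hostnamesFor("http://www.swissvpn.net/");
```

In this model afterReload still contains plus.google.com: the late request was filed under the new page because only the tab id was available, and the reload came too soon for the cached data to be discarded.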

@adrienbeau

Keep in mind this: URL redirections.

The way redirections are detected is different in uMatrix; I may look into whether I can improve this to get the same results as HTTP Switchboard. In your case, clicking on a link in Google search results always results in a redirection (because Google wants to know which link you clicked).

adrienbeau commented 9 years ago

Thanks for the link, I didn't know about it.

I know about the Google search redirection, and it can actually be seen in the screenshot. The second gray bar from the top is when I middle-clicked to open in a new tab (at 13:25:26, more than one minute after displaying the search results). We can see Google set some cookies at that point, and then redirected to my site. uMatrix is apparently able to decide it is a new site, since it displays a grey bar for it. Maybe some Google requests were still lingering at that point, I'm not up-to-date with what concurrent events can happen with Javascript these days.

I understand it is not easy to decide when the requests are "coming from a new site", or "still issued by the current site"; maybe a good FAQ is the best solution to this issue.

Alpengreis commented 9 years ago

I understand the tech explanation, thank you, Gorhill!

The problem is: if I load a new page in the same tab and save a new rule for this page, I may have included an unwanted domain. AFAIK, I never had this behaviour in NoScript (it's not quite the same, I know, nevertheless ...).

For example, I load a Google Plus page and afterwards my local router page, and I see the google domain there. But this site has no link to Google. It's as you said: it is only there because G+ was in the same tab before. If I do not check this explicitly (by loading in a new tab) and I save this (allow google), I have a "false" record.

In daily work, this means for me: I have to close the current tab first EVERY TIME, or open a new tab, before I load a new page, to ensure that only "valid" domains are listed on the new page.

As a workaround I had used Tab Mix Plus, which normally opens a new tab automatically. But I would like to reduce the number of plugins now. BUT: the situation is not very user friendly at this point.

I hope you understand me (well enough), my English is not very good :-)

In any case, I hope you can somehow change this behaviour; otherwise I must live with it ...

Info: I do NOT mean links that are REALLY included, such as google or whatever (they are on many, many pages, I know that).

Also interesting: why can NoScript handle this proper? Is this also by design?

Kind regards!

gorhill commented 9 years ago

why can NoScript handle this proper?

I can't answer as I know nothing about NoScript code. One thing is for sure, though: NoScript does not report xmlhttprequest requests, which are the network requests most likely to be affected by the current issue.

Alpengreis commented 9 years ago

Okay, then I will leave my workaround active; it's not such a big thing.

Thank you, Gorhill!

PS: For other users with Firefox: I use the add-on "Tab Mix Plus" and configured it to open relevant things in NEW TABS, so it's a practicable workaround for this behaviour ...

Alpengreis commented 8 years ago

This seems to be fixed now in uBlock but not yet in uMatrix! Could you fix it in uMatrix too, please? This would be so important, to avoid an extra add-on!

gorhill commented 8 years ago

What version of uMatrix? Give me steps to reproduce please, that will save me time.

OK, reproduced with the latest dev build.

gorhill commented 8 years ago

From what I can see, there is a beforeunload event listener on the Google+ page, which executes after the new document started loading. As said, uMatrix is being told from which tab a network request occurs, not from which document URL.

To give some perspective, even the Network pane in the dev console will report network requests to google.com when loading the second site -- so the issue is not specific to uMatrix.

Alpengreis commented 8 years ago

Thanks for the answer, gorhill! I also use the latest dev build (with the latest Fx release, 42.0).

I asked again because it's NO LONGER the case with uBlock (before, it was).

Would it be possible to have the same behaviour in uMatrix as in uBlock? The problem is that it's really difficult to handle these (unnecessary? or at least undesired) domains in the matrix, even if they are not in uBlock ...

Or in other words: how can I decide whether such domains are from the web page itself or not, without other tools? Or: if such domains are there, I do not want to associate them with the page itself if they are not from the page source.

The only way to avoid this behaviour with uMatrix is: I ALWAYS have to look in uBlock, OR I have to open EACH link in a new tab. In Fx, this is relatively easy with Tab Mix Plus (TMP), but for Chrome I don't know an extension for this (or they do not work (correctly)), so it's necessary to ALWAYS do it "manually".

So uBlock handles it "okay", NoScript handles it okay, uMatrix does not.

Or is there any reason to leave it like this in uMatrix?

Many greetings!!

gorhill commented 8 years ago

I don't know why it does not happen with uBO. It should, since the logger reports google.com after the second document has started loading. I need to investigate why. NoScript does not report google.com because what is pulled are images, not scripts.

Alpengreis commented 8 years ago

Okay, thanks. It WAS also the case with uBO when I made my first post here ...

In any case, have a nice week, gorhill!

Alpengreis commented 8 years ago

@gorhill Yes, you are right. Indeed, uBlock should display the "unwanted domain" too! This is the result after first loading Google Maps (https://www.google.ch/maps/) and then http://www.swissvpn.net/ in the SAME tab.

ubo_1

ubo_2

It is really annoying not to have at least the same result in uMatrix and uBO.

Alpengreis commented 8 years ago

I have news about this ...

It seems it's NOT the beforeunload listener!

I have disabled this in Firefox (latest release) in the config (dom.disable_beforeunload = true).

XHR seems to be involved. On the Google Maps page I had allowed EVERYTHING except XHR. After loading swissvpn.net: NO google entry. After also switching XHR to allow: BOOM, google is present after loading swissvpn.net in the same tab.
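
One way to look at the request types involved is to tally, per type, the rows that were logged after a navigation and do not belong to the new page's own hostname. The TypeScript sketch below does that. The Row shape and the tallyForeignTypes name are invented for this illustration, and the sample rows are a hand-copied subset of the Chrome log posted earlier in this thread (rows logged after the swissvpn.net document), not data from the Firefox test described here.

```typescript
// A logger row reduced to the two fields needed for the tally.
interface Row {
  type: string;
  hostname: string;
}

// Count, per request type, the rows whose hostname is not the page's own.
function tallyForeignTypes(rows: Row[], pageHostname: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const row of rows) {
    if (row.hostname === pageHostname) {
      continue; // the page's own requests are not of interest here
    }
    counts[row.type] = (counts[row.type] || 0) + 1;
  }
  return counts;
}

// Subset of rows logged after "doc http://www.swissvpn.net/" in the earlier log.
const rowsAfterNavigation: Row[] = [
  { type: "xhr", hostname: "play.google.com" },
  { type: "xhr", hostname: "plus.google.com" },
  { type: "other", hostname: "csi.gstatic.com" },
  { type: "css", hostname: "www.swissvpn.net" },
  { type: "other", hostname: "plus.google.com" },
  { type: "xhr", hostname: "play.google.com" },
  { type: "image", hostname: "www.swissvpn.net" },
];
const foreignTypeCounts = tallyForeignTypes(rowsAfterNavigation, "www.swissvpn.net");
```

For this subset the tally contains only "xhr" and "other" rows for the Google hostnames, which fits the observation that XHR is involved, although a tally like this cannot prove which page issued a request.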

PS: Could this have something to do with the AV scanner and/or BehindTheScene?

PPS: Another idea is onreadystatechange ( https://developer.mozilla.org/en-US/docs/Web/Events/readystatechange ), which I found in the Google page source code.

Alpengreis commented 8 years ago

Interesting. I had to install Chrome again. I installed v47.0.2526.80 m (64-bit). There, this problem does NOT exist (tried with the same links as above). Not even in the log, as you can see here ...

uMatrix: umatrix-log

uBO: ublock-log

Alpengreis commented 8 years ago

@gorhill

The problem also exists in Chrome ("again")! So it's NOT a browser bug (at least not one in Fx only) ...

I could reproduce it with the following steps ...

1) Load URL http://www.20min.ch/sport/fussball/story/Sion-belohnt-seinen-Sturmlauf-15059427
2) Make a refresh without cache (Ctrl+F5)
3) Load URL https://www.mywot.com/en/scorecard/mywot.net in the same tab
4) Use the GO BACK function to return to the site from 1)
5) Load URL https://www.mywot.com/en/scorecard/mywot.net in the same tab
6) Repeat steps 4 and 5 (once SHOULD be enough)

After this, 20min appears in the uMatrix matrix for WOT ...

Here too: NOT in uBlock, only in uMatrix!

gorhill commented 8 years ago

I explained why this happened:

Alpengreis commented 8 years ago

Okay, NOW I have understood! Sorry that it took me so long to grasp this, and for the trouble! Thank you!

Alpengreis commented 8 years ago

Could you please also answer the following:

Today I found out with the Fx example (google maps -> swissvpn.net) that it's NOT a problem if I TYPE the address in the omnibar.

This means: if I open the new link in the same tab with a mouse click or with Enter from the bookmarks, it's a problem; if I type the new link and press Enter (which also opens it in the same tab), it's not a problem.

Can you explain this? I was wrong: this behaviour also exists with the omnibar!