Closed newhouse closed 6 years ago
Couldn't directly reproduce this issue in 4.1.0
. However the issue sounds familiar and the following comment from the uBlock Origin extension may give some insights as to what the cause might be.
To keep track from which context *exactly* network requests are made. This is
often tricky for various reasons, and the challenge is not specific to one
browser.
The time at which a URL is assigned to a tab and the time when a network
request for a root document is made must be assumed to be unrelated: it's all
asynchronous. There is no guaranteed order in which the two events are fired.
Also, other "anomalies" can occur:
- a network request for a root document is fired without the corresponding
tab being really assigned a new URL
<https://github.com/chrisaljoudi/uBlock/issues/516>
- a network request for a secondary resource is labeled with a tab id for
which no root document was pulled for that tab.
<https://github.com/chrisaljoudi/uBlock/issues/1001>
- a network request for a secondary resource is made without the root
document to which it belongs being formally bound yet to the proper tab id,
causing a bad scope to be used for filtering purpose.
<https://github.com/chrisaljoudi/uBlock/issues/1205>
<https://github.com/chrisaljoudi/uBlock/issues/1140>
So the solution here is to keep a lightweight data structure which only
purpose is to keep track as accurately as possible of which root document
belongs to which tab. That's the only purpose, and because of this, there are
no restrictions for when the URL of a root document can be associated to a tab.
Before, the PageStore object was trying to deal with this, but it had to
enforce some restrictions so as to not descend into one of the above issues, or
other issues. The PageStore object can only be associated with a tab for which
a definitive navigation event occurred, because it collects information about
what occurred in the tab (for example, the number of requests blocked for a
page).
The TabContext objects do not suffer this restriction, and as a result they
offer the most reliable picture of which root document URL is really associated
to which tab. Moreover, the TabObject can undo an association from a root
document, and automatically re-associate with the next most recent. This takes
care of <https://github.com/chrisaljoudi/uBlock/issues/516>.
The PageStore object no longer cache the various information about which
root document it is currently bound. When it needs to find out, it will always
defer to the TabContext object, which will provide the real answer. This takes
case of <https://github.com/chrisaljoudi/uBlock/issues/1205>. In effect, the
master switch and dynamic filtering rules can be evaluated now properly even
in the absence of a PageStore object, this was not the case before.
Also, the TabContext object will try its best to find a good candidate root
document URL for when none exists. This takes care of
<https://github.com/chrisaljoudi/uBlock/issues/1001>.
The TabContext manager is self-contained, and it takes care to properly
housekeep itself.
I just reviewed the web request notification api spec and it explains the issue at hand. when onBeforeRequest
is called the tabId may be null. This means that the cacheEntry added in background.js#L155 might reference a tabId of null.
The tabContextManager from uBlock origin is a stand alone object so it should be easy to integrate. A more vanilla solution may be achieved by calling storeChanges in onBeforeNavigate
since you will always have a tabId there. Doc says it is always called before onBeforeRequests.
Hi @menzow thanks for the digging and intel. I notice that you mentioned you were unable to reproduce this in 4.1.0, and I think I know why:
I was experiencing the described behavior while playing around on Facebook. What I ultimately learned was that, when considering a link/ad/post that had an l.php
URL:
I was not expecting this behavior, hence I filed the issue, but it's what I observed upon looking more closely. It seems like FB is tracking clicks using ajax instead of via redirects. As I noted elsewhere, I'm starting to see some behavior changes by the big players in terms of all this redirect link junk they've got all over the place. Maybe it's from the pressure this extension is putting on their implementations? Ha, yeah right!
Anyways, thanks for your intel here and it may be useful in the future, but for now I think I will close this issue as it's not what I originally thought was happening.
This example from Facebook Glen Hansard ad:
https://l.facebook.com/l.php?u=https%3A%2F%2Fwww.ticketmaster.com%2Fevent%2F1C00535FC3873D42%3Fcamefrom%3DCFC_ANOTHERPLANET_web%26brand%3Danotherplanet&h=ATNuMbT9BzembgKERn5XuCZ0Ka_BEBsqvWfcAuDc05RwXj4iEqmq-2QdsawNPE90uNYKlXCTTZOXTlkfIokMyp55SIDJKB_qMxRmllqRqJEFZr8GY4ioBUQYCytrNKce3efZcfa5Bc_gLYcdbfHmj2o8Ylb_MKJ9FlUerGwGezavnCl5DCChq9ZFJEWDk59UDi1lCjbXbDjTG9kwPTHkdltkkQgyr2-e685Yl7NPz-qQie0HORBgAeQw3GXIaWpOTewsbfQCF8g5iNMl01WMop03Ir1cPUFdwigdriLZm4moGF-IHRueQnjmCtPJbynqw69KMIhplXqJbDbDdM0sDnZc9sPk3-lg0SSqClCGnDqV4W394nr4rS1TeB5dCyJruVhpWnCbF6AX_mvAffTIIQf7g-AYIPyeAVh7gFdBDmo8bCxWjMqOWr7fRt4ZgRbWeA0IpgA9iSM_Qi0_vbVTWuaUmrDr8Ld91WHk2xDYWGHKY-6CmsBPNmEvZkNH933UAB2cjxxghlWABjApCkUPGrnH2OYmGQTjpMKm_6VSjhlUOc4ElgeStxDBbsGEghQt3e_tnsY1aGsN-182ImNFrzwq9PcxOTCgNILLWDVk26lvCUAPTOCWTvWD8ajPm0LjkE18XjQ7NjBZsDD9zW1XwS7pPDFhAys0vwr1nI_i-HDRmQQqzDMeQ-CZjnJLYs4mDMpU0pwQ1Ppem9ADL6-A9NfeOM5MKHkcZezTbCwGAXiuP1dFKiYbGjwB90nXlEH9Bhv1QFlhvw_CPeEyQQGJzxh139-AFri2sLh_l6AVQ11k43081MVXd4GOQZcDwZJ4d9ZrodD74g4k-j5wqBU7c_GiZAzpU_zEfBj7jc6ZlunrYAhRF7gGpwirE49sZxRrklys-Diuq38caN6ya4U3XFhifBx0F9bnvRSGKSAngfAkhv1XZgM7
If I click on it, no indication given. If I paste into URL bar, indication IS given. Probably due to behavior of knowing tab id/url stuff on when to indicate an impact.