whizzzkid opened 1 year ago
This is a great writeup, thanks a ton @whizzzkid! I have a few comments and questions.
Use declarativeNetRequest API to update the URLs to be blocked up to the permissible limit.
It looks like that number is 50 for static rulesets, which shouldn't count against calls to updateDynamicRules: https://developer.chrome.com/docs/extensions/reference/declarativeNetRequest/#property-MAX_NUMBER_OF_STATIC_RULESETS
And 5000 for dynamic + session rules: https://developer.chrome.com/docs/extensions/reference/declarativeNetRequest/#property-MAX_NUMBER_OF_DYNAMIC_AND_SESSION_RULES
An LFU cache can be biased during a heavy browsing session, which could make some URLs sticky. A good approach could be a hard-coded TTL to evict entries and manage the bias.
Was the caching strategy discussed anywhere? I'd like to ensure we stick to a well-known caching algorithm so we fully understand the pros/cons. LFU has very clear cons that could easily be triggered if a user fills up the queue with requests. Have we looked into LRFU? LRFU may be better defined than "LFU + TTL", though no such implementation exists in TypeScript/JavaScript that I could find quickly.
How can the work at #1127 help us ensure we use the best caching policy?
LRU cache is not an option as that would evict staple URLs if not used regularly.
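The "LFU + hard-coded TTL" idea above could look roughly like the following sketch. This is not an existing Companion implementation; the class name, the TTL value, and the injectable clock are all illustrative choices, but they show how a TTL cap keeps heavy sessions from making entries permanently sticky:

```javascript
const TTL_MS = 10 * 60 * 1000 // assumed value: 10 minutes

class LfuTtlCache {
  constructor (capacity, now = Date.now) {
    this.capacity = capacity
    this.now = now // injectable clock, so eviction is testable
    this.entries = new Map() // key -> { value, hits, createdAt }
  }

  get (key) {
    const entry = this.entries.get(key)
    if (entry == null) return undefined
    if (this.now() - entry.createdAt > TTL_MS) { // expired: evict on read
      this.entries.delete(key)
      return undefined
    }
    entry.hits++
    return entry.value
  }

  set (key, value) {
    if (!this.entries.has(key) && this.entries.size >= this.capacity) {
      // evict an expired entry first, otherwise the least-frequently-used one
      let victim = null
      for (const [k, e] of this.entries) {
        if (this.now() - e.createdAt > TTL_MS) { victim = k; break }
        if (victim == null || e.hits < this.entries.get(victim).hits) victim = k
      }
      this.entries.delete(victim)
    }
    this.entries.set(key, { value, hits: 0, createdAt: this.now() })
  }
}
```

The TTL check on read means even a very "hot" URL gets re-validated periodically, which is exactly the bias control the TTL is meant to provide.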
We could use cache for the sessionRules and leave the staple URLs as "dynamicRules": https://developer.chrome.com/docs/extensions/reference/declarativeNetRequest/#dynamic-and-session-scoped-rules
Have we looked into updating/modifying any CID/ipfs/ipns links on a page to use some URL that we know we have a declarativeNetRequest rule for? That could be a way to reduce the scope and limits of cache and rules. Something like:
¹ I'm not sure how difficult this is but I imagine it already hasn't been done for some reason. MV3 may provide more reasons to look at this though.
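A hedged sketch of that link-rewriting idea: normalize `ipfs://`/`ipns://` links to a gateway URL that a declarativeNetRequest rule already covers. `PREFERRED_GATEWAY` and the helper name are assumptions, not existing Companion code:

```javascript
// Assumed: a gateway host that our declarativeNetRequest rules already match.
const PREFERRED_GATEWAY = 'https://dweb.link'

function normalizeIpfsHref (href) {
  const match = href.match(/^(ipfs|ipns):\/\/([^/?#]+)([^#]*)(#.*)?$/)
  if (match == null) return null // not an ipfs:// or ipns:// link
  const [, scheme, root, rest = '', hash = ''] = match
  return `${PREFERRED_GATEWAY}/${scheme}/${root}${rest}${hash}`
}

// In a content script, the rewrite pass would then be roughly:
// for (const el of document.querySelectorAll('a[href^="ipfs://"], a[href^="ipns://"]')) {
//   const normalized = normalizeIpfsHref(el.href)
//   if (normalized != null) el.setAttribute('href', normalized)
// }
```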
Thank you @whizzzkid, plan lgtm, awesome to see this moving forward :raised_hands:
Some thoughts:
We already produce different manifest.json for Firefox and Chromium, and swap it during final packaging. Everything else should be possible to detect at runtime (we have to, because we use the same Chromium build for Google Chrome and Brave).
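The runtime detection described here could be sketched like this. The function takes browser-like globals as parameters so the logic is testable; in the extension it would be called with the real `navigator` and `chrome`/`browser` objects. The returned field names are illustrative, though `navigator.brave.isBrave` and `chrome.declarativeNetRequest` are real APIs:

```javascript
function detectRuntime (navigatorObj, browserApi) {
  return {
    // navigator.brave.isBrave() is exposed only in Brave builds of Chromium
    isBrave: typeof navigatorObj?.brave?.isBrave === 'function',
    // MV3 Chromium exposes declarativeNetRequest; if it's absent we can
    // assume the blocking webRequest code path (e.g. Firefox) is available
    hasDeclarativeNetRequest: browserApi?.declarativeNetRequest != null
  }
}
```

Usage in the extension would be `detectRuntime(navigator, globalThis.chrome ?? globalThis.browser)`.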
@SgtPooki injecting JS on every page and modifying page content makes browsing slower, especially when you dynamically inspect every newly created DOM branch. That is why things like linkification were abandoned (and are only opt-in experiments). Try other things first; this is a last resort, and even if we do this, it won't work for XHR and fetch done via JS on pages.
In case it is useful: in the new bifrost-gateway we are using a basic 2Q cache:
2Q is an enhancement over the standard LRU cache in that it tracks both frequently and recently used entries separately. This avoids a burst in access to new entries from evicting frequently used entries.
Most of the work will be creating a new test suite that tests against both the old blocking webRequest implementation and the new declarativeNetRequest one, in a way that is easy to maintain going forward.
injecting JS on every page, and modifying page content makes browsing slower, especially when you dynamically inspect every newly created DOM branch. ... [won't] work for XHR and fetch done via JS on pages.
Any performance degradation could be reduced to ~nil fairly easily.
The callout about XHR and fetch requests is the gotcha I figured existed but couldn't think of at the time. Another set of targets this wouldn't easily catch is event listeners on various DOM elements that trigger navigation.
Either way, we don't necessarily need to modify the dom, just read the values, but even modifying the dom shouldn't be an observable hit.
```js
performance.mark('test-start'); // just for benchmarking
Array.from(document.querySelectorAll('a[href]')).forEach((el) => { el.setAttribute('href', el.href + '#foobar') }); // modify the dom
performance.mark('test-end'); // just for benchmarking
performance.measure('test-duration', 'test-start', 'test-end').duration;
// 1 ms on this page, with ~200 links, and no filtering out of "matched" links.
```
Note that this is the least performant version (also contrived) and it's still only 1ms. Making it non-blocking might increase it by ~20%, but would reduce any UX impact on users.
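The non-blocking variant could be sketched as chunked processing that yields between batches. The scheduler is injected (it would be `requestIdleCallback` in the page) so the chunking logic itself is testable; the function name and chunk size are illustrative:

```javascript
function processInChunks (items, processOne, schedule, chunkSize = 50) {
  let index = 0
  function step () {
    const end = Math.min(index + chunkSize, items.length)
    for (; index < end; index++) processOne(items[index])
    if (index < items.length) schedule(step) // yield before the next chunk
  }
  step()
}

// In a content script, roughly:
// processInChunks(
//   Array.from(document.querySelectorAll('a[href]')),
//   rewriteLink, // hypothetical per-link handler
//   (fn) => requestIdleCallback(fn))
```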
Not modifying the dom: what we could do is simply parse any a[href] links and add those to the dynamicRules to get a "jump" on any links a user might click. This doesn't solve the "crux" of the MV3 problem, but it should help mitigate the IPFS ecosystem UX impact.
I think a simple update on page load could help reduce cache misses by pre-loading appropriate URLs with:

```js
Array.from(document.querySelectorAll('a[href]')).forEach((el) => { updateRuleForPotentialIpfsLink(el.href) }); // calls updateDynamicRules if appropriate
```
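`updateRuleForPotentialIpfsLink` is hypothetical; one way to sketch its core is a pure function that extracts an `/ipfs/` or `/ipns/` path and builds a declarativeNetRequest redirect rule for it. The rule-id range and the local gateway address are assumed values, not Companion defaults:

```javascript
let nextRuleId = 1000 // assumed: an id range reserved for these pre-warmed rules

function ruleForPotentialIpfsLink (href) {
  const match = href.match(/\/(ipfs|ipns)\/([^/?#]+)/)
  if (match == null) return null // not an IPFS/IPNS link, nothing to do
  const [, ns, root] = match
  return {
    id: nextRuleId++,
    priority: 1,
    action: {
      type: 'redirect',
      // \1 carries over the rest of the path captured by regexFilter
      redirect: { regexSubstitution: `http://localhost:8080/${ns}/${root}\\1` }
    },
    condition: {
      regexFilter: `^https?://[^/]+/${ns}/${root}([^#]*)`,
      resourceTypes: ['main_frame']
    }
  }
}

// In the extension, the page-load hook would then be roughly:
// const rules = Array.from(document.querySelectorAll('a[href]'))
//   .map((el) => ruleForPotentialIpfsLink(el.href))
//   .filter((r) => r != null)
// chrome.declarativeNetRequest.updateDynamicRules({ addRules: rules })
```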
There does seem to be a 2q impl in javascript, but it's 4 yrs old and stale: https://www.npmjs.com/package/2q-cache
The only package I found that mentions LRFU is weak-lru-cache, which is ESM, TypeScript, and fairly active: https://www.npmjs.com/package/weak-lru-cache
NOTE: weak-lru-cache supports pinning to the cache:
The expirationPriority can also be set to -1, indicating that the value should be pinned in memory and never expire, until the entry has been changed (with a positive expirationPriority).
Answering some of my own questions w.r.t. https://github.com/ipfs/ipfs-companion/issues/1127: Once we migrate to MV3 I think we should add some additional metrics for:

- Rules added via updateDynamicRules that were never triggered/used again -- RESULT: MV3 caused a large degradation in the IPFS ecosystem experience.
- Rules added via updateDynamicRules that were triggered/used again -- RESULT: MV3 caused only a light degradation to this user's experience.

@SgtPooki
Was the caching strategy discussed anywhere?
LRFU with localStorage persistence is what I'm looking for. But as you have already discovered, there isn't any decent implementation for that. I think it would involve another quick spike, will add to https://github.com/ipfs/ipfs-companion/issues/1155
We could use cache for the sessionRules and leave the staple URLs as "dynamicRules"
yep!
@lidel
we have to, because we use the same Chromium build for Google Chrome and Brave
So Brave's stance is a bit confusing here: even though they assure that MV2 will be supported even if Chromium moves to MV3, how will they provide support through the Chrome Web Store, which will stop accepting MV2 extensions starting in June? That means even though they might "technically" support MV2, the store would only carry MV3 extensions. At that point they can either build their own web store (in which case we can have a separate build) or support Companion out of the box, in which case we can definitely have a separate build. I am not too keen on shipping the detection logic with the extension; we can keep it simple by just building a different asset which can have pluggable logic to handle the webRequests.
In case it is useful, in new bifrost-gateway we are using a basic 2Q cache
2Q sounds interesting, looks like someone has an implementation but no active contributions.
So Brave's stance is a bit confusing on this [..]
Yes, it is as confusing as you think, maybe even worse? :) Brave == Chromium build + runtime detection to enable Brave-specific things. Always been this way.
"Technically" you may have MV2 APIs available for a while, but you won't have a store that accepts extensions that need to explicitly request them via the manifest (Chrome Web Store will reject them). Brave does not plan to have its own store for extensions, so you are stuck with Chrome Web Store's manifest limitations. And don't be naive: if APIs removed from the upstream Chromium codebase are too expensive to keep supporting, Brave will make the tough call and remove them (as soon as the percentage of extensions that need them falls below an acceptable threshold, and who would use these APIs if Brave users can't install your extension from the top search result, which is the Chrome Web Store?).
Things change a lot in 5-10yr timeframes, and nothing is set in stone. Not that long ago we did not have Chromium-based Edge, and initial Brave was based on a fork of Electron. At one point we had companion backed by js-ipfs with TCP transport provided by chromium.sockets from ChromiumOS. Today is not crazier than yesterday, just different.
Browser landscape changes, and Companion needs to be nimble enough to stay afloat. We need to plan accordingly, and that is why runtime detection is the only future-proof play on our end.
Creating a build pipeline around a temporary landscape will be a pain to maintain. Please don't complicate it beyond the current state, where we have a single codebase with one manifest for Firefox and one for Chromium, and everything else is decided at runtime.
2Q sounds interesting, looks like someone has an implementation but no active contributions.
We could implement it by hand, by adding a second cache that follows frequency. For recency, you already have standard LRU.
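A hand-rolled sketch of that simplified 2Q idea: new keys land in a small FIFO "recent" queue, and only keys that are hit a second time get promoted into the main LRU, so a one-off burst of new URLs cannot evict the hot entries. Class name and sizes are illustrative, not an existing implementation:

```javascript
class TwoQueueCache {
  constructor (recentSize, mainSize) {
    this.recentSize = recentSize
    this.mainSize = mainSize
    this.recent = new Map() // FIFO of keys seen once
    this.main = new Map() // LRU of keys seen more than once
  }

  get (key) {
    if (this.main.has(key)) { // LRU touch: move to most-recent position
      const value = this.main.get(key)
      this.main.delete(key)
      this.main.set(key, value)
      return value
    }
    if (this.recent.has(key)) { // second access: promote into the main LRU
      const value = this.recent.get(key)
      this.recent.delete(key)
      if (this.main.size >= this.mainSize) {
        this.main.delete(this.main.keys().next().value) // evict LRU entry
      }
      this.main.set(key, value)
      return value
    }
    return undefined
  }

  set (key, value) {
    if (this.main.has(key)) { this.main.set(key, value); return }
    if (this.recent.size >= this.recentSize && !this.recent.has(key)) {
      this.recent.delete(this.recent.keys().next().value) // FIFO eviction
    }
    this.recent.set(key, value)
  }
}
```

This relies on `Map` preserving insertion order, which is what makes both the FIFO queue and the LRU re-insertion trick work.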
discuss.ipfs.tech post: https://discuss.ipfs.tech/t/announcing-ipfs-companion-mv3-rc-beta/16442
@whizzzkid : is there a new release in the beta webstore we can share here and in the discuss post?
@BigLep the latest beta is live: https://chrome.google.com/webstore/detail/ipfs-companion-rc-mv3-bet/hjoieblefckbooibpepigmacodalfndh
Adding it to the main post and discuss post.
@whizzzkid : I assume we'll do another beta release once we have everything in we expect (e.g., updated docs, MV2 to MV3 migration, metrics). I'll try using the beta again at that point.
@whizzzkid : from a sequencing regard, it may be advantageous to do https://github.com/ipfs/ipfs-companion/issues/1227 sooner than later as this one may take more time to validate that it's working as expected.
Introduction
The original issue is more than four years old, and the community has watched the MV3 saga unravel, but the reality in 2023 is that we need to tackle this head-on to provide a path forward for users that rely on Companion on such browsers.
🧪 Beta Testers Needed
Read Discuss Post: https://discuss.ipfs.tech/t/announcing-ipfs-companion-mv3-rc-beta/16442
Beta Webstore Link: here
Report Beta Bugs: here
The Problem
MV3 in itself is not the issue; it's an innocent change going forward. The problem lies with the webRequest API going forward:

The Proposed Solution
The idea, in simpler terms, is to branch the builds such that we can produce a build for clients that support the blocking webRequest API and one for the clients that do not (e.g. Firefox will support blocking webRequest even with the MV3 change). This boils down to having two implementations of the webRequest handling:

Blocking WebRequest Build
Non-Blocking WebRequest Build
Identifying the nature of webRequest from code will be hard, but this can be spiked out to validate whether it is something we can deduce in code; otherwise we can split the build process such that we build artifacts based on which variation of the webRequest API the client supports.

The Non-Blocking WebRequest build will ship with an extra feature, which would:

- Use the declarativeNetRequest API to update the URLs to be blocked up to the permissible limit.
- declarativeNetRequest API.

Known Risks
Metrics Collection
Related Issues and PRs:
Deadline
~~March, 2023~~ ETA: 2023-06-30
Dev Plan
Bugs
Release Plan
Post MV3 Tasks
Reviewers