whizzzkid opened 1 year ago
This is a great writeup, thanks a ton @whizzzkid! I have a few comments and questions.
Use declarativeNetRequest API to update the URLs to be blocked up to the permissible limit.
It looks like that number is 50 for static rulesets, which shouldn't count against calls to updateDynamicRules: https://developer.chrome.com/docs/extensions/reference/declarativeNetRequest/#property-MAX_NUMBER_OF_STATIC_RULESETS
And 5000 for dynamic + session rules: https://developer.chrome.com/docs/extensions/reference/declarativeNetRequest/#property-MAX_NUMBER_OF_DYNAMIC_AND_SESSION_RULES
An LFU cache can be biased during a heavy browsing session, which could make some URLs sticky. A good approach could be a hard-coded TTL to evict entries and manage the bias.
Was the caching strategy discussed anywhere? I'd like to ensure we stick to a well-known caching algorithm so we fully understand the pros/cons. LFU has very clear cons that could easily be triggered if a user fills up the queue with requests. Have we looked into LRFU? LRFU may be better defined than "LFU + TTL", though no such implementation exists in TypeScript/JavaScript that I could find quickly.
How can the work at #1127 help us ensure we use the best caching policy?
LRU cache is not an option as that would evict staple URLs if not used regularly.
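The "LFU + hard-coded TTL" idea above could look roughly like the following sketch. This is not an existing Companion implementation; the class name, the TTL value, and the injectable clock are all illustrative choices, but they show how a TTL cap keeps heavy sessions from making entries permanently sticky:

```javascript
const TTL_MS = 10 * 60 * 1000 // assumed value: 10 minutes

class LfuTtlCache {
  constructor (capacity, now = Date.now) {
    this.capacity = capacity
    this.now = now // injectable clock, so eviction is testable
    this.entries = new Map() // key -> { value, hits, createdAt }
  }

  get (key) {
    const entry = this.entries.get(key)
    if (entry == null) return undefined
    if (this.now() - entry.createdAt > TTL_MS) { // expired: evict on read
      this.entries.delete(key)
      return undefined
    }
    entry.hits++
    return entry.value
  }

  set (key, value) {
    if (!this.entries.has(key) && this.entries.size >= this.capacity) {
      // evict an expired entry first, otherwise the least-frequently-used one
      let victim = null
      for (const [k, e] of this.entries) {
        if (this.now() - e.createdAt > TTL_MS) { victim = k; break }
        if (victim == null || e.hits < this.entries.get(victim).hits) victim = k
      }
      this.entries.delete(victim)
    }
    this.entries.set(key, { value, hits: 0, createdAt: this.now() })
  }
}
```

The TTL check on read means even a very "hot" URL gets re-validated periodically, which is exactly the bias control the TTL is meant to provide.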
We could use cache for the sessionRules and leave the staple URLs as "dynamicRules": https://developer.chrome.com/docs/extensions/reference/declarativeNetRequest/#dynamic-and-session-scoped-rules
Have we looked into updating/modifying any CID/ipfs/ipns links on a page to use some URL that we know we have a declarativeNetRequest rule for? That could be a way to reduce the scope and limits of cache and rules. Something like:
¹ I'm not sure how difficult this is but I imagine it already hasn't been done for some reason. MV3 may provide more reasons to look at this though.
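A hedged sketch of that link-rewriting idea: normalize `ipfs://`/`ipns://` links to a gateway URL that a declarativeNetRequest rule already covers. `PREFERRED_GATEWAY` and the helper name are assumptions, not existing Companion code:

```javascript
// Assumed: a gateway host that our declarativeNetRequest rules already match.
const PREFERRED_GATEWAY = 'https://dweb.link'

function normalizeIpfsHref (href) {
  const match = href.match(/^(ipfs|ipns):\/\/([^/?#]+)([^#]*)(#.*)?$/)
  if (match == null) return null // not an ipfs:// or ipns:// link
  const [, scheme, root, rest = '', hash = ''] = match
  return `${PREFERRED_GATEWAY}/${scheme}/${root}${rest}${hash}`
}

// In a content script, the rewrite pass would then be roughly:
// for (const el of document.querySelectorAll('a[href^="ipfs://"], a[href^="ipns://"]')) {
//   const normalized = normalizeIpfsHref(el.href)
//   if (normalized != null) el.setAttribute('href', normalized)
// }
```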
Thank you @whizzzkid, plan lgtm, awesome to see this moving forward :raised_hands:
Some thoughts:
We already produce different manifest.json for Firefox and Chromium, and swap it during final packaging. Everything else should be possible to detect at runtime (we have to, because we use the same Chromium build for Google Chrome and Brave).
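The runtime detection described here could be sketched like this. The function takes browser-like globals as parameters so the logic is testable; in the extension it would be called with the real `navigator` and `chrome`/`browser` objects. The returned field names are illustrative, though `navigator.brave.isBrave` and `chrome.declarativeNetRequest` are real APIs:

```javascript
function detectRuntime (navigatorObj, browserApi) {
  return {
    // navigator.brave.isBrave() is exposed only in Brave builds of Chromium
    isBrave: typeof navigatorObj?.brave?.isBrave === 'function',
    // MV3 Chromium exposes declarativeNetRequest; if it's absent we can
    // assume the blocking webRequest code path (e.g. Firefox) is available
    hasDeclarativeNetRequest: browserApi?.declarativeNetRequest != null
  }
}
```

Usage in the extension would be `detectRuntime(navigator, globalThis.chrome ?? globalThis.browser)`.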
@SgtPooki injecting JS on every page and modifying page content makes browsing slower, especially when you dynamically inspect every newly created DOM branch. That is why things like linkification were abandoned (and are only opt-in experiments). Try other things first; this is a last resort, and even if we do this, it won't work for XHR and fetch done via JS on pages.
In case it is useful: in the new bifrost-gateway we are using a basic 2Q cache:
2Q is an enhancement over the standard LRU cache in that it tracks both frequently and recently used entries separately. This avoids a burst in access to new entries from evicting frequently used entries.
Most of the work will be creating a new test suite that tests against both the old blocking webRequest implementation and the new declarativeNetRequest one, in a way that is easy to maintain going forward.
injecting JS on every page, and modifying page content makes browsing slower, especially when you dynamically inspect every newly created DOM branch. ... [won't] work for XHR and fetch done via JS on pages.
Any performance degradation could be reduced to ~nil fairly easily.
The callout about XHR and fetch requests is the gotcha I figured existed but couldn't think of at the time. Another set of targets this wouldn't easily catch is event listeners on various DOM elements that trigger navigation.
Either way, we don't necessarily need to modify the dom, just read the values, but even modifying the dom shouldn't be an observable hit.
```js
performance.mark('test-start'); // just for benchmarking
Array.from(document.querySelectorAll('a[href]')).forEach((el) => { el.setAttribute('href', el.href + '#foobar') }); // modify the dom
performance.mark('test-end'); // just for benchmarking
performance.measure('test-duration', 'test-start', 'test-end').duration;
// 1 ms on this page, with ~200 links, and no filtering out of "matched" links.
```
Note that this is the least performant version (also contrived) and it's still only 1ms. Making it non-blocking might increase it by ~20%, but would reduce any UX impact on users.
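The non-blocking variant could be sketched as chunked processing that yields between batches. The scheduler is injected (it would be `requestIdleCallback` in the page) so the chunking logic itself is testable; the function name and chunk size are illustrative:

```javascript
function processInChunks (items, processOne, schedule, chunkSize = 50) {
  let index = 0
  function step () {
    const end = Math.min(index + chunkSize, items.length)
    for (; index < end; index++) processOne(items[index])
    if (index < items.length) schedule(step) // yield before the next chunk
  }
  step()
}

// In a content script, roughly:
// processInChunks(
//   Array.from(document.querySelectorAll('a[href]')),
//   rewriteLink, // hypothetical per-link handler
//   (fn) => requestIdleCallback(fn))
```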
Not modifying the dom: what we could do is simply parse any a[href] links and add those to the dynamicRules to get a "jump" on any links a user might click. This doesn't solve the "crux" of the MV3 problem, but it should help mitigate the IPFS ecosystem UX impact.
I think a simple update on page load could help reduce cache misses by pre-loading appropriate URLs with:

```js
Array.from(document.querySelectorAll('a[href]')).forEach((el) => { updateRuleForPotentialIpfsLink(el.href) }); // calls updateDynamicRules if appropriate
```
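`updateRuleForPotentialIpfsLink` is hypothetical; one way to sketch its core is a pure function that extracts an `/ipfs/` or `/ipns/` path and builds a declarativeNetRequest redirect rule for it. The rule-id range and the local gateway address are assumed values, not Companion defaults:

```javascript
let nextRuleId = 1000 // assumed: an id range reserved for these pre-warmed rules

function ruleForPotentialIpfsLink (href) {
  const match = href.match(/\/(ipfs|ipns)\/([^/?#]+)/)
  if (match == null) return null // not an IPFS/IPNS link, nothing to do
  const [, ns, root] = match
  return {
    id: nextRuleId++,
    priority: 1,
    action: {
      type: 'redirect',
      // \1 carries over the rest of the path captured by regexFilter
      redirect: { regexSubstitution: `http://localhost:8080/${ns}/${root}\\1` }
    },
    condition: {
      regexFilter: `^https?://[^/]+/${ns}/${root}([^#]*)`,
      resourceTypes: ['main_frame']
    }
  }
}

// In the extension, the page-load hook would then be roughly:
// const rules = Array.from(document.querySelectorAll('a[href]'))
//   .map((el) => ruleForPotentialIpfsLink(el.href))
//   .filter((r) => r != null)
// chrome.declarativeNetRequest.updateDynamicRules({ addRules: rules })
```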
There does seem to be a 2q impl in javascript, but it's 4 yrs old and stale: https://www.npmjs.com/package/2q-cache
The only package I found that mentions LRFU is weak-lru-cache, which is ESM, TypeScript, and fairly active: https://www.npmjs.com/package/weak-lru-cache
NOTE: weak-lru-cache supports pinning to the cache:
The expirationPriority can also be set to -1, indicating that the value should be pinned in memory and never expire, until the entry has been changed (with a positive expirationPriority).
Answering some of my own questions w.r.t. https://github.com/ipfs/ipfs-companion/issues/1127: Once we migrate to MV3 I think we should add some additional metrics for:

- Rules added via updateDynamicRules that were never triggered/used again -- RESULT: MV3 caused a large degradation in the IPFS ecosystem experience.
- Rules added via updateDynamicRules that were triggered/used again -- RESULT: MV3 caused only a light degradation to this user's experience.

@SgtPooki
Was the caching strategy discussed anywhere?
LRFU with localStorage persistence is what I'm looking for. But as you have already discovered, there isn't any decent implementation for that. I think it would involve another quick spike, will add to https://github.com/ipfs/ipfs-companion/issues/1155
We could use cache for the sessionRules and leave the staple URLs as "dynamicRules"
yep!
@lidel
we have to, because we use the same Chromium build for Google Chrome and Brave
So Brave's stance is a bit confusing here: even though they assure that MV2 will be supported even if Chromium moves to MV3, how will they provide support through the Chrome Web Store, which will stop accepting MV2 extensions starting in June? That means even though they might "technically" support MV2, the store would only carry MV3 extensions. At that point they can either build their own web store (in which case we can have a separate build) or support Companion out of the box, in which case we can definitely have a separate build. I am not too keen on shipping the detection logic with the extension; we can keep it simple by just building a different asset which can have pluggable logic to handle the webRequests.
In case it is useful, in new bifrost-gateway we are using a basic 2Q cache
2Q sounds interesting, looks like someone has an implementation but no active contributions.
So Brave's stance is a bit confusing on this [..]
Yes, it is as confusing as you think, maybe even worse? :) Brave == Chromium build + runtime detection to enable Brave-specific things. Always been this way.
"Technically" you may have MV2 APIs available for a while, but you won't have a store that accepts extensions that need to explicitly request them via the manifest (Chrome Web Store will reject them). Brave does not plan to have its own store for extensions, so you are stuck with Chrome Web Store's manifest limitations. And don't be naive: if APIs removed from the upstream Chromium codebase are too expensive to keep supporting, Brave will make the tough call and remove them (as soon as the percentage of extensions that need them falls below an acceptable threshold, and who would use these APIs if Brave users can't install your extension from the top search result, which is the Chrome Web Store?).
Things change a lot in 5-10yr timeframes, and nothing is set in stone. Not that long ago we did not have Chromium-based Edge, and initial Brave was based on a fork of Electron. At one point we had companion backed by js-ipfs with TCP transport provided by chromium.sockets from ChromiumOS. Today is not crazier than yesterday, just different.
Browser landscape changes, and Companion needs to be nimble enough to stay afloat. We need to plan accordingly, and that is why runtime detection is the only future-proof play on our end.
Creating a build pipeline around a temporary landscape will be a pain to maintain. Please don't complicate it beyond the current state, where we have a single codebase with one manifest for Firefox and one for Chromium, and everything else is decided at runtime.
2Q sounds interesting, looks like someone has an implementation but no active contributions.
We could implement it by hand, by adding a second cache that follows frequency. For recency, you already have standard LRU.
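A hand-rolled sketch of that simplified 2Q idea: new keys land in a small FIFO "recent" queue, and only keys that are hit a second time get promoted into the main LRU, so a one-off burst of new URLs cannot evict the hot entries. Class name and sizes are illustrative, not an existing implementation:

```javascript
class TwoQueueCache {
  constructor (recentSize, mainSize) {
    this.recentSize = recentSize
    this.mainSize = mainSize
    this.recent = new Map() // FIFO of keys seen once
    this.main = new Map() // LRU of keys seen more than once
  }

  get (key) {
    if (this.main.has(key)) { // LRU touch: move to most-recent position
      const value = this.main.get(key)
      this.main.delete(key)
      this.main.set(key, value)
      return value
    }
    if (this.recent.has(key)) { // second access: promote into the main LRU
      const value = this.recent.get(key)
      this.recent.delete(key)
      if (this.main.size >= this.mainSize) {
        this.main.delete(this.main.keys().next().value) // evict LRU entry
      }
      this.main.set(key, value)
      return value
    }
    return undefined
  }

  set (key, value) {
    if (this.main.has(key)) { this.main.set(key, value); return }
    if (this.recent.size >= this.recentSize && !this.recent.has(key)) {
      this.recent.delete(this.recent.keys().next().value) // FIFO eviction
    }
    this.recent.set(key, value)
  }
}
```

This relies on `Map` preserving insertion order, which is what makes both the FIFO queue and the LRU re-insertion trick work.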
discuss.ipfs.tech post: https://discuss.ipfs.tech/t/announcing-ipfs-companion-mv3-rc-beta/16442
@whizzzkid : is there a new release in the beta webstore we can share here and in the discuss post?
@BigLep the latest beta is live: https://chrome.google.com/webstore/detail/ipfs-companion-rc-mv3-bet/hjoieblefckbooibpepigmacodalfndh
Adding it to the main post and discuss post.
@whizzzkid : I assume we'll do another beta release once we have everything in we expect (e.g., updated docs, MV2 to MV3 migration, metrics). I'll try using the beta again at that point.
@whizzzkid : from a sequencing regard, it may be advantageous to do https://github.com/ipfs/ipfs-companion/issues/1227 sooner than later as this one may take more time to validate that it's working as expected.
Introduction
The original issue is more than four years old, and the community has watched the MV3 saga unravel, but the reality in 2023 is that we need to tackle this head-on to provide a path forward for users that rely on Companion on such browsers.
🧪 Beta Testers Needed
Read Discuss Post: https://discuss.ipfs.tech/t/announcing-ipfs-companion-mv3-rc-beta/16442
Beta Webstore Link: here
Report Beta Bugs: here
The Problem
MV3 in itself is not the issue; it's an innocent change going forward. The problem lies with the webRequest API going forward:

The Proposed Solution
The idea, in simpler terms, is to branch the builds such that we can produce a build for clients that support the blocking webRequest API and one for the clients that do not (e.g. Firefox will support blocking webRequest even with the MV3 change). This boils down to having two implementations of the webRequest handling:

Blocking WebRequest Build
Non-Blocking WebRequest Build
Identifying the nature of webRequest from code will be hard, but this can be spiked out to validate whether it is something we can deduce in code; otherwise we can split the build process such that we build artifacts based on which variation of the webRequest API the client supports.

The Non-Blocking WebRequest build will ship with an extra feature, which would:

- Use the declarativeNetRequest API to update the URLs to be blocked up to the permissible limit.
- declarativeNetRequest API.

Known Risks
Metrics Collection
Related Issues and PRs:
Deadline
~~March, 2023~~ ETA: 2023-06-30
Dev Plan
Bugs
Release Plan
Post MV3 Tasks
Reviewers