Open Rob--W opened 2 years ago
Similar proposal: https://github.com/w3c/webextensions/issues/58
I would be in favor of this for Safari.
I'm in favour of this proposal. In addition to the consistency issue that was pointed out in the meeting (i.e. if the extension and browser have different versions of the PSL, there's a potential for security issues), there are some other arguments for having this exposed as an API:
I'd be in favor of this for Chrome.
To proceed with this issue, we need a more concrete proposal. Some points that a more fleshed out proposal would address include:
I'll follow up to see if Mozilla's multi-account-containers maintainers want to put up a proposal for the api shape here.
@oliverdunk is going to reach out to PSL maintainers to inform them of our intent to offer this API.
> some other arguments for having this exposed as an API:
Just to reiterate the points from my original request (#58) which covers many of the same arguments:
Issues with the current approach include:
- Needing to add a ~100KB data file to the extension. (See this file I've used as an example: https://github.com/AiondaDotCom/trashmail-addon/blob/master/public_suffix.json)
- Needing to pre-process the file to allow the search to be efficient. (For example: https://github.com/AiondaDotCom/trashmail-addon/blob/master/update_suffixes.py)
- Needing to write an efficient algorithm for searching the dataset in order to avoid slowing down page loads (which many developers probably wouldn't know how to do). (For example: https://github.com/AiondaDotCom/trashmail-addon/blob/master/publicsuffixlist.js)
- Needing to update the extension regularly just to keep the suffix list up-to-date.
- As there are "naughty" words in the suffix list, this tends to trip automated checks and forces the extension to have a manual review.
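For anyone reading along, the pre-processing and lookup work described in the bullets above can be sketched roughly as follows. This is only an illustration, not the code from the linked add-on: `SAMPLE_RULES`, `parseRules` and `getPublicSuffix` are names made up for this sketch, the rule set is a tiny toy sample rather than the real list, and it assumes wildcards only appear as the leftmost label and ignores IDN/punycode handling.

```javascript
// Toy rule set in PSL syntax (assumption: a real extension would load the full list).
const SAMPLE_RULES = `
// comment lines and blank lines are ignored
com
uk
co.uk
*.ck
!www.ck
`;

// Pre-process the rule text into sets so each lookup is a handful of hash probes.
function parseRules(text) {
  const exact = new Set();
  const wildcard = new Set();  // "*.ck" is stored as "ck"
  const exception = new Set(); // "!www.ck" is stored as "www.ck"
  for (const raw of text.split("\n")) {
    const rule = raw.trim().split(/\s+/)[0].toLowerCase();
    if (!rule || rule.startsWith("//")) continue;
    if (rule.startsWith("!")) exception.add(rule.slice(1));
    else if (rule.startsWith("*.")) wildcard.add(rule.slice(2));
    else exact.add(rule);
  }
  return { exact, wildcard, exception };
}

function getPublicSuffix(hostname, rules) {
  const labels = hostname.toLowerCase().replace(/\.$/, "").split(".");
  // Every suffix of the hostname, longest first: "a.b.c", "b.c", "c".
  const candidates = labels.map((_, i) => labels.slice(i).join("."));
  // Exception rules win over everything: the suffix is the rule minus its leftmost label.
  for (let i = 0; i < candidates.length; i++) {
    if (rules.exception.has(candidates[i])) return labels.slice(i + 1).join(".");
  }
  // Otherwise the longest matching rule wins.
  for (let i = 0; i < candidates.length; i++) {
    if (rules.exact.has(candidates[i])) return candidates[i];
    if (i + 1 < candidates.length && rules.wildcard.has(candidates[i + 1])) return candidates[i];
  }
  // No rule matched: the implicit default rule "*" applies, i.e. the last label.
  return labels[labels.length - 1];
}

const rules = parseRules(SAMPLE_RULES);
console.log(getPublicSuffix("shop.example.co.uk", rules));
console.log(getPublicSuffix("foo.bar.ck", rules));
console.log(getPublicSuffix("www.ck", rules));
```

Even this small version has to get the `!` and `*` rules and the default rule right, which is the kind of detail a built-in API would take off extension authors' hands.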
> This issue is to track use cases and the desired shape of the API.
As per the title of my original request, I think the most common scenario is to get the organisational domain, rather than the suffix alone. So, to save some manual string manipulation, it would be great for the API to include a function to get the organisational domain.
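To illustrate the string manipulation in question: the organisational domain is the public suffix plus one more label to its left. A minimal sketch, assuming some suffix lookup is already available. `getOrganisationalDomain` and `toySuffixLookup` are hypothetical names for this example, not a proposed API shape, and the toy lookup only knows three suffixes.

```javascript
// Hypothetical stand-in for a real suffix lookup; it only knows three suffixes.
function toySuffixLookup(hostname) {
  const known = ["co.uk", "com", "uk"]; // longest first so "co.uk" beats "uk"
  return known.find((s) => hostname === s || hostname.endsWith("." + s)) || null;
}

// Returns the public suffix plus one label, or null if there is no such label.
function getOrganisationalDomain(hostname, lookupSuffix) {
  const host = hostname.toLowerCase().replace(/\.$/, "");
  const suffix = lookupSuffix(host);
  // Unknown suffix, or the hostname is itself a public suffix.
  if (suffix === null || host === suffix) return null;
  // Strip ".<suffix>" and keep only the label directly to its left.
  const rest = host.slice(0, host.length - suffix.length - 1);
  const label = rest.split(".").pop();
  return label + "." + suffix;
}

console.log(getOrganisationalDomain("mail.example.co.uk", toySuffixLookup));
console.log(getOrganisationalDomain("co.uk", toySuffixLookup));
```

Returning `null` when the hostname is itself a public suffix is just one possible choice; an actual API would need to define what happens in that case.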
I have started an email thread with the maintainers of the PSL - will keep this thread updated.
Such a proposal should contain some guidance on what to do when the result changes, or at least a warning. This is a rare event so it is prone to getting overlooked.
As mentioned in a recent meeting, I met with Simon Friedberger (Mozilla) and Simone Carletti, both PSL maintainers. They were generally very supportive and would like to see this API. We agreed that introducing an extension API is unlikely to generate a significant number of additions to the list, since developers are already using the PSL in other ways today, but that, while volume is not a concern, any education to maintain submission quality would be appreciated.
We also discussed several practical thoughts on the API signature / functionality:
Just as an idea for the API ....
Last year, I wrote a pure JavaScript PSL (Public Suffix List) module.
I looked at similar available methods in order to base the properties on.
The module outputs 4 values: `subdomain`, `domain`, `sld`, and `tld`.
Keeping the list up to date in browsers is important.
I would assume this part is already being done (maybe not in all browsers though..)? e.g. Firefox shows passwords on mail.google.com that were created at calendar.google.com or similar. I assume they must be using the PSL for such functionality.
> Last year, I wrote a pure JavaScript PSL (Public Suffix List) module.
I don't think that code is correct (it doesn't appear to handle `!` or `*` rules). There are several other examples online too.
Here's one I've done based on an existing solution, which, in theory, should be a lot more optimised for performance: https://github.com/AiondaDotCom/trashmail-addon/blob/master/publicsuffixlist.js
However, it does require preprocessing the list into a more optimal format for querying first: https://github.com/AiondaDotCom/trashmail-addon/blob/master/update_suffixes.py (Result: https://github.com/AiondaDotCom/trashmail-addon/blob/master/public_suffix.json)
> I would assume this part is already being done (maybe not in all browsers though..)? e.g. Firefox shows passwords on mail.google.com that were created at calendar.google.com or similar. I assume they must be using the PSL for such functionality.
All browsers include the PSL (it is required for things like cookie handling), but updates aren't necessarily as frequent as would be ideal. I can only speak for Chrome where I understand it is currently a manual process we run every ~6 months.
On the Firefox side, each build ships with a copy that is up-to-date at time of build, I believe. The update process for the source code is automated, cf. the commit log for the data file: https://hg.mozilla.org/mozilla-central/log/tip/netwerk/dns/effective_tld_names.dat . There was some attempt in the past to be able to update out-of-release-band (like safebrowsing and other similar services that update more frequently than the standard release cadence) but I think that stalled once we hit issues with how this changed origin parsing/serialization (and doing so while a multi-process browser is running while keeping all the processes aligned on that change is... not trivial). Cf. https://twitter.com/ValentinGosu/status/1510295473864728581
From my side maintaining a list in an extension, I'm satisfied if I remember to update once a year, so even a 6 monthly update seems good to me and a big improvement.
> From my side maintaining a list in an extension, I'm satisfied if I remember to update once a year, so even a 6 monthly update seems good to me and a big improvement.
It also saves you from pushing it to all the add-on stores. Granted, most of the time it's a minor change that gets quickly reviewed and accepted. However, there's always a risk that things take more time or that something else comes up.
With a builtin API, we're just giving more peace of mind to the addon developers :).
FYI I asked the contributor who submitted a patch to Firefox before whether they're interested in creating a proposal according to our proposal process: https://bugzilla.mozilla.org/show_bug.cgi?id=1315558#c27
The public suffix list is a database of effective top-level domains (eTLDs), which are the public suffixes of URLs. This database is included in browsers (at least in Firefox, Chrome and Safari - sources below) and may be updated remotely. There have been feature requests for an API that allows extensions to identify the public suffix (eTLD) of a given URL:
There are some known problems with the application of the public suffix list (https://github.com/sleevi/psl-problems), but that does not necessarily rule out an extension API with such access. Extensions that need the public suffix list currently have to rely on alternatives, such as bundling the database with the extension, at the risk of having incompatible interpretations of the "public suffix of a URL" between the browser and the extension. With proper documentation of the problems associated with the public suffix list, extension authors can make a conscious decision to use the API when they need to.
This issue is to track use cases and the desired shape of the API. For example, the following would be the minimum:
Here are other examples of APIs to query the public suffix: