WICG / cookie-store

Asynchronous access to cookies from JavaScript
https://wicg.github.io/cookie-store/
Apache License 2.0
144 stars 35 forks source link

Domain Attribute Validation, Departure from RFC6265bis #234

Open dpchamps opened 3 months ago

dpchamps commented 3 months ago

Greetings folks 👋

Ran into this error path today, initially I thought it was a bug / oversight in the chromium implementation but as I dug into it more it seems like an explicit decision made by the authors of this specification.

RFC 6265bis Section 4.1.2.3 states,

(Note that a leading %x2E ("."), if present, is ignored even though that character is not permitted...

It looks like this part of the spec is explicitly ignored in web-platform-test suites, and the leading . is treated an an error path:

That leads me to believe it was a conscious decision.

I can not find any discussions on why that decision was made. My gut reaction is that this validation should not take place and depart from the specification. Happy to provide a more compelling argument if yall are willing to entertain it, but first wanted to know the history behind the decision.

annevk commented 3 months ago

Syntax decisions in the RFC have no bearing on API decisions here. It's pretty common for syntax to silently ignore something that a corresponding API rejects. You'll need a more compelling argument.

dpchamps commented 3 months ago

Absolutely! I wasn't expecting you to accept it at face value. I was hoping to understand the decision first -- I'm willing to accept that my reaction is wrong.

I see this called out in the chromium codebase:

// The leading dot (".") from the domain attribute is stripped in the // Set-Cookie header, for compatibility. This API doesn't have compatibility // constraints, so reject the edge case outright.

Does that represent the thought process, or was there more to it?

annevk commented 3 months ago

That seems right. Not allowing it also makes it clearer to web developers that there's no meaning tied to the leading dot, hopefully encouraging them to slowly remove it everywhere.

Aside: the document does seem to contain a mistake in referring to the dot as U+002D rather than U+002E.

inexorabletash commented 3 months ago

Not allowing it also makes it clearer to web developers that there's no meaning tied to the leading dot, hopefully encouraging them to slowly remove it everywhere.

That matches what I remember of our internal discussion - error early rather than silently ignore, since we don't have compat constraints with a new API.

Aside: the document does seem to contain a mistake in referring to the dot as U+002D rather than U+002E.

Oops, good catch.

dpchamps commented 3 months ago

Cool thanks! Apologies in advance for writing a book.

Let me try summarize the strongest version of the argument as I understand it:

It is a good thing to throw an error when a leading . character is present in the Domain Attribute. There are a few primary motivations to fail fast and explicitly:

  1. The leading dot has no significance in modern browsers. Developers should know that it is meaningless, and the best way to signal that is to explicitly throw the error.
    1. The API should throw as a forcing function to motivate the removal of leading dots in cookie Domain Attributes
  2. The leading dot is not technically permitted by the grammar of the cookie RFC, but nonetheless it is tolerated due to historical constraints. Namely: the leading dot was once significant, but that significance was superseded by later revisions. The Cookie Store API is net-new, and therefore is not beholden to the backward compatibility expected in other APIS.
  3. Finally, just because the cookie RFC specifies a more permissive syntax, the API is not precluded from specifying a less permissive syntax in order to enforce expected and correct results.

What is the least surprising behavior?

Given the above argument, it could be interpreted that throwing produces less surprising results. That's to say: silently eliding the leading dot is surprising behavior that the API should not engage in.

However, I claim that the opposite is true. Given that existing apis in the browser and back-end permit the leading dot, the restriction in the current api forces developers to be cognizant of a difference in this api as opposed to other apis for effectively the same operation. That satisfies criteria for surprising behavior: the spec permits it, most existing apis permit it, yet the new api doesn't.

Note: this is not an appeal for behavior parity with document.cookie or [insert other api that does it differently]. Instead, I want to frame it under the principal of least astonishment. Given that A) there is no error condition associated with the leading dot, and B) that the underlying specification explicitly defines what to do with the leading dot, throwing an error in the API is counter intuitive to the way most users expect the API to behave.

What is the role of data validation?

Generally: error paths in data validation should be reserved for unexpected inputs that could potentially lead to unexpected results. This is captured nicely in the following issues where the spring framework authors are reverting the same kind of validation:

Summarized nicely:

...we should be strict with output and lenient with input.

That principal is applicable to the conversation at hand as there is no immediate threat of unbounded input from the leading dot. In this case, we know explicitly what to do with the leading dot, because the specification anticipates it.

It is inappropriate to use an error to correct potential misunderstandings of generally accepted syntax particularly when that misunderstanding has no net negative impact to performance, security or user experience.

Given that the API is surprising without a strong justification for why it should be surprising AND ALSO in the absence of any real error, the validation is merely inconvenient to developers that are attempting to adopt the new API which has the following stated goal of replacing document.cookie

Suggestion is to remove validation for the leading dot in order to provide a more robust API that users find intuitive.

Aside (anecdotal): We ran into this at eBay as we are authoring async libraries for cookie manipulations. Users are being caught off guard while we begin to support both apis and aim at transitioning as more browsers gain support for Cookie Store. It's generating a bit of churn. We want to limit the surface area of changes necessary for adoption as our developers begin to incrementally adopt.

annevk commented 3 months ago

It's not at all clear to me that the leading dot does not result in confusion. For that we'd need to understand why folks are adding it in the first place.

dpchamps commented 3 months ago

Addressing the claim directly,

Allowing the leading dot may result in confusion.

  1. Users expect the leading dot to be permitted irrespective of the new CookieStore api.
    1. The browser is only a small subset of how developers will interact with cookies
  2. When a user wants to know something about cookie implementation details, they don't look at the docs for CookieStore, or CookieStore internals. As is called out on the README of this repo, "The best resource for understanding the deep technical aspects of cookies is the most recent draft of RFC 6265bis."

That permitting the leading dot might generate confusion is a moot point. The state of the world as it is today is that all major cookie apis accept the leading dot because it is specified in the underlying cookie RFC.

From an API design standpoint, I will push back: we do not need to understand the motivations at all. Instead, please consider more deeply the arguments provided above:

We must understand how the new API exists in the current ecosystem of technologies, and how potential users expect the new API to function in relation to these existing technologies.

I propose that a leading dot is generally accepted syntax -- which also happens to be supported by the Cookie RFC. Departing from that generally accepted syntax should require strong justification, evaluated under some criteria of known real world performance degradation, potential threats to security or known user error conditions.

In the absence of that strong justification, the API is simply different (which is surprising in a bad way) and inconvenient by applying unnecessary data validation to restrict generally accepted syntax.

annevk commented 3 months ago

Is there some way in which you are calling this API that makes this requirement prohibitive?

dpchamps commented 3 months ago

The argument as presented has yet to be acknowledged. It's a bit difficult to gauge whether or not it's being received in earnest.

As I already mentioned, eBay developers ~found~ are finding the validation requirement confusing during integration. Devs are working around it -- mostly because in a bigger ecosystem they may not have the privilege of directly changing these base values: they exist somewhere else.

For the integration, devs either have to perform the coercion that the api should be providing to drop the leading dot, or they have to go change the value, which may be owned by a separate team and requires additional validation checks (even if we all agree that the leading dot does nothing), which boils down to additional time. Also, it's not great that we have to reach out to these other places and make changes to adopt a browser api.

So in practice, cookie store is being wrapped in another api to handle the leading dot. I don't want to see this wrapper. And I don't think it's the intent of authors to have this kind of wrapper propagate.

You can see other arguments from devs in the wild in the issues I linked above.

Just want to reiterate: the leading dot currently exists and is used. As long as the specification permits it, it's likely to continue to exist irrespective of API decisions here.

annevk commented 3 months ago

I don't think I'm persuaded as a wrapper is not that hard to write and there is benefit in teaching people that a trailing dot doesn't help with any kind of scoping as one might imagine.

And generally for modern formats and APIs the predominant principle is to not be lenient when it comes to mistakes.

dpchamps commented 3 months ago

And generally for modern formats and APIs the predominant principle is to not be lenient when it comes to mistakes.

It's only a mistake if you reject the decision behind the semantics and algorithm for parsing the domain attribute definition outlined in the RFC. Which as I understand it is guided by the acknowledgment that setting a cookie may not fully originate from one "source" (e.g. to facilitate interoperability)

If you think the RFC is wrong, the place for the decision arguably lies there.

Syntax decisions in the RFC have no bearing on API decisions here.

They should where possible, for general coherence.

Finally, in reference to the predominant principles for modern formats. Surely the api design decision has failed when the recommendation is, "it's preferable to use a wrapper to interact with our api to form a specification-valid construction." Basically acknowledging that it's unusable, but an acceptable sacrifice on principle.

The RFC acknowledges this,

User agents use this algorithm so as to interoperate with servers that do not follow the recommendations in Section 4

And as called out above in the use-case, the same applies for CookieStore. Clients are not integrating in a vacuum, regardless of perceived absence of compatibility constraints.

annevk commented 3 months ago

The RFC clearly says it's a mistake. It's a mistake that's ignored, but a mistake nonetheless.

And yes, you might need a wrapper if you have a lot of legacy code passing around domains with leading dots.

dpchamps commented 3 months ago

The argument boils down to, "the api should be more difficult to use in real world scenarios because of some unspecified appeal to modern api principles."

Which is, IMO an unacceptable stance from standards authors and also not in the spirit of the web.

But you've spoken, and have ruled no. So I think this issue is settled. Thanks for the convo.

annevk commented 3 months ago

That's certainly an interpretation. I'd be happy to leave this open to see if other people are also running into this. I think there's certainly a threshold where leniency could be the answer.