Exposing additional cookie data and metadata to websites?

bsittler commented 7 years ago

Current APIs for reading cookies have varying levels of data and metadata exposure.

Level 1 "browser extension". Current browser extension APIs expose path, domain, expires, secure flag, and httponly flag to readers/watchers, and allow extensions to read httponly cookies and determine causes and fairly exact timing of evictions using change-monitoring APIs. Failed writes may result in exceptions or callbacks indicating the reason the operation failed.

Level 2 "web server". On the other hand, current HTTP Cookie request headers reveal only name and value for all in-scope cookies at a particular earlier instant in time unknown to the server, serialized in a conventional order (varying only slightly across commonly used browsers) allowing some degree of intelligent disambiguation when name shadowing occurs at different in-scope domain+path+name combinations. There is no mechanism beyond cross-request interferometry and inference for determining whether an operation failed, and no indication of why.

Level 3 "HTML". Reading document.cookie reveals (at least in most browsers) the non-HttpOnly subset of the information revealed to web servers by Cookie headers at some unknown instant after the start of the property access and before the returned serialization is used. Power-inefficient polling change monitors can reveal more timing information but only with relatively low precision. There is no mechanism beyond multiple-read interferometry and inference for determining whether an operation failed, and no indication of why.
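To make the Level 3 polling case concrete, here is a minimal sketch of what a polling change monitor can recover from two successive document.cookie reads. The helper names `parseCookieString` and `diffSnapshots` are hypothetical, and keeping only the first occurrence of a shadowed name is an assumption made for brevity.

```javascript
// Hypothetical helpers for illustration; not part of any proposed API.

// Parse a document.cookie-style string ("a=1; b=2") into [name, value] pairs.
// Only name and value are recoverable; path, domain, expiry and flags are not.
function parseCookieString(serialized) {
  if (serialized.trim() === "") return [];
  return serialized.split(";").map((pair) => {
    const trimmed = pair.trim();
    const eq = trimmed.indexOf("=");
    // Assumption: a pair without "=" is treated as a nameless value.
    return eq === -1 ? ["", trimmed] : [trimmed.slice(0, eq), trimmed.slice(eq + 1)];
  });
}

// Compare two snapshots and report the net difference between them.
function diffSnapshots(before, after) {
  const toMap = (serialized) => {
    const map = new Map();
    for (const [name, value] of parseCookieString(serialized)) {
      // Assumption: on name shadowing, keep only the first occurrence.
      if (!map.has(name)) map.set(name, value);
    }
    return map;
  };
  const a = toMap(before);
  const b = toMap(after);
  const changes = [];
  for (const [name, value] of b) {
    if (!a.has(name)) changes.push({ type: "added", name, value });
    else if (a.get(name) !== value) changes.push({ type: "changed", name, value });
  }
  for (const name of a.keys()) {
    if (!b.has(name)) changes.push({ type: "removed", name });
  }
  return changes;
}

const changes = diffSnapshots("sid=abc; theme=dark", "sid=xyz; lang=en");
console.log(changes);
```

In a page, a timer would call something like `diffSnapshots(previous, document.cookie)` on each tick. Note that the diff cannot say why a cookie disappeared, and a cookie that is deleted and rewritten with the same value between two polls produces no change at all.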

Where is this API intended to fall relative to these existing interfaces?

If it reveals more data and/or metadata than document.cookie, is that fact spelled out explicitly in the description of the API? What are the privacy and/or security implications of the chosen level?

Is there a way for user agents to withhold the additional information, possibly later revealing it after e.g. a user grants permission to do so to a particular website? If so, is there a way for websites to determine how much additional permission/which (meta-)data they have access to?

Will e.g. a non-HttpOnly cookie being replaced by an HttpOnly cookie be distinguishable from expiration?

Will time-based expiration be distinguishable from expiration due to overwriting?

Will cookie storage quotas be exposed at all? If so, how?

bsittler commented 7 years ago

Also, given the varying time resolution of observation opportunities in those APIs, will change observations ever be quantized, batched, rate-limited or coalesced?

For instance, will a cookie briefly being expired/deleted but then being overwritten by a value identical to the value prior to the expiration always cause an event to fire?

Will all intermediate states since the previous event be reported, or only the most recent one? Will change observations be observable in the same order across tabs?

Will the same changes be reported consistently in all tabs using this API?

How much latitude does a browser implementing this API have to e.g. reduce power, memory and CPU consumption by rate-limiting and coalescing notifications?

Will multiple synchronous writes using document.cookie in a single event-handling context all be reported, or only the net effects of the sequence of writes?

patrickkettner commented 7 years ago

Where is this API intended to fall relative to these existing interfaces?

I guess level 0 - the same as document.cookie

If it reveals more data and/or metadata than document.cookie, is that fact spelled out explicitly in the description of the API?

It mentions (or at least did mention; I am writing on mobile and from memory) that the data available in an event is what is provided via document.cookie. The additional attributes that exist on the Cookie interface are meant for the set, rather than the get.

What are the privacy and/or security implications of the chosen level?

Identical to those of today's existing APIs.

Is there a way for user agents to withhold the additional information, possibly later revealing it after e.g. a user grants permission to do so to a particular website?

No, and I am not sure why there ever would be. Could you elaborate?

Will e.g. a non-HttpOnly cookie being replaced by an HttpOnly cookie be distinguishable from expiration?

No, given the current document it would not be, nor can I think of a reason why it should be. Can you? (That is, why would that information ever be useful in a script context?)

Will time-based expiration be distinguishable from expiration due to overwriting?

Yes. This is covered by the ChangeCause section. In your case, the former would be expired, and the latter would be an overwrite.

Will cookie storage quotas be exposed at all? If so, how?

No, as this is just sugar for the existing APIs and an event. Adding quota management is outside the scope of this, at least for level 1.

For instance, will a cookie briefly being expired/deleted but then being overwritten by a value identical to the value prior to the expiration always cause an event to fire?

Yes. It would be an expired event, followed by an overwrite event. If you mean something like document.cookie="foo=bar;expires=Thu, 01 Jan 1970 00:00:00 GMT"; document.cookie="foo=bar;expires=Thu, 31 Dec 2970 00:00:00 GMT"; then I can see the case for coalescing the events into a single one, but it would be a break from the existing implementations of the extension cookie.onchange event. @arronei, thoughts?
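As a rough illustration of the uncoalesced behaviour described in this reply, the sketch below uses a toy in-memory jar. `ToyCookieJar` and its methods are hypothetical, and the cause strings simply follow the wording used here ("expired", then "overwrite"); an actual specification might label a write to an absent cookie differently.

```javascript
// Toy in-memory cookie jar. Every name here is hypothetical; this is not the
// proposed API, only a model of "one event per modification".
class ToyCookieJar {
  constructor() {
    this.cookies = new Map(); // name -> value
    this.listeners = [];
  }

  onChange(listener) {
    this.listeners.push(listener);
  }

  emit(event) {
    for (const listener of this.listeners) listener(event);
  }

  // expiresMs is an absolute expiry time; nowMs is injected so the example
  // is deterministic instead of depending on the wall clock.
  set(name, value, expiresMs, nowMs) {
    if (expiresMs <= nowMs) {
      // A write with an expiry in the past removes the cookie.
      if (this.cookies.delete(name)) {
        this.emit({ name, removed: true, cause: "expired" });
      }
      return;
    }
    this.cookies.set(name, value);
    // Assumption: every successful write is labelled "overwrite", as in the reply.
    this.emit({ name, value, removed: false, cause: "overwrite" });
  }
}

const jar = new ToyCookieJar();
jar.cookies.set("foo", "bar"); // pre-existing cookie, seeded without an event

const events = [];
jar.onChange((event) => events.push(event));

const now = Date.UTC(2017, 0, 1);
jar.set("foo", "bar", Date.UTC(1970, 0, 1), now);   // expire it
jar.set("foo", "bar", Date.UTC(2970, 11, 31), now); // write the same value back

console.log(events.map((event) => event.cause));
```

The jar ends in the same state it started in, yet a listener sees two events. A coalescing implementation would report nothing here, which is the difference under discussion.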

Will all intermediate states since the previous event be reported, or only the most recent one?

As above, I believe all events should fire at all times. I don't think the UA should attempt to be smart with these changes. But, as with all things, I am more than willing to change my mind.

Will change observations be observable in the same order across tabs? Will the same changes be reported consistently in all tabs using this API?

Absolutely. That is essentially the point of this spec.

How much latitude does a browser implementing this API have to e.g. reduce power, memory and CPU consumption by rate-limiting and coalescing notifications?

I haven't put thought into reduced power mode, since the UA's respective internal data structures for the cookie are only updated when the UA has chosen to update them, at which point the event listener should fire. This isn't like a timer, where a request for something to happen after a given number of milliseconds is something the UA can reasonably throttle for better performance or battery management, à la the 4ms minimum timer. Instead, this event is a direct reaction to the cookie being modified. Your question feels like asking "should a browser coalesce navigation events?" There is explicit state at every step of the cookie modification that may or may not be important to the developer.

Will multiple synchronous writes using document.cookie in a single event-handling context all be reported, or only the net effects of the sequence of writes?

Well, any event that is fired has an individually modified cookie. That is to say, if you did `document.cookie="foo=bar";document.cookie="baz=qux"`, there would always be two change events, since two separate cookies are being modified. The only way in which this would matter is if something did `document.cookie="foo=bar";document.cookie="foo=baz"`, modifying the same cookie multiple times in a single step. I could understand the appeal of combining the changes, but that adds extra complexity to the design (namely having to report multiple causes for the change) for what seems like an extreme corner case.
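The extra complexity mentioned above can be seen in a small sketch of what coalescing per cookie name might look like. `coalesceByName` and the event shape are hypothetical, chosen only to show that a merged event would have to carry a list of causes rather than a single one.

```javascript
// Hypothetical coalescing step: merge a sequence of per-write events into one
// entry per cookie name, keeping the final value and every cause seen.
function coalesceByName(events) {
  const merged = new Map();
  for (const event of events) {
    const previous = merged.get(event.name);
    if (previous) {
      previous.value = event.value;      // only the net value survives
      previous.causes.push(event.cause); // but each cause must be kept
    } else {
      merged.set(event.name, {
        name: event.name,
        value: event.value,
        causes: [event.cause],
      });
    }
  }
  return [...merged.values()];
}

// Three synchronous writes: two to "foo", one to "baz".
const log = [
  { name: "foo", value: "bar", cause: "overwrite" },
  { name: "baz", value: "qux", cause: "overwrite" },
  { name: "foo", value: "baz", cause: "overwrite" },
];

const merged = coalesceByName(log);
console.log(merged);
```

With coalescing, a listener would see two entries instead of three events, and the intermediate value `bar` for `foo` would never be observed.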

bsittler commented 7 years ago

Exposing ChangeCause is indeed new information not previously available to HTML applications' scripts. I don't know whether revealing this new information is a security or privacy problem, though. How would user-initiated cookie jar modifications be reflected in ChangeCause? Would it be more conservative to omit ChangeCause entirely for now? What purpose is it intended to serve?

Given that you're otherwise intending to reveal only the information equivalent to document.cookie, most of my remaining questions are not relevant. For instance, multiple same-named cookies in scope for a particular web page URL would presumably appear during iteration with only "name" and "value" fields, and the only way to disambiguate between these differently-scoped cookies (scope being the combination of implicit or explicit domain with path) would be by their position in the iteration order.
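For reference, here is a minimal sketch of position-based disambiguation, assuming the ordering that RFC 6265 recommends for the Cookie header (longer paths first, then earlier creation time). Browsers are only encouraged to use this order, and the record shape and `serializeInScope` helper are hypothetical.

```javascript
// Hypothetical cookie records. The reader of the serialized string never sees
// domain, path or createdMs; they exist here only to drive the ordering.
const inScope = [
  { name: "id", value: "site", domain: "example.com", path: "/", createdMs: 1 },
  { name: "id", value: "app", domain: "example.com", path: "/app", createdMs: 2 },
  { name: "id", value: "older", domain: "www.example.com", path: "/", createdMs: 0 },
];

// Order by longer path first, then by earlier creation time, and serialize
// as name=value pairs only.
function serializeInScope(cookies) {
  return [...cookies]
    .sort((a, b) => b.path.length - a.path.length || a.createdMs - b.createdMs)
    .map((cookie) => `${cookie.name}=${cookie.value}`)
    .join("; ");
}

const serialized = serializeInScope(inScope);
console.log(serialized);
```

All three cookies surface as `id=...`, so a reader can only guess which scope each one came from by where it sits in the sequence.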

I asked about power efficiency because expirations and cookie operations in other tabs may cause an otherwise rate-limited background tab to wake up, and because notifying all interested tabs of the same sequence of changes in the same order may cause a much larger number of event notifications, more script execution, and more power consumption than would be used if the background tab received only rate-limited deltas relative to the last cookie jar state communicated to that tab.

This full "transaction log"-level change notification fidelity is also revealing more information to websites than was available to them via document.cookie polling, whereas rate-limited snapshot deltas (potentially with different tabs getting different snapshots due to varying timing even with the same URL and interests) is exactly what websites using timer-driven document.cookie polling get today.

patrickkettner commented 7 years ago

Exposing ChangeCause is indeed new information not previously available to HTML applications' scripts. I don't know whether revealing this new information is a security or privacy problem, though.

Neither do I, but I will certainly have security folks take a look before shipping.

How would user-initiated cookie jar modifications be reflected in ChangeCause? Would it be more conservative to omit ChangeCause entirely for now? What purpose is it intended to serve?

Could you clarify what you mean by user-initiated cookie jar modifications? Would that be when a person clears out the cookies via the UA's settings? If so, it would fall under eviction, but that could do with an improved definition.

... is exactly what websites using timer-driven document.cookie polling get today.

You are making a fairly broad assumption about the level of polling and how individual UAs choose to clamp timers. "Closer to what websites using timer-driven" might be accurate, but without data backing that up I am extremely hesitant to agree. And even if we do attempt to emulate that environment, there is no guarantee that whatever coalesced event is fired would match up with polling timers. I see a lot of complexity and a less clear interface for developers, in exchange for a potentially very small impact on battery life (several cookie events will be orders of magnitude easier on the battery than polling).

bsittler commented 7 years ago

What I mean by user-initiated cookie jar modifications is users deleting specific individual cookies using the browser's cookie jar editor, or deleting a specific site's cookies using browser UI, or clearing all cookies for all sites using browser UI.

patrickkettner commented 7 years ago

Currently, all of those would be treated as evicted. You highlight a good use here for a single "everything's gone" ChangeCause to be used when the UA or the user decides to delete the entire jar at once.

patrickkettner commented 7 years ago

Closing, as the ultimate point of the thread is summed up in #38.

patrickkettner / cookie-change-events

Exposing additional cookie data and metadata to websites? #36