veracitylab / DOM-Instrumentation-to-Display-Provenance-Data

Honours Project: JavaScript program to instrument DOM elements manipulated by Ajax calls, to help expose provenance data

No robust way to map requests to modified page elements #4

Closed wtwhite closed 6 months ago

wtwhite commented 6 months ago

The problem

There is, in principle, no robust way to map programmatic HTTP requests or responses (XMLHttpRequest and fetch() calls) to the HTML page elements they end up modifying without performing full-scale taint analysis, which is far too heavyweight here (and may not even be possible). For example:

  1. Client: JavaScript code requests GET /first/url via XMLHttpRequest
  2.       Server: Sends the /first/url response
  3. Client: JavaScript code requests GET /second/url via XMLHttpRequest
  4.       Server: Sends the /second/url response
  5. Client: Waits 5 seconds
  6. Client: Updates a <div> element with data from the /second/url response
  7. Client: Waits another 5 seconds
  8. Client: Updates a different <div> element with data from the /first/url response

A browser extension (or, more generally, any JavaScript code that the existing site's JavaScript does not depend on) has no way to determine that the update in step 6 came from /second/url, while the update in step 8 came from /first/url.
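To make the failure concrete, here is a minimal sketch of what an outside observer would see for the sequence above, and how one plausible heuristic gets it wrong. Everything in it is illustrative: the event shape, the element names `div#a` and `div#b`, and the "attribute each mutation to the most recent response" rule are assumptions made for this example, not a description of the existing code.

```javascript
// What an observer outside the page's own code can record: responses
// arriving and elements changing, with nothing linking the two.
// (Hypothetical event shape and element names, for illustration only.)
const observed = [
  { kind: "response", url: "/first/url" },   // step 2
  { kind: "response", url: "/second/url" },  // step 4
  { kind: "mutation", target: "div#a" },     // step 6: really from /second/url
  { kind: "mutation", target: "div#b" },     // step 8: really from /first/url
];

// Naive heuristic (an assumption for this sketch): attribute every
// mutation to whichever response arrived most recently before it.
function attributeToLatestResponse(events) {
  let latest = null;
  const guess = {};
  for (const e of events) {
    if (e.kind === "response") {
      latest = e.url;
    } else if (e.kind === "mutation") {
      guess[e.target] = latest;
    }
  }
  return guess;
}

const guessed = attributeToLatestResponse(observed);
// Both elements are attributed to /second/url, so div#b is mislabelled.
console.log(guessed);
```

Any rule based only on ordering or timing has the same weakness, because the page is free to hold on to response data for as long as it likes before using it.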

Existing code attempts to solve this by either:

Solutions

Given that there is no ideal way to solve the problem, there are two possible ways forward:

  1. Drop the requirement of showing which HTML elements were updated by a particular HTTP request, and just show, e.g., a small button that can be clicked to list all HTTP responses triggered by the current page that have provenance data attached.
  2. Attempt to map provenance-enriched HTTP responses to modified HTML elements in a more defensible, but still heuristic, way.

For now, to get something going, I'll go with option 1 -- something like this is needed in any case for handling non-JavaScript-initiated requests (e.g., full page loads). #5 (EDIT: originally given below) describes a straightforward approach that could be used to implement option 2 later.
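A rough sketch of what option 1 might look like for fetch() calls: wrap fetch, and keep a list of every response that carries provenance data, which a button could then display. This is only a sketch under assumptions. In particular, the `X-Provenance` header name is hypothetical (how provenance data is attached to a response is not specified here), and `recordIfProvenance` and `wrapFetch` are names invented for this example.

```javascript
// ASSUMPTION: provenance data arrives in a response header with this
// (hypothetical) name. Substitute whatever the server actually sends.
const PROVENANCE_HEADER = "X-Provenance";

// Responses seen so far that had provenance data attached.
const provenanceLog = [];

// Record one response if it carries provenance data.
// `headers` only needs a get(name) method, like the standard Headers object.
function recordIfProvenance(log, url, headers) {
  const value = headers.get(PROVENANCE_HEADER);
  if (value === null || value === undefined) {
    return false; // no provenance attached, nothing to show
  }
  log.push({ url, provenance: value });
  return true;
}

// Return a drop-in replacement for fetch that logs provenance-enriched
// responses and otherwise passes everything through unchanged.
function wrapFetch(originalFetch, log) {
  return function wrappedFetch(input, init) {
    return originalFetch(input, init).then((response) => {
      recordIfProvenance(log, response.url, response.headers);
      return response;
    });
  };
}

// In a page context, this might be installed as:
//   window.fetch = wrapFetch(window.fetch.bind(window), provenanceLog);
```

Note that this only sees fetch() calls. XMLHttpRequest would need a similar wrapper, and non-JavaScript-initiated requests such as full page loads would have to be captured some other way.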

wtwhite commented 6 months ago

Closing since #6 and #7 suffice to implement option 1 in a basic form.