Improvements to the LinkExposure module and the linkExposure, ampResolution, and utils content scripts

LinkExposure Module
- [x] The regular expression logic (including createRegex, setRegex, and setCode) is unnecessarily complicated. It can just be ~4 lines in runStudy, using the functionality that already exists in the Matching utility module.
- [x] Nit: the following error message should be updated for this module: "Warning: unexpected page content update"
- [x] There some unnecessary repetition of debugLog and storage.set calls in the message listener. Better to have one function that handles creating the object to store, storing the object, and incrementing the counter. See the createShareRecord function in the SocialMediaSharing module and the suggestions for improving it in #54.
- [x] The message passing between the content script and the background script should use arrays of links, rather than one link per message. Sending a separate message for each link is quite inefficient, especially since the links will naturally come in sets with timeout intervals.
- [x] The exposure event object that's stored should be a defined type with JSDoc (e.g., a LinkExposureEvent type).
- [x] The current exposure event object contains a link object, which then has href, dest, and error properties. That's an odd structure. It would be cleaner to just have separate properties within one object, like referrer (a string), originalUrl (a string), isShortenedUrl (a boolean), resolutionSucceeded (a boolean), and resolvedUrl (a string).
- [x] The getStudyDataAsObject function needs to be revised to reflect what's now in storage.
linkExposure Content Script
- [x] We should have a visibility size threshold in addition to a visibility time threshold.
- [x] The linkExposure function is unnecessary. All the code it contains can go in the outer anonymous function, and you can just mark that function as async.
- [x] Nit: the definition of initialVisibility should just use the isDocVisible function.
- [x] Nit: use more descriptive names for sdrs, drs, ret, and res.
- [x] The checkPrivateWindowSupport function seems unnecessary. That logic is only used once and can be more concise in the main function.
- [x] The init function also seems unnecessary. Same rationale.
- [x] You can store RegExp objects in extension local storage. That's more efficient than storing strings and than creating those objects from the strings.
- [x] The JSDoc for sendMessageToBackground should specify what's in the data object.
- [x] Update the sendMessageToBackground function (and related logic) to make the link a string rather than an object and send an array of links rather than one at a time. More on this issue above.
- [x] Nit: the documentation for matchUrl trails off and is ambiguous.
- [x] Nit: the observeChanges function has a misleading name, since it isn't a handler for Intersection Observer events. Something like checkLinksInDom would be more descriptive.
- [x] You can avoid avoid the filter call (which requires a second linear pass over the link elements) by using querySelectorAll (specifying a tags that have href attributes) rather than getElementsByTagName.
- [x] The processElement function seems unnecessary. That logic is used just once and can be pulled into the observeChanges function.
- [x] If the URL on a link element doesn't match a domain of interest, we should save that to avoid checking it again later.
- [x] The ElementStatus class seems unnecessary, since all the logic is only used once. Instantiating so many class objects could also magnify any performance hit on websites with lots of links. A cleaner approach would be to just use objects with properties that are built-in types, and then have logic in observeChanges and handleIntersection that operates directly on those properties.
- [x] The UpdateHandler class also seems unnecessary. All the logic could go in the main anonymous function and observeChanges.
ampResolution Content Script
- [x] Since this is currently only used in the linkExposure content script, all the code should be migrated into there. That also provides function encapsulation.
- [x] Nit: the top-level JSDoc isn't quite right, since this isn't a module. Also includes the wrong file name.
- [x] The domRegex string isn't documented, the number of forward slashes seems like it should be fixed at two (for what you're trying to do), and the g and m modifiers seem unnecessary. See the regular expression logic in the Matching module for a template.
- [x] The getDomainPrefix function is unnecessary, since it's only used once and the logic could be easily folded into resolveAmpUrl.
- [x] The resolveAmpUrl function currently requires three passes over each URL string. You could cut that down to one pass by compiling the domains into a URL matching regular expression in the background page, saving that in extension storage, and then using that regular expression in the content script.
- [x] Nit: please clearly document the two components of decoding an AMP URL, extracting the source domain name and extracting the source URL. Please also include links to the AMP documentation that you're relying on, and include examples for each component of decoding.
- [x] It doesn't look like the URL path decoding logic is quite right. Please check carefully against the AMP documentation, and please check against example URLs from all the major AMP caches.
- [x] Nit: the return value of resolveAmpUrl is undocumented.
utils Content Script
- [x] Since this is currently only used in the linkExposure content script, all the code should be migrated into there. That also provides function encapsulation.
- [x] Nit: the top-level JSDoc isn't quite right, since this isn't a module. Also includes the wrong file name.
- [x] Nit: use a more descriptive function name for rel_to_abs, and provide documentation throughout the function.
- [x] Nit: use a more descriptive names for fbShim, u, hasU, fbRegex, and isFb.
- [x] Nit: add function documentation for fbShim, isFb, and removeShim.
- [x] Nit: add documentation for fbRegex.
- [x] Nit: instead of using a const and an arrow function to define functions, just use named functions for clarity. For example, instead of const isFb = url => { you can just say function isFB(url) {. Same for isElementVisible.

citp / news-disinformation-study

Improvements to the LinkExposure module and the linkExposure, ampResolution, and utils content scripts #57