citp / news-disinformation-study

A research project on how web users consume, are exposed to, and share news online.
8 stars 2 forks source link

#33 Engineering Review - First Pass #43

Closed biancadanforth closed 4 years ago

biancadanforth commented 4 years ago

First pass engineering review is below. Thanks for the great README and code comments.

Note: The eng-review-1 branch is empty except for the first commit, which we can safely ignore since it was WE template code. This approach allows me to review the code that has already landed in master.

jonathanmayer commented 4 years ago
  • I have looked at the modules that are actually loaded currently by study.js (Navigation.js and LinkExposure.js). Are all the modules in this repo planned to be used in this study?

Yep.

  • I profiled this extension running on cnn.com compared to a normal page load of cnn.com. I haven't seen anything terrible so far, though this is probably not a worst case scenario. @jonathanmayer , what would be a worst case scenario?

Best guess of the worst case scenario: pages with many links that involve shorteners (e.g., a Twitter feed). It's possible that iterating links on the page (the LinkExposure module) and resolving those links (the LinkResolution module) could get slow.

  • Decide what you want to collect specifically in pageContent.js, and do that work in the content script.

    • This should translate into a much smaller and more focused amount of data.

We implemented the functionality to save page HTML in the Navigation module because the Stanford research team specifically requested it. We don't plan to use it ourselves, and I agree that it's not a good approach.

  • [ ] Remove any temporary code (e.g. downloading through the browserAction toolbar button).*

    • If some of this code is still helpful for a while yet, we can postpone this to a later review.

Added as #44.

  • [ ] Determine how the extension should behave in private windows and implement that behavior.*
  • [ ] Implement consent UI.*

*: As long as there is an issue tracking this, so that it's not forgotten, that's fine for now. Just let me know what the issue numbers are.

Private Windows: #45 Consent UX: #4

  • If all modules (e.g. Link Exposure, Navigation, ...) in the study use the same set of subdomains and domains, perform the matching work only once and in the background script.

While we (at least currently) plan to use the same set of domains for each study module, we'd like to support the use case where the domains differ by module. And down the line, we'd like to support multiple instances of each study type. Perhaps the thing to do is cache output from the Matching module?

  • Implement a solution for keeping track of links you have already seen that does not require modifying the page, and ideally, does not require using setInterval.

    • I describe a few options, though I recommend the Intersection Observer API if possible.
    • If you do keep setInterval, there should be a way to cancel it.

Added as #46.

biancadanforth commented 4 years ago

First Review Status: Yellow

This status is based on my first review, the changes I have seen pushed (as of the latest master 98d0fb8) and @jonathanmayer 's overall reply.

Once all blocking issues are resolved, we can close this PR. I will open a new PR for the next round of review.

Blocking Issues

Unresolved blocking issues:

Resolved (or tracked) blocking issues:

Things I plan to look at in next round of review

jonathanmayer commented 4 years ago

So we can close out the initial review, I've pulled the functionality for storing page content. Tracking in #53 if the Stanford team (or anyone else) wants to revisit the feature.

biancadanforth commented 4 years ago

First Review Status: Green

This status is based on my first review, the changes I have seen pushed (as of the latest master aa789a2) and @jonathanmayer 's overall reply.

Blocking Issues

Unresolved blocking issues: None

Resolved (or tracked) blocking issues: