hypothesis / support-legacy

a place for tracking support-related work and projects
3 stars 0 forks source link

website in Via3 forwards the user if they click on the text; this behavior is not present in Via #157

Closed mkdir-washington-edu closed 3 years ago

mkdir-washington-edu commented 3 years ago

Describe the bug When a particular site, www.edutopia.org, is loaded in Via3, clicking on the text of its articles forwards the user on to https://www.edutopia.org/. This doesn't happen in Via.

To Reproduce Steps to reproduce the behavior:

  1. Go to https://viahtml3.hypothes.is/proxy/https://www.edutopia.org/article/key-aspects-play-early-education
  2. Click in the white space near the title or a picture -> you are not forwarded.
  3. Click in the white space in the article text -> you are forwarded to https://viahtml3.hypothes.is/proxy/https://www.edutopia.org/
  4. Repeat the same steps in https://via.hypothes.is/https://www.edutopia.org/article/key-aspects-play-early-education. -> you should not be able to reproduce the error.

Expected behavior Since the original site (https://www.edutopia.org/article/key-aspects-play-early-education) doesn't forward you on when clicking in article text, Via3 should not as well.

Desktop (please complete the following information):

Additional context From: https://app.hubspot.com/contacts/6291320/ticket/235967150/

jon-betts commented 3 years ago

This is very weird. I can't see why we'd take the whole page to be a link. Some random guesses:

In general, it's probably worth mentioning that:

robertknight commented 3 years ago

Here is what I found after some debugging:

There is a subtle incorrect behavior in pywb's override of Element.prototype.getAttribute. When called with the attribute name href on an element that has no such attribute, the override returns a different result than the real API.

On the original page:

document.body.getAttribute("href")
> null

On the proxied page:

document.body.getAttribute("href")
> ""

Note that document.body can be substituted with any element that has no href attribute.

The client-side code in this page is sensitive to this difference and interprets the non-null result as an instruction to jump to the page with that relative URL.

The problem goes away if getAttribute is "fixed":

Element.prototype.getAttribute = function (attr) { return this.getAttributeNS(null, attr) }

As long as we're using pywb, this specific issue needs to be reported and fixed upstream.

robertknight commented 3 years ago

To summarize my lengthy comments above, I think the next steps here are:

  1. To report the issue I described above upstream. I believe it comes from the Wombat repository.
  2. To upgrade pywb when a new version becomes available with the fix.

Since we don't know how long (2) is going to take, I suggest we consider the issue blocked and take it out of the current sprint once (1) is complete and we have Dependabot set up to notify us when the new pywb version becomes available.

robertknight commented 3 years ago

I have proposed a fix for the issue upstream in the "Wombat" component of pywb. We're waiting on the fix to be reviewed/merged and shipped in a new release of pywb before we can close this issue locally.

robertknight commented 3 years ago

The PR I posted for Wombat was included in the v3.0.3 release (https://github.com/webrecorder/wombat/commits/main - the release doesn't have a tag unfortunately) which was in turn included in pywb 2.5.0. So if we update the version of pywb in ViaHTML to >= 2.5.0, that should resolve the issue.

robertknight commented 3 years ago

I have enabled Dependabot for ViaHTML (Slack thread). Hopefully that should send us a PR to update pywb in due course.

robertknight commented 3 years ago

I tried to upgrade pywb from 2.4.2 to 2.5.0 but we had to revert due to the new version triggering an exception in some of the code in viahtml that hooks into pywb. See https://github.com/hypothesis/viahtml/pull/57 for details.

robertknight commented 3 years ago

The pywb update to v2.5.0 has been successfully deployed.