satoshi-takano / OpenGraph

A Swift wrapper for the Open Graph protocol (OGP).

Medium links do not follow redirect #43

Closed peterfriese closed 4 years ago

peterfriese commented 4 years ago

When trying to parse Open Graph metadata from Medium articles shared from within their iOS app, OpenGraph returns an empty result.

URLs shared from their app look like this: https://link.medium.com/oT1YJfn1G9

Retrieving this URL with curl returns a 307 redirect to the final URL (https://onezero.medium.com/i-bought-a-new-router-it-told-me-i-was-hacked-fb141930dd22?source=userActivityShare-ea0b1eb1f5d2-1599825776&_branch_match_id=link-832936551787943816):

curl -v https://link.medium.com/oT1YJfn1G9
*   Trying 52.8.138.103...
* TCP_NODELAY set
* Connected to link.medium.com (52.8.138.103) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use http/1.1
* Server certificate:
*  subject: CN=link.medium.com
*  start date: Aug 22 09:07:22 2020 GMT
*  expire date: Nov 20 09:07:22 2020 GMT
*  subjectAltName: host "link.medium.com" matched cert's "link.medium.com"
*  issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
*  SSL certificate verify ok.
> GET /oT1YJfn1G9 HTTP/1.1
> Host: link.medium.com
> User-Agent: curl/7.64.1
> Accept: */*
>
< HTTP/1.1 307 Temporary Redirect
< Server: openresty/1.13.6.2
< Date: Fri, 11 Sep 2020 12:03:58 GMT
< Content-Length: 0
< Connection: keep-alive
< X-Powered-By: Express
< Set-Cookie: _s=vulHH3zzhLh5cNMbTeZBmlRnPTWSi6J1oZt4uT835w2OYeFv9ocdeRfs7kngIVl7; Max-Age=31536000; Path=/; Expires=Sat, 11 Sep 2021 12:03:58 GMT
< Last-Modified: Fri, 11 Sep 2020 12:03:58 GMT
< Location: https://onezero.medium.com/i-bought-a-new-router-it-told-me-i-was-hacked-fb141930dd22?source=userActivityShare-ea0b1eb1f5d2-1599825776&_branch_match_id=link-832936551787943816
<
* Connection #0 to host link.medium.com left intact
* Closing connection 0

satoshi-takano commented 4 years ago

@peterfriese Thanks for the information. I couldn't reproduce the redirect problem, and as far as I know URLSession follows redirects automatically. I can investigate further if you tell me the iOS version and the OpenGraph version on which the problem occurred.
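
To check what URLSession actually does with a link like this, a minimal sketch along the following lines could record each HTTP redirect and the final URL. `RedirectLogger` and `traceRedirects(for:completion:)` are hypothetical names used for illustration, not part of OpenGraph; the delegate callback is the standard `URLSessionTaskDelegate` redirect hook. Note that this only observes HTTP-level redirects (such as the 307 in the curl output); a redirect performed by JavaScript in the page would not show up here.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // needed on Linux; Apple platforms get URLSession from Foundation
#endif

/// Hypothetical helper (not part of OpenGraph): records every HTTP redirect a task follows.
final class RedirectLogger: NSObject, URLSessionTaskDelegate {
    /// Redirect targets in the order they were followed.
    private(set) var hops: [URL] = []

    func urlSession(_ session: URLSession,
                    task: URLSessionTask,
                    willPerformHTTPRedirection response: HTTPURLResponse,
                    newRequest request: URLRequest,
                    completionHandler: @escaping (URLRequest?) -> Void) {
        if let target = request.url {
            hops.append(target)
        }
        // Passing the proposed request back follows the redirect,
        // which is also what URLSession does when no delegate is set.
        completionHandler(request)
    }
}

/// Hypothetical helper: fetches `url` and reports the final URL plus every redirect hop.
func traceRedirects(for url: URL, completion: @escaping (URL?, [URL]) -> Void) {
    let logger = RedirectLogger()
    let session = URLSession(configuration: .ephemeral, delegate: logger, delegateQueue: nil)
    session.dataTask(with: url) { _, response, _ in
        // `response.url` is the URL of the last response in the redirect chain.
        completion(response?.url, logger.hops)
    }.resume()
    // The session retains its delegate until invalidated, so release it once the task finishes.
    session.finishTasksAndInvalidate()
}
```

Calling `traceRedirects(for:completion:)` with the short link and printing both values would show whether the 307 was followed and where the request ended up.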

peterfriese commented 4 years ago

When fetching Open Graph data from a Medium link, Medium responds with an interstitial page that tries to convince the user to open the article in their native app.

I worked around this behaviour by sending a desktop user agent when fetching the Open Graph data. Here is the code I use:

  func fetchLinkMetadata(url: URL) {
    self.logger.debug("Fetching meta data using Open Graph")

    // This header requests the desktop website, which prevents Medium from displaying an "open this in the app" interstitial
    let headers = ["User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36"]

    OpenGraph.fetch(url: url, headers: headers ) { result in
      switch result {
      case .success(let og):
        if let finalUrl = og[.url] {
          self.metaUrl = finalUrl
        }
        if let title = og[.title] {
          self.metaTitle = title
          DispatchQueue.main.async {
            if self.textView.text.isEmpty || self.textView.text == url.absoluteString {
              self.logger.debug("Updating UI title: \(title)")
              self.textView.text = title
            }
          }
        }
        if let siteName = og[.siteName] {
          self.siteName = siteName
        }
        if let description = og[.description] {
          self.metaDescription = description
        }
        if let author = og[.bookAuthor] {
          self.metaAuthor = author
        }
      case .failure(let error):
        print(error)
      }
    }
  }

I've seen other OG frameworks that handle this behind the scenes. It might be worthwhile to:

  1. Add a note to the documentation telling people how to work around Medium's limitations
  2. Or add a flag to OpenGraph that lets users choose whether they'd like OpenGraph to handle this automatically for them
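
Until something like option 2 exists, the same effect could be approximated on the caller's side. The sketch below is only an illustration: `UserAgentPolicy` and its members are hypothetical names, not OpenGraph API, and the list of hosts is an assumption based solely on the Medium case described here.

```swift
import Foundation

/// Hypothetical helper (not part of OpenGraph): picks request headers based on the URL's host.
struct UserAgentPolicy {
    /// Desktop user agent taken from the workaround above.
    static let desktopUserAgent =
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.83 Safari/537.36"

    /// Assumption: hosts under these domains show an app interstitial unless a desktop user agent is sent.
    var desktopOnlyDomains: [String] = ["medium.com"]

    /// Returns a desktop User-Agent header for matching hosts, and no extra headers otherwise.
    func headers(for url: URL) -> [String: String] {
        guard let host = url.host?.lowercased() else { return [:] }
        let needsDesktop = desktopOnlyDomains.contains { domain in
            // Match the domain itself or any subdomain, but not look-alikes such as "notmedium.com".
            host == domain || host.hasSuffix("." + domain)
        }
        return needsDesktop ? ["User-Agent": Self.desktopUserAgent] : [:]
    }
}
```

With this in place, the fetch call from the workaround could be written as `OpenGraph.fetch(url: url, headers: UserAgentPolicy().headers(for: url)) { result in ... }`, so that only matching links get the desktop user agent.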

satoshi-takano commented 4 years ago

I found that OpenGraph didn't handle the redirect because the interstitial screen triggers the redirection from their front-end JavaScript code. I also think it's better for this library not to handle front-end redirection, so that it stays isolated from any presentation logic. I'll therefore update the README to let OpenGraph users know how to handle this kind of situation, without any code changes to the library.