oduwsdl / MementoEmbed

A service that provides archive-aware oEmbed-compatible embeddable surrogates (social cards, thumbnails, etc.) for archived web pages (mementos).
MIT License

Assume scheme for favicon URIs that begin with // ? #91

Open shawnmjones opened 6 years ago

shawnmjones commented 6 years ago

For http://archive.is/Q8pGf the favicon URI //abs.twimg.com/favicons/favicon.ico has no scheme and thus the requests library does not know what to do with it.

I'm not sure what the best expected behavior is here.

See #86.

(updated to remove typo)

phonedude commented 6 years ago

//abs.twimg.com/favicons/favicon.ico is a scheme-less URI; it inherits the scheme of its parent URI. the point is to not have browsers freak out w/ http and https combos of embedded images, css, etc.

https://www.google.com/search?q=schemeless+url&ie=utf-8&oe=utf-8&client=firefox-b-1

it's a stupid thing that needs to die, but it will live on in archives...

ibnesayeed commented 6 years ago

If the context of the favicon (i.e., the referrer document) is available, then you can use the scheme from that, or simply use one of http or https. Generally, archives are going to normalize the scheme anyway (except a few).

shawnmjones commented 6 years ago

This favicon comes from the content of the URI-R, so I should be able to follow the suggestion from @ibnesayeed and just use the scheme of the referrer.

If it were part of the memento, I could just use the scheme of the original URI and use datetime negotiation to determine if the archive stored it.

To account for these, I'll have to test each favicon (and image) URI to ensure that it has a scheme and take appropriate action if the scheme does not exist.
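
A minimal sketch of that check, assuming only the standard library's `urllib.parse`. The function name `ensure_scheme` and its `default_scheme` parameter are hypothetical and not part of MementoEmbed; the fallback to `https` when no referrer is known is likewise an assumption, not a decision made in this thread.

```python
from urllib.parse import urljoin, urlsplit


def ensure_scheme(uri, referrer=None, default_scheme="https"):
    """Return `uri` with a scheme where one can be determined.

    Hypothetical helper: resolves scheme-less URIs such as
    //host/path against the referrer document's URI.
    """
    # Already absolute: nothing to do.
    if urlsplit(uri).scheme:
        return uri

    # Resolve against the referrer, which makes the URI inherit the
    # referrer's scheme (and its host, for plain relative paths).
    if referrer and urlsplit(referrer).scheme:
        return urljoin(referrer, uri)

    # No usable referrer: assume a default scheme for //host/path URIs.
    if uri.startswith("//"):
        return f"{default_scheme}:{uri}"

    # A relative path with no context cannot be resolved here.
    return uri


favicon = ensure_scheme(
    "//abs.twimg.com/favicons/favicon.ico",
    referrer="http://archive.is/Q8pGf",
)
```

With the referrer supplied, the favicon URI takes the referrer's `http` scheme, so it can then be passed to `requests.get`. Resolving with `urljoin` also covers ordinary relative paths such as `/favicon.ico`, which is why it is used here instead of string concatenation.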