mozilla / remote-newtab

Remotely-hosted New Tab Page
https://mozilla.github.io/remote-newtab/src/
Mozilla Public License 2.0
15 stars 7 forks source link

Adding fetching and caching of directory links. #108

Closed emtwo closed 8 years ago

marcoscaceres commented 8 years ago

@emtwo, with your permission, do you mind if I clean up a few of the asycn thing?

emtwo commented 8 years ago

@marcoscaceres feel free to make changes if you'd like and push them to this branch. This is kind of rough, I wasn't expecting a review on it so soon but thank you for looking at it :)

marcoscaceres commented 8 years ago

Ah, sorry. I'm addicted to reviewing PR + really wanted to see how the DirectoryLinks stuff works. Will send some suggestions tomorrow (it's not work on the weekend if it's fun, right?).

Sent from my iPhone

On Oct 17, 2015, at 4:14 PM, Marina Samuel notifications@github.com wrote:

@marcoscaceres feel free to make changes if you'd like and push them to this branch. This is kind of rough, I wasn't expecting a review on it so soon but thank you for looking at it :)

— Reply to this email directly or view it on GitHub.

marcoscaceres commented 8 years ago

@emtwo, we should rename "DirectoryLinksProvider" to "AdvertTileProvider", or some such, to better reflect the name of what this thing actually does. During the meeting in NOLA, we reached consensus to be more transparent with the naming of things - particularly with components that directly relate to adding advertisement to the new tab page.

oyiptong commented 8 years ago

@emtwo can you infer the release and locale from the URL? If the URL doesn't parse, you can use nightly / en-US.

That way, we don't need to send that information from chrome.

An expected window.location URL structure would be:

https://newtab.cdn.mozilla.net/v1/nightly/en-US/index.html

You can use a regex with grouping to do so. where the version would be given as this regex: /v[0-9]+/

oyiptong commented 8 years ago

@marcoscaceres how are we doing the caching for the directory/suggested link imageURI and enhancedImageURI?

Will it be via a fetch?

oyiptong commented 8 years ago

Thinking about using SRI when it's built in fetch: https://bugzilla.mozilla.org/show_bug.cgi?id=1187335

marcoscaceres commented 8 years ago

@marcoscaceres how are we doing the caching for the directory/suggested link imageURI and enhancedImageURI? Will it be via a fetch?

Right now, requests are passed to the network and stored in the cache. We still need to build a bit of logic to cache the roll-overs, but it's pretty simple.

If we just treat everything as normal HTTP requests with appropriate cache directives (from images or whatever), then it should just work.

About SRI, will be simple to add later.

oyiptong commented 8 years ago

So, we'd intercept in the SW and store in cache, correct? Excellent. Thank you!

marcoscaceres commented 8 years ago

So, we'd intercept in the SW and store in cache, correct?

More than likely, yes. However, it really depends on how we design the Ads provider and how we then get the tiles to show up (e.g., when we update the grid, the request go to the service worker). I still don't have a complete picture of how the whole ads system works, so it's hard for me to comment.

My recommendation would be to not worry too much about the caching side, as 99.9% of the time people are going to be online and the ads are useless if the user is offline (clicking them just gives you a 404).

To be clear: design the thing to work as if the user was always online and the SW does not exist, and we add the caching bits as :cherries: on-top.

oyiptong commented 8 years ago

Hmm. I would disagree with some of your points. However, you are right. The lifetime of these images are very brief.

I was just asking about fetch so we have an idea if SRI will work, since we load from CSS and SRI is only supported either on the script, link elements and fetch.

marcoscaceres commented 8 years ago

Sent from my iPhone

On Nov 3, 2015, at 5:29 PM, Olivier Yiptong notifications@github.com wrote:

Hmm. I would disagree with some of your points. However, you are right. The lifetime of these images are very brief.

I was just asking about fetch so we have an idea if SRI will work, since we load from CSS and SRI is only supported either on the script, link elements and fetch.

So, it seems that we would only need SRI on the JSON file that contains the links. The css aspect actually gets set dynamically directly on the div elements.

— Reply to this email directly or view it on GitHub.

marcoscaceres commented 8 years ago

Ok, back at my computer... I shouldn't respond from my phone, as I make less sense than usual :).

So, we can totally set SRI on CSS files too if we want, as they are <link rel=stylesheet> - though it doesn't provide much, apart from assuring the integrity of the CSS file. The real value is in setting it for the JSON file, particularly if we don't trust the CDN (though I imagine that that is low risk, as we choose the CDN and it would be pretty legally scandalous if they substituted our encrypted content with their own).

@dougt, @oyiptong, can you help me understand a bit what we are actually trying to protect against? Is the CDN, right?

oyiptong commented 8 years ago

You are correct, we can set SRI on the JSON with the fetch. The CSS and scripts also get the treatment with the inline SRI attribute.

However, the images are also a valuable target for substitution. To ensure the integrity of the images, here's what I thought:

We could send the hash for each image in the directory links JSON.

When the service worker intercepts the HTTP query (presumably from the CSS background-url invocation), we'd call fetch on the URL with the SRI attribute.

The directly above is not ready yet, so we'd keep that for v2.

oyiptong commented 8 years ago

Note that we don't necessarily need to cache the directory/suggested images the same as we do regular assets.

Does the SW requests get cached my web cache too?

marcoscaceres commented 8 years ago

Sent from my iPhone

On Nov 3, 2015, at 11:25 PM, Olivier Yiptong notifications@github.com wrote:

Note that we don't necessarily need to cache the directory/suggested images the same as we do regular assets.

Does the SW requests get cached my web cache too?

Not sure. @wanderview can maybe answer?

— Reply to this email directly or view it on GitHub.

wanderview commented 8 years ago

I'm not 100% sure I understand the question, but fetch() calls from a SW still go through the http cache.

Also, note that we have not implemented SRI for fetch() yet:

https://bugzilla.mozilla.org/show_bug.cgi?id=1187335

Finally, in regards to intercepting sub-resources from stylesheets, you may want to be aware of this issue:

https://github.com/slightlyoff/ServiceWorker/issues/719

So I would not depend on intercepting sub-resources from cross-origin stylesheets loaded without CORS.

marcoscaceres commented 8 years ago

When the service worker intercepts the HTTP query (presumably from the CSS background-url invocation), we'd call fetch on the URL with the SRI attribute.

I'm hoping we will be able to drop using the background-image hack tho. We should be using responsive images there (i.e., img@srcset). Having said that, it would still be possible to do this with fetch, if we really need it (I still think SRI is overkill in this context, TBH - particularly for tiles and CDNs we have commercial/legal-binding contracts with... but happy to be convinced otherwise). However, I think if there is a strong need to have this, we should push for the SRI spec to add support via a HTML attribute.

The spec says that they will consider adding such a thing in the future (i.e., we should not manually implement this on top of fetch):

"A future revision of this specification is likely to include integrity support for all possible subresources, i.e., a, audio, embed, iframe, img, link, object, script, source, track, and video elements."

emtwo commented 8 years ago

r? @marcoscaceres

emtwo commented 8 years ago

@marcoscaceres I'll look into the check failures later, but the tests do pass locally for me, would you mind doing a preliminary review? Thanks!

emtwo commented 8 years ago

Note: need to incorporate version and locale from in url

marcoscaceres commented 8 years ago

Sorry, didn't manage to get to this today :( @emtwo, when are you working next? I'll make sure I have the review done by then.

Also, don't worry about the localization stuff. If all goes to plan, then localization should happen automagically (:pray:).

marcoscaceres commented 8 years ago

(actually, I need to see how the localizaton stuff is handled by the ads backend so maybe I'm being overly optimistic... but I'm hopeful).

emtwo commented 8 years ago

No worries @marcoscaceres, I'll be working tomorrow

emtwo commented 8 years ago

@marcoscaceres, I believe this is ready for a 2nd round of reviewing.

One note: I'm now passing a url to ProviderManager.init() as you suggested. However, for some reason the json files that I reference have parse errors unless I wrap the json with []. Which is why I do DirectoryLinksProvider._links = DirectoryLinksProvider._links[0]; in directoryLinksProvider.js.

I wasn't able to debug the issue there, but I'm hoping you have some idea about what I may have done wrong :)

marcoscaceres commented 8 years ago

Did a quick review - starting to take shape :+1:. Will try to do a more thorough review soon.

Once it's a bit more stable (and because it's my fault), I'd like to go in an change all the async() for async.task(f*(){},this);. That will keep the binding to "this" properly and avoids having to use the name of the object throughout. For example, https://github.com/mozilla/remote-newtab/pull/147/files#diff-e49da671c3e7eabeb352e09d1e8f733fR58.

k88hudson commented 8 years ago

As discussed, we closing this since we are pursing a different approach