mozilla / network-pulse-api

API for the Network-Pulse project
Mozilla Public License 2.0

Better handling of images #345

Closed · alanmoo closed this issue 5 years ago

alanmoo commented 6 years ago

Right now we happily accept whatever potentially huge images users upload to pulse, and serve them back at that size to all users of the API. When you're loading a bunch of images on a page, like the pulse homepage or even the foundation site homepage, this can result in some rather heavy page loads.

Let's figure out a way to constrain the images: we could resize on upload, or use some sort of image-resizing back end that supports URL-parameter-based sizing (and caching of those sizes).

@xmatthewx is there an approach that would be better or worse from a functionality standpoint? @gideonthomas @cadecairos @patjouk any thoughts on what would be easiest to implement?

gideonthomas commented 6 years ago

I don't know about URL-parameter-based sizing, but on upload I would say the front-end should run the image through a resizing function and send the result (base64-encoded) as an extra property in the payload, along with the original image. We'd return links to both images when the front-end queries for an entry. I think it might also make sense to limit the size of the image on the front-end.
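Purely as an illustration of that payload shape, the API side could decode the extra base64 property and store it next to the original along these lines (the `resized_image_b64` property, the `resized_image` field, and the helper below are made up for the example, not existing pulse code):

```python
# Hypothetical sketch: accept a client-resized copy sent base64-encoded in an
# extra property (here called `resized_image_b64`) and attach it to the entry,
# so the API can return links to both the original and the resized image.
import base64
import uuid

from django.core.files.base import ContentFile


def save_client_resized_copy(entry, resized_image_b64):
    """Decode the base64 payload and save it as a file on the entry."""
    data = base64.b64decode(resized_image_b64)
    # `resized_image` is assumed to be an ImageField on the entry model.
    entry.resized_image.save(
        '{}.png'.format(uuid.uuid4()),
        ContentFile(data),
        save=True,
    )
```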

xmatthewx commented 6 years ago

is there an approach that would be better or worse from a functionality standpoint?

gideonthomas commented 6 years ago

> Do we do the work client side or server side?

Performance-wise, it's debatable. Having the server do the resizing will hold up the request (unless we do something complicated like running the resizing logic in a separate thread so the request isn't blocked on it). Having the client run the resizing logic will be faster, but you then have to deal with the larger payload (containing multiple sizes of the image) that needs to be uploaded when posting the entry. My preference is still to do it client-side because instinctively I feel it will be faster. I don't know whether this decision is affected by any security considerations around running resize code on user data (which I think is roughly equivalent to an eval on user data). @cadecairos might be able to advise.

> Can we fail gracefully so a submission is not blocked if CPU or bandwidth is an issue?

That logic would need to be added in, but I don't think it'll be too hard.

cadecairos commented 6 years ago

I'd lean towards client-side resizing. If we were to run these tasks server-side, it would need to be done as a background task, ideally on a different dyno than the one serving web requests, to prevent long-running resize jobs from blocking other requests. Setting up something like that isn't impossible, but it's more work than using the client's computer.
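For reference, the server-side variant described here (resizing on a background task, with the worker on its own dyno) might look roughly like the sketch below. Celery and every name in it are assumptions for illustration; the thread doesn't commit to a specific queue.

```python
# Hypothetical sketch: enqueue resizing work instead of doing it in the web
# request, and run the Celery worker on a separate (worker) dyno.
from celery import shared_task


@shared_task
def resize_entry_image(entry_id):
    # Import lazily so only the worker process needs the imaging dependencies.
    from pulseapi.entries.models import Entry  # assumed import path

    entry = Entry.objects.get(pk=entry_id)
    # ... open the entry's image, generate the derivative sizes, save them ...


# In the view that handles the upload, only enqueue the job:
# resize_entry_image.delay(entry.pk)
```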

alanmoo commented 6 years ago

@gideonthomas This feels like a thing that may fit within your current scope of work on improving Pulse. What do you think?

gideonthomas commented 6 years ago

I can but I'm not an expert on using JS to resize images client-side. Let me know if you want me to take this on.

alanmoo commented 6 years ago

Might it be worth evaluating Cloudinary? Their free tier should cover us for a while, and I think it would go a long way toward improving our image performance (hopefully) without a ton of effort.

gideonthomas commented 6 years ago

Also worth checking out: https://github.com/nodeca/pica, which is what we use in Brackets.

alanmoo commented 6 years ago

Matthew and I chatted about this. Two guiding principles we should go with:

Pomax commented 6 years ago

Counterpoint to the first conclusion: why? If we truly care, we don't let them upload huge images in the first place, and we tell them "look, please make this image smaller, 6000x4000 pixels is too big". A 2000x2000 image has a small footprint, and also sending a 1000x1000 and a 500x500 version created client-side from the original (which is basically five lines of code) does not put an impossibly huge load on the person uploading the image.

Especially tying into the second point: if we already ask for a reasonably sized image, making their browser upload the smaller versions too adds only a small amount of extra data. Instead of sending up 500KB of image, they'll now send 650KB. That's hardly a difference worth caring about.

alanmoo commented 6 years ago

It's not even images that large that are causing problems. This is a 50MB page, and the biggest image is about 1000x1000. But when we have a large number of them displaying at once, 1.8MB per image adds up real quick. We can't expect users to resample images themselves; we want to encourage them to upload, not put roadblocks in their way.

The biggest point I can make, though, is that we don't actually know how we might use these images in the future (new layouts, hero banners, etc.), so we should take what users give us and do the work on our side.

@cadecairos Is this an option?

alanmoo commented 6 years ago

Based on the conversation in #378:

Talking with Taís and looking at Pulse & Foundation, it looks like we should start with at least these 2 sizes generated on the server:

Small: 700px width (max)
Large: 1460px width (max)

(If possible we should have the server convert to JPEG as well)
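A minimal sketch of that server-side step, assuming Pillow (nothing in the thread has settled on a library at this point), generating the two derivative widths and converting to JPEG:

```python
# Sketch: produce "small" (700px max width) and "large" (1460px max width)
# JPEG copies of an uploaded image, leaving narrower images at their original
# width. Pillow is an assumption, not a decision made in this thread.
from io import BytesIO

from PIL import Image

SIZES = {'small': 700, 'large': 1460}


def make_derivatives(original_bytes):
    """Return a dict of JPEG-encoded image bytes keyed by size name."""
    derivatives = {}
    for name, max_width in SIZES.items():
        image = Image.open(BytesIO(original_bytes))
        if image.width > max_width:
            ratio = max_width / image.width
            image = image.resize(
                (max_width, round(image.height * ratio)),
                Image.LANCZOS,
            )
        buffer = BytesIO()
        # Convert to RGB so PNGs with an alpha channel can be saved as JPEG.
        image.convert('RGB').save(buffer, format='JPEG', quality=85)
        derivatives[name] = buffer.getvalue()
    return derivatives
```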

xmatthewx commented 6 years ago

Do those sizes account for @2x? Seems like they do.

Let's add a 3rd tiny icon size. One use case would be to put tiny faces next to the names listed below a project. These should be cropped square. 96px? 144px? Maybe good as a multiple of 16, our base font size.
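That third size could be a square centre-crop, for example with Pillow's ImageOps.fit (again just a sketch; 144px is one of the candidate sizes floated above, not a decision):

```python
# Sketch: a square-cropped "tiny" rendition for avatar-style use, scaled and
# centre-cropped to 144x144 (a multiple of 16, per the suggestion above).
from io import BytesIO

from PIL import Image, ImageOps

TINY_EDGE = 144


def make_tiny_thumbnail(original_bytes):
    image = Image.open(BytesIO(original_bytes))
    # ImageOps.fit scales the image and crops it to exactly the requested size.
    tiny = ImageOps.fit(image, (TINY_EDGE, TINY_EDGE), Image.LANCZOS)
    buffer = BytesIO()
    tiny.convert('RGB').save(buffer, format='JPEG', quality=85)
    return buffer.getvalue()
```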

gideonthomas commented 6 years ago

What's the strategy we're taking with this? Cloudinary? AWS serverless formation?

alanmoo commented 6 years ago

I believe it's server resizing. @cadecairos can confirm, we talked about it for a bit last week.

patjouk commented 6 years ago

I've started working on this issue: I'll give Cloudinary a try.

gideonthomas commented 5 years ago

@patjouk are you still working on this ticket? If not, we might want to experiment a bit here with libraries like django-imagekit or django-image-cropping instead of Cloudinary (both seem well used in Django projects). We can then compare results (since Cloudinary is used on the foundation site) and see which one works best for us.
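As a rough idea of what the django-imagekit route could look like (the model and field names below are illustrative, not pulse's actual models), the sizes discussed above map onto non-destructive spec fields derived from the original upload:

```python
# Hypothetical django-imagekit sketch: derive the discussed renditions from
# the original upload; imagekit generates and caches them on first access.
from django.db import models
from imagekit.models import ImageSpecField
from imagekit.processors import ResizeToFill, ResizeToFit


class Entry(models.Model):
    thumbnail = models.ImageField(upload_to='images/')

    thumbnail_small = ImageSpecField(
        source='thumbnail',
        processors=[ResizeToFit(width=700)],
        format='JPEG',
        options={'quality': 85},
    )
    thumbnail_large = ImageSpecField(
        source='thumbnail',
        processors=[ResizeToFit(width=1460)],
        format='JPEG',
        options={'quality': 85},
    )
    thumbnail_tiny = ImageSpecField(
        source='thumbnail',
        processors=[ResizeToFill(144, 144)],
        format='JPEG',
        options={'quality': 85},
    )
```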

Things to look out for: latency and memory usage.

Long-term, I think we should still put the pulse API, front-end, and foundation site behind a cache-based CDN like CloudFront, because that will definitely help with load-time issues.

cadecairos commented 5 years ago

@patjouk is going to look into how something like https://kraken.io/features could give us the image-optimization processing we need, at a reasonable price, without an expensive CDN (like the one we get with Cloudinary).

patjouk commented 5 years ago

I'll open a ticket on mofo-devops to discuss image handling in one place: right now, it's spread out in too many tickets.

patjouk commented 5 years ago

Closing this in favor of MozillaFoundation/mofo-devops#636