buttondown / roadmap

Buttondown's public roadmap

Store images pulled from iFramely in our own CDN #3450

Open catdevnull opened 2 weeks ago

catdevnull commented 2 weeks ago

[Image: Screenshot 2024-09-13 at 6 59 25 PM]

catdevnull commented 2 weeks ago

Now TikTok embeds aren't loading at all in emails :/ even in the marketing page. Not sure what regressed. Looking into it

It seems that it fixed itself after a few hours 🤷 maybe an intermittent issue with iframely

catdevnull commented 2 weeks ago

It seems that TikTok's thumbnail URLs are IP-specific or something similar. Example: https://p16-sign-va.tiktokcdn.com/obj/tos-maliva-p-0068/82eb60c8e35e45cc9e8208c254cc244e_1680895330?x-expires=1713117600&x-signature=oYnzq5DmewcFor2woi5HcEu7JRs%3D

[Image: Screenshot 2024-09-16 at 6 52 31 PM]

I don't think there's a good way to solve this; we could probably scrape TikTok ourselves, but I believe that's a stretch? Otherwise we need to hide TikTok thumbnails from emails, which sucks.

unrelated: hotlinking

Apparently we are doing something that iframely doesn't recommend

"A preview image, usually smaller size, but not guaranteed. We recommend that you do not hot-link third party images on your site. There’s Camo, for example."

https://iframely.com/docs/links

Even then, if we put a reverse proxy in front of it, it probably won't work, because the URL is likely locked to iframely's servers' IPs.

catdevnull commented 2 weeks ago

Took a quick look into scraping. I don't know how stable this would be, but it seems fairly trivial:

  1. GET the HTML for the page
  2. Parse the JSON inside script#__UNIVERSAL_DATA_FOR_REHYDRATION__
  3. parsed['__DEFAULT_SCOPE__']['webapp.video-detail']['itemInfo']['itemStruct'].video.cover is the URL
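
Steps 2 and 3 above could be sketched roughly like this. This is a minimal sketch, not tested against live TikTok pages: the function name `extract_cover_url` and the helper class are hypothetical, step 1 (fetching the HTML) is left to the caller, and the JSON path is only the one listed above, which TikTok could change at any time.

```python
import json
from html.parser import HTMLParser

# Script id taken from step 2 above.
SCRIPT_ID = "__UNIVERSAL_DATA_FOR_REHYDRATION__"


class _ScriptExtractor(HTMLParser):
    """Collects the text content of the <script> tag with a given id."""

    def __init__(self, script_id):
        super().__init__()
        self._script_id = script_id
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == self._script_id:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_cover_url(html):
    """Return the video cover URL from a TikTok page's HTML, or None.

    Hypothetical helper: returns None whenever the script tag or any
    key on the expected path is missing, since the page layout is not
    a stable API.
    """
    parser = _ScriptExtractor(SCRIPT_ID)
    parser.feed(html)
    parser.close()
    raw = "".join(parser.chunks)
    if not raw:
        return None
    try:
        parsed = json.loads(raw)
        item = parsed["__DEFAULT_SCOPE__"]["webapp.video-detail"]["itemInfo"]["itemStruct"]
        return item["video"]["cover"]
    except (KeyError, TypeError, json.JSONDecodeError):
        return None
```

Returning None instead of raising means a layout change on TikTok's side would degrade to "no thumbnail" rather than break email rendering.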

But then we would have to upload the cover somewhere.. and cache it.. 🤔

@jmduke would like your input on this

jmduke commented 2 weeks ago

Huh, fascinating. So for you https://buttondown.com/features/integrations/tiktok does not currently work? (It does on my end, which points to, as you say, some sort of IP/geo-based perms which makes sense because we're hitting a CDN.)

In that case, I think we can combine this issue with the hotlinking guidance and just re-upload thumbnails to our own S3 bucket.

catdevnull commented 2 weeks ago

Well, it is working now, lol... but yesterday, it looked like the first post.

Maybe, instead, what's happening is that iframely is caching a thumbnail url that expires (but isn't IP-locked) and eventually refreshes it. That might be more likely. Either way, the un-hotlinking stuff should help prevent it in most cases.
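
One way to check the expiry theory: the example thumbnail URL earlier in the thread has an `x-expires` query parameter. A rough sketch of reading it, assuming (unconfirmed) that the value is a Unix timestamp in seconds; the function names here are made up for illustration.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlparse


def thumbnail_expiry(url):
    """Return the x-expires value of a URL as a UTC datetime, or None.

    Assumption: x-expires is a Unix timestamp in seconds. That is a
    guess based on the example URL, not documented behavior.
    """
    values = parse_qs(urlparse(url).query).get("x-expires")
    if not values:
        return None
    try:
        return datetime.fromtimestamp(int(values[0]), tz=timezone.utc)
    except (ValueError, OverflowError, OSError):
        return None


def is_expired(url, now=None):
    """True if the URL carries an x-expires value that is in the past."""
    expiry = thumbnail_expiry(url)
    if expiry is None:
        return False
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expiry
```

If the assumption holds, running `is_expired` on the URL iframely returns at the moment the embed breaks would tell the two theories apart: an expired timestamp points at a stale cache, a still-valid one points back at IP locking.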

Should I do the reuploading stuff? I'm not familiar with how we do these kinds of things, especially because we really want to deduplicate stuff (and ideally store where we got it from.)
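
For the deduplication part, one possible approach is to key each stored image by a hash of its bytes and keep the source URLs alongside it. A minimal sketch, with an in-memory dict standing in for the bucket; `ThumbnailStore`, `StoredImage`, and the `thumbnails/` prefix are hypothetical names, not how Buttondown actually does uploads.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredImage:
    """Bookkeeping for one stored image (hypothetical shape)."""
    key: str
    source_urls: set = field(default_factory=set)


class ThumbnailStore:
    """In-memory stand-in for a bucket plus a metadata table."""

    def __init__(self):
        self.blobs = {}    # key -> image bytes
        self.records = {}  # key -> StoredImage

    def save(self, content: bytes, source_url: str) -> str:
        """Store the image once per distinct content; return its key."""
        # Content-addressed key: identical bytes always map to the same
        # key, no matter which URL they were fetched from.
        key = "thumbnails/" + hashlib.sha256(content).hexdigest()
        if key not in self.blobs:
            self.blobs[key] = content
            self.records[key] = StoredImage(key=key)
        # Remember where we got it from, even for duplicates.
        self.records[key].source_urls.add(source_url)
        return key
```

Hashing the content rather than the URL matters here because the signed thumbnail URLs change over time, so the same image would otherwise be stored once per signature.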

jmduke commented 2 weeks ago

Yup, that all tracks.

I'd say feel free to leave it for now. I'll retitle this issue to reflect the scope creep and think a bit about the right way to approach it.