Open matuszeman opened 9 years ago
I like that!
Interesting. What would the use case for that feature be? Also, if we do it, say, for YouTube videos, what should be returned for YouTube user profile pages?
... We have the canonical URL in the response. I trust most people agree to use it as the identifier of the resource on the web, no?
Yes, canonical URLs seemed to be just what I needed, but as I think about this feature more, it could probably be renamed to "Providing unique identifier of PRIMARY content relative to a site".
Example:
https://www.youtube.com/watch?v=XOmwZopzcTA - represents a video page
https://www.youtube.com/watch?v=XOmwZopzcTA&list=PL53194065BA276ACA&index=9 - represents exactly the same video page in a playlist
I understand that both URLs above are perfectly valid as canonical URLs. The latter is the video in the context of a playlist.
My use case is: a user provides a URL, and my app should be able to check whether a reference to the "primary content" already exists in my DB. Because of this, my idea was to use a pair: the site name and a unique ID relative to that site.
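A minimal sketch of the (site name, ID) pair described above, for illustration only. The function name `primaryContentKey` and the returned `{site, id}` shape are hypothetical and not part of Iframely; only the YouTube `watch?v=` pattern from the example URLs is handled.

```javascript
// Hypothetical helper: derive a (site, id) pair for the primary content.
// Only YouTube watch URLs are recognized; everything else returns null.
function primaryContentKey(inputUrl) {
  const u = new URL(inputUrl);
  const host = u.hostname.replace(/^www\./, '');
  if (host === 'youtube.com' && u.pathname === '/watch') {
    const id = u.searchParams.get('v');
    // Playlist and index parameters are ignored on purpose.
    return id ? { site: 'youtube', id: id } : null;
  }
  return null;
}

const plain = primaryContentKey('https://www.youtube.com/watch?v=XOmwZopzcTA');
const inPlaylist = primaryContentKey(
  'https://www.youtube.com/watch?v=XOmwZopzcTA&list=PL53194065BA276ACA&index=9'
);
console.log(plain.id === inPlaylist.id); // true
```

Both example URLs map to the same pair, which is what a lookup in the DB would need.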
We have been trying to find a better answer to this use case for 3 years. It pops up every couple of months in one form or another. No luck so far.
Here's an example to show you the problem. Even for YouTube, if we give you an ID for the video, it will be the same for two URLs: ?v=... and ?v=...&t=... - a timed embed, which would have a different embed code. For Google Maps it would be zoom levels, etc.
That just shows you cannot trust IDs, or even canonical addresses, as the actual URL context is essential for embeds. Besides, for short links (say, Bitly), it will be faster if you just let Iframely complete the processing than if it returns a redirect to your app. Facebook does cache by og:url or canonical, but it comes at the cost of slower processing times.
We ended up making a decision that caching by exact URL will cover 99% of our use cases, and that it is good enough for us. At least for now.
That's actually what I'm after ... I want to be able to identify what content (uniquely identified per site) users share, based on the URL they provide. In my use case I don't care about zoom level or time information - it's just about identifying the primary content itself, which in the case of a YouTube video could be the video ID or any unique identifier for such an entity on the site.
I'm new to Iframely, but I checked https://github.com/itteco/iframely/blob/master/plugins/domains/youtube.com/youtube.video.js and it seems like it would be quite easy to provide this information from what we already have available there. Is there documentation I could use to learn more, and maybe experiment a bit and contribute a plugin?
For stats aggregation - I see your point. As for caching, it doesn't make sense: you would still need to make a request to Iframely to get this ID.
Even for stats, canonical would be a better and more universal source. You could take a hash of it for better indexing. By "canonical" I mean meta.canonical that is returned in Iframely JSON, or oembed.url, as ideally it is the same for the same video. Not the actual URL you send to the APIs.
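As a rough illustration of taking a hash of the canonical address for indexing, here is a minimal sketch using Node's built-in `crypto` module. The function name `canonicalKey` and the choice of SHA-256 are assumptions for this example; the thread does not prescribe a hash function.

```javascript
const crypto = require('crypto');

// Hypothetical helper: turn a canonical URL (e.g. the meta.canonical value
// from an Iframely response) into a fixed-length hex key for indexing.
// SHA-256 is an arbitrary choice here, not something the thread specifies.
function canonicalKey(canonicalUrl) {
  return crypto.createHash('sha256').update(canonicalUrl, 'utf8').digest('hex');
}

const key = canonicalKey('https://www.youtube.com/watch?v=XOmwZopzcTA');
console.log(key.length); // 64 hex characters
```

The key is only as stable as the canonical value itself, so two URLs share a key only if the response reports the same canonical address for both.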
Now, the problem with our YouTube plugin is that it doesn't give a canonical address at all. We will be fixing it soon, as well as making sure all other plugins give a consistent response.
If you experiment with it in the meantime, you could check this unfinished doc on how to write plugins.
@matuszeman
You can add

    getMeta: function(...) {
        return {ID: '...'};
    }

to any plugin.
The result data will then contain 'ID' in the 'meta' section of the response from that plugin.
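To make the suggestion above more concrete, here is a hedged sketch of what such a `getMeta` might look like for YouTube watch URLs. The `re` field and the `urlMatch` argument are assumptions about the plugin conventions and should be checked against the plugin-writing doc; the object name `youtubeIdPlugin` is hypothetical, and the snippet calls `getMeta` by hand rather than running inside Iframely.

```javascript
// Hypothetical plugin-shaped object. The `re` list and the `urlMatch`
// argument are assumptions, not confirmed by this thread.
const youtubeIdPlugin = {
  re: [/^https?:\/\/(?:www\.)?youtube\.com\/watch\?(?:[^#]*&)?v=([\w-]+)/i],

  getMeta: function (urlMatch) {
    // urlMatch[1] is the first capture group: the value of the v parameter.
    return { ID: urlMatch[1] };
  }
};

// Simulate the call manually, outside of Iframely.
const sampleUrl =
  'https://www.youtube.com/watch?v=XOmwZopzcTA&index=9&list=PL53194065BA276ACA';
const sampleMatch = sampleUrl.match(youtubeIdPlugin.re[0]);
const sampleMeta = youtubeIdPlugin.getMeta(sampleMatch);
console.log(sampleMeta.ID); // XOmwZopzcTA
```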
User story: As an API user, I want to get the unique ID of the content relative to a site.
Examples:
https://www.youtube.com/watch?v=XOmwZopzcTA&index=9&list=PL53194065BA276ACA ID: XOmwZopzcTA
https://soundcloud.com/jackedradio/afrojack-presents-jacked-radio-week-23 ID: 210140747
For a site where it's not possible to recognize the system ID, it would be based on a value from the URL:
https://soundcloud.com/jackedradio/afrojack-presents-jacked-radio-week-23 ID: jackedradio/afrojack-presents-jacked-radio-week-23
What do you think?