iamcal / oembed

The oEmbed Spec
http://oembed.com
MIT License
1.32k stars 651 forks source link

Provider registry contains API endpoints with patterns not described in the spec #678

Closed mrogaski closed 1 year ago

mrogaski commented 1 year ago

Some of the API endpoints in the provider registry deviate from the published specification.

    {
        "provider_name": "MediaLab",
        "provider_url": "https://www.medialab.co/",
        "endpoints": [
            {
                "schemes": [
                    "https://*.medialab.app/share/watch/*",
                    "https://*.medialab.co/share/watch/*",
                    "https://*.medialab.app/share/social/*",
                    "https://*.medialab.co/share/social/*",
                    "https://*.medialab.app/share/embed/*",
                    "https://*.medialab.co/share/embed/*"
                ],
                "url": "https://*.medialab.(co|app)/api/oembed/",
                "discovery": true
            }
        ]
    },

Here, the endpoint URL contains a wildcard character and a regex-like grouped alternative. This is interesting because the spec makes no mention of using patterns in the API endpoint of the configuration.

It's not clear what the behavior should be. Should the host portion be set to the host matching the URL scheme? If a URL matches a scheme, but the pattern does not satisfy the pattern in the API endpoint, should an error be raised?

Thank you.

iamcal commented 1 year ago

Thank you for spotting this. There are 8 different providers that have the same issue:

[url] => http://play.adpaths.com/oembed/*
[url] => https://codehs.com/api/sharedprogram/*/oembed/
[url] => https://www.ora.tv/oembed/*?format={format}
[url] => https://*.slateapp.com/api/v2/oembed
[url] => https://*.didacte.com/cards/oembed
[url] => http://*.wiredrive.com/present-oembed/
[url] => https://*.medialab.(co|app)/api/oembed/
[url] => https://*.pitchhub.com.com/en/public/oembed

I'm not even sure what the intention is for some of these - they don't match the spec and use wildcards in different ways.

I'm going to try and repair them all as best I can.

mrogaski commented 1 year ago

I interpret the https://www.ora.tv/oembed/*?format={format} pattern to be a way to accomplish this statement in 2.2.

Note: Providers may choose to have the format specified as part of the endpoint URL itself, rather than as a query string parameter.

And the semantics for that could be easily clarified, but that should be explicit in the spec.

The wildcards would be complex to describe behavior for, but I suspect they all address a desire for a relative endpoint. That might be satisfied for most by extending the spec to allow a path-only URI to be given for a scheme which would use the host that the scheme matches.

The wildcard in the endpoint path strikes me as a bit more problematic, I would be cautions about allowing relative paths.

iamcal commented 1 year ago

The first 3 were actually easy - they don't need the wildcard at all (according to their example URLs). I'm left with 3 providers that want to do subdomain matching in the API URL, something there's no provision for in the catalog spec. Given that this has almost certainly never actually worked, I'm going to suspend/remove these providers for now.

iamcal commented 1 year ago

Landed in #679