coleifer / micawber

a small library for extracting rich content from urls
http://micawber.readthedocs.org/
MIT License

Allow compositing Providers #34

Closed jnovinger closed 9 years ago

jnovinger commented 10 years ago

We need to grab oembed data and would prefer to do it ourselves using the providers in micawber.providers.bootstrap_basic.

There are some services that don't provide endpoints (thinking of Facebook and Vine, in particular) or aren't defined in bootstrap_basic. We want to compose a ProviderRegistry instance which tries providers from bootstrap_basic first, falling back to oembedio or Embedly if nothing is found.

Our current (proposed) solution:

from micawber import bootstrap_basic, bootstrap_oembedio

# oembedio first so that basic providers overwrite oembedio providers
# a bit icky since it relies on internal registry implementation
providers = bootstrap_oembedio()
# assumes iterating a registry yields (regex, provider) pairs
for regex, provider in bootstrap_basic():
    providers.register(regex, provider)

That seems a bit ... circuitous. So, here are a few ways to provide composited ProviderRegistry instances that I can think of:

1) Use our proposed solution above, and note it in the docs.

2) Allow the various bootstrap_* funcs to take an optional registry argument that defaults to None, but is used if passed:

def bootstrap_basic(pr=None, cache=None):
    pr = pr or ProviderRegistry(cache)
    ...
    return pr

3) Extract the hard-coded endpoints in bootstrap_basic so that they're available to library users:

from micawber.providers import Provider, ProviderRegistry

# URL regex -> oembed endpoint
PROVIDERS = {
    r'http://blip.tv/\S+': 'http://blip.tv/oembed',
    ...
}

def bootstrap_basic(cache=None):
    pr = ProviderRegistry(cache)

    for regex, endpoint in PROVIDERS.items():
        pr.register(regex, Provider(endpoint))

    return pr
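
To make option 2 more concrete, here is a rough sketch of how composition might look from the caller's side. It does not use micawber itself: ToyRegistry, bootstrap_fallback, the `registry` keyword and the fallback.example endpoint are all made-up stand-ins, and the real signatures could differ.

```python
class ToyRegistry:
    """Hypothetical stand-in for ProviderRegistry: maps URL regex -> endpoint."""

    def __init__(self):
        self._registry = {}

    def register(self, regex, endpoint):
        # Registering the same regex twice overwrites the earlier entry.
        self._registry[regex] = endpoint


def bootstrap_fallback(registry=None):
    # Hypothetical catch-all bootstrap, standing in for an oembedio/Embedly one.
    pr = ToyRegistry() if registry is None else registry
    pr.register(r'https?://\S+', 'http://fallback.example/oembed')
    return pr


def bootstrap_basic(registry=None):
    # Reuse the registry that was passed in, otherwise start a fresh one.
    pr = ToyRegistry() if registry is None else registry
    pr.register(r'http://blip.tv/\S+', 'http://blip.tv/oembed')
    return pr


# Compose: fallback providers first, then the basic ones into the same registry.
fallback = bootstrap_fallback()
composed = bootstrap_basic(registry=fallback)
```

The sketch checks `registry is None` rather than using `pr or ...`, so a registry that happens to be empty (and might evaluate as falsy) would still be reused instead of silently replaced.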

Thoughts?

coleifer commented 9 years ago

I implemented the second option you suggested. Because the registry is unordered (dict), it is ambiguous which provider might be used if multiple regex matches exist. In a subsequent commit I may change the dict to a list to ensure predictable ordering.
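
For illustration only, a list-backed registry could make the lookup order explicit. This is a minimal sketch, not the library's code: OrderedRegistry and provider_for are hypothetical names, providers are plain strings, and "most recently registered wins" is one assumed precedence rule among several possible ones.

```python
import re


class OrderedRegistry:
    """Hypothetical list-backed registry with a well-defined match order."""

    def __init__(self):
        self._registry = []  # list of (regex, provider) tuples, in registration order

    def register(self, regex, provider):
        self._registry.append((regex, provider))

    def provider_for(self, url):
        # Walk newest-first so later registrations take precedence (assumed rule).
        for regex, provider in reversed(self._registry):
            if re.match(regex, url):
                return provider
        return None


registry = OrderedRegistry()
registry.register(r'https?://\S+', 'fallback')    # catch-all, registered first
registry.register(r'http://blip.tv/\S+', 'blip')  # specific, registered last

print(registry.provider_for('http://blip.tv/file/123'))   # blip
print(registry.provider_for('http://example.com/video'))  # fallback
```

With this rule, bootstrapping the catch-all service first and the basic providers second gives the "basic first, fall back otherwise" behaviour requested in the issue, without depending on dict ordering.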