coleifer / micawber

a small library for extracting rich content from urls
http://micawber.readthedocs.org/
MIT License

Option to convert only single-line links #96

Closed: loleg closed this issue 3 years ago

loleg commented 3 years ago

The solution in issue #29 does not fix the problem of having provider links inside of markup (or Markdown, for that matter) somewhere within the line. The resulting code is still mangled, in particular when micawber is combined with a Markdown renderer (Misaka).

In this custom filter I took parts of parsers.py, essentially removing the else: line = parse_text_full .. block, which I think should be optional.

In the linked commit I also add my own line-parser for performance reasons, but I'll soon learn to register my own provider and fix that ;-)

coleifer commented 3 years ago

When you're using Markdown, it is strongly recommended to convert the Markdown to HTML first, then run oembed on the resulting markup (with "html=True").

loleg commented 3 years ago

@coleifer that is what I tried first, but I had even more issues with that strategy. Thanks for considering the proposal, but I am going to stick to my text-first approach :) Perhaps having a micawber+Markdown example somewhere could help me and others?

coleifer commented 3 years ago

but I had even more issues with that strategy

Can you be specific? I've used markdown-text -> markdown -> micawber on many different projects and have not run into any issues.

If you're trying to match only links on their own line, then you're probably best off doing something similar to parsers.parse_text() and just skipping the case where the line does not match a URL.
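As a rough illustration of that suggestion, here is a minimal standalone sketch that does not use micawber at all. The names `STANDALONE_URL_RE`, `embed_standalone_links` and `fake_render` are hypothetical, and the regex is only an assumption about what counts as a "standalone" link (one http(s) URL alone on a line); in a real filter the `render` callback would do the provider lookup.

```python
import re

# Assumption: a "standalone" link is a line holding one http(s) URL and nothing else.
STANDALONE_URL_RE = re.compile(r'^\s*(https?://\S+)\s*$')


def embed_standalone_links(text, render):
    """Replace lines that consist solely of a URL; leave every other line untouched."""
    out = []
    for line in text.splitlines():
        match = STANDALONE_URL_RE.match(line)
        if match:
            # Only whole-line URLs are handed to the renderer.
            line = render(match.group(1))
        # Lines with inline links (e.g. inside Markdown) fall through unchanged.
        out.append(line)
    return '\n'.join(out)


def fake_render(url):
    # Hypothetical stand-in for a real oEmbed lookup; makes no network request.
    return '<div class="embed" data-url="%s"></div>' % url


sample = "See [the video](https://example.com/v/1) inline.\nhttps://example.com/v/1"
print(embed_standalone_links(sample, fake_render))
```

Because the Markdown link on the first line is never touched, the text can still be passed to a Markdown renderer afterwards without mangled markup.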

coleifer commented 3 years ago

I've gone ahead and made a small change to the parse_text() function so that, if block_handler=None, inline links will not be handled. So you can force micawber to handle only standalone links by using parse_text(block_handler=None).