thephpleague / html-to-markdown

Convert HTML to Markdown with PHP
MIT License

Functionality to establish a list of allowed tags to keep its attributes #100

Closed IsraelSanabriaMx closed 6 years ago

IsraelSanabriaMx commented 8 years ago

Changes

Default Values

MrPetovan commented 6 years ago

Why did you remove so many test cases?

colinodell commented 6 years ago

Thank you for contributing this feature idea, and my sincerest apologies for taking so long to review it.

I like the idea of using configuration options to allow certain HTML tags and attributes to be preserved, but I'm not 100% convinced this is the best approach.

IMHO, we should provide two different options with similar semantics. Let's call them whitelist (replacing what you suggested) and blacklist (replacing remove_nodes and possibly strip_tags) for now. Both would take an array of tag and attribute selectors like this:

array('address', 'a[href]', '[src]', 'meta[charset=utf-8]')

This would match:

I think this approach would give much better control as it would also allow us to match/keep/discard certain attributes only on certain elements, but of course it would also be more complex to implement.
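To make the idea concrete, here is a minimal sketch of how such selectors might be parsed and matched. It reads them as CSS-style attribute selectors (a bare tag name, a tag plus attribute, a bare attribute, or an attribute with an exact value), which is an assumption about the intended semantics. The function names parseSelector and selectorMatches are hypothetical and are not part of html-to-markdown.

```php
<?php
// Hypothetical helpers for illustration only; not part of the library.

/**
 * Parse a selector such as 'address', 'a[href]', '[src]' or
 * 'meta[charset=utf-8]' into tag / attribute / value parts.
 * Returns null when the selector is not in one of these forms.
 */
function parseSelector(string $selector): ?array
{
    $pattern = '/^([a-z][a-z0-9]*)?(?:\[([a-z][a-z0-9-]*)(?:=([^\]]*))?\])?$/i';
    if ($selector === '' || !preg_match($pattern, $selector, $m)) {
        return null;
    }

    return array(
        'tag'   => ($m[1] ?? '') !== '' ? strtolower($m[1]) : null,
        'attr'  => ($m[2] ?? '') !== '' ? strtolower($m[2]) : null,
        'value' => isset($m[3]) ? $m[3] : null,
    );
}

/**
 * Check a parsed selector against a tag name and its attributes.
 * Attribute names in $attributes are assumed to be lowercase.
 */
function selectorMatches(array $sel, string $tag, array $attributes): bool
{
    // A tag constraint, when present, must match the element name.
    if ($sel['tag'] !== null && $sel['tag'] !== strtolower($tag)) {
        return false;
    }
    // No attribute constraint: the tag match alone is enough.
    if ($sel['attr'] === null) {
        return true;
    }
    if (!array_key_exists($sel['attr'], $attributes)) {
        return false;
    }
    // A value constraint, when present, must match exactly.
    return $sel['value'] === null || $attributes[$sel['attr']] === $sel['value'];
}

// Usage: parse the example list once, then test elements against it.
$whitelist = array_map('parseSelector', array('address', 'a[href]', '[src]', 'meta[charset=utf-8]'));
```

A converter could then walk each element and keep or drop tags and attributes depending on whether any whitelist or blacklist entry matches. How that decision maps onto the existing remove_nodes and strip_tags behaviour is left open here.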

So while I do appreciate the work you've put into this, I would prefer to see something a little more unified and powerful. Thank you though!