chobie / php-sundown

php-sundown is just simple wrapper of sundown

Unable to filter html #34

Closed aleemb closed 10 years ago

aleemb commented 10 years ago

I tried the following but the HTML tags aren't being filtered.

$options = array("autolink"=>true, "tables"=>true, "filter_html"=>true);
$renderer = new \Sundown\Render\Base($options);
$md = new \Sundown\Markdown($renderer);

// outputs <strong>foo</strong>
$html = $md->render("<strong>foo</strong>");

Also tried another way but to no avail:

$md = new \Sundown\Markdown($renderer, ["filter_html"=>true]);

Am I missing something?

aleemb commented 10 years ago

I may have over-simplified the test case. The following is the test-case that doesn't filter the HTML:

// outputs <div style="font-weight: bold">testing</div>
$html = $md->render('<div style="font-weight: bold">testing</div>');
chobie commented 10 years ago

Hi, @aleemb

\Sundown\Render\Base does not support any options. In your case, please use \Sundown\Render\HTML or \Sundown\Render\XHTML.

<?php
$options = array("autolink"=>true, "tables"=>true, "filter_html"=>true);
$renderer = new \Sundown\Render\HTML($options);
$md = new \Sundown\Markdown($renderer);

// with filter_html enabled, the <strong> tags are stripped
$html = $md->render("<strong>foo</strong>");
echo $html;
// this will output:
// <p>foo</p>
aleemb commented 10 years ago

I ran into issues with the following case:

$html = $md->render('<div><b>hello</b></div>');
// <div><b>hello</b></div>

Basically, I don't want to allow any HTML. I'd like to strip each HTML element along with all of its content. Right now much larger fragments of HTML still get through (anything inside a block-level element). For user input I think it's best not to trust any embedded HTML, which is how most sites handle it. Currently I am stripping it with a regex in preprocess, but I'm hoping there is a faster way since this is such a common use case.
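For reference, the regex preprocessing workaround mentioned above could be sketched roughly as follows. This is only an illustration: the function name stripHtmlWithContent is hypothetical and not part of php-sundown, and it relies solely on PHP's built-in preg_replace and strip_tags. It assumes the goal is to drop paired elements together with their content before the text ever reaches the Markdown renderer.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of php-sundown): removes HTML elements
 * together with their content, then drops any tags that are left over.
 */
function stripHtmlWithContent(string $text): string
{
    // Matches <tag ...> ... </tag>, using a backreference so the closing
    // tag name must equal the opening one. "s" lets "." span newlines,
    // "i" makes tag names case-insensitive.
    $pattern = '~<([a-z][a-z0-9]*)\b[^>]*>.*?</\1\s*>~is';

    // Repeat until a pass removes nothing more.
    do {
        $result = preg_replace($pattern, '', $text, -1, $count);
        if ($result === null) {
            // Regex engine error (for example, backtrack limit): stop looping
            // and fall through to strip_tags() with what we have so far.
            break;
        }
        $text = $result;
    } while ($count > 0);

    // Remove unpaired or self-closing tags such as <br> or <img ...>.
    return strip_tags($text);
}

var_dump(stripHtmlWithContent('<div><b>hello</b></div>')); // string(0) ""
```

A regex cannot track nesting, so an element nested inside another element of the same name (a div inside a div, say) may leave part of its content behind. strip_tags will also remove Markdown autolinks written as <http://...>. For untrusted input, a real HTML parser or sanitizer is likely the safer choice.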

chobie commented 10 years ago

Ah, sorry. I found a bug. Let me fix it.

aleemb commented 10 years ago

Also, possibly related: I tried returning an empty string from the blockHTML callback, but an input of <div><b>hello</b></div> produced an output of hello. I would have expected the output to be empty. I'm not sure whether this is the correct behavior for blockHTML, but it appears to strip only the tags and leave the innerHTML in place.

I am very grateful to you for such an awesome project, thanks very much.

chobie commented 10 years ago

@aleemb I made some mistakes in the filter_html, no_images, and no_links render flags. I'll push sundown to PECL this weekend, as I want to add more test cases first.