thephpleague / html-to-markdown

Convert HTML to Markdown with PHP
MIT License

Final rendering may strip content #63

Closed marijnvdwerf closed 8 years ago

marijnvdwerf commented 8 years ago

Since the converter uses DOMDocument internally, its output needs to be sanitised. This happens in HtmlConverter::sanitize. Unfortunately, this step may also strip legitimate content, as the following example shows.

Input:

<pre><code>...
&lt;script type = "text/javascript"&gt;
function startTimer() {
   var tim = window.setTimeout("hideMessage()", 5000)
}
&lt;/head&gt;
&lt;body&gt;
...</code></pre>

Actual:

    ...
    <script type = "text/javascript">
    function startTimer() {
       var tim = window.setTimeout("hideMessage()", 5000)
    }

    ...

Expected:

    ...
    <script type = "text/javascript">
    function startTimer() {
       var tim = window.setTimeout("hideMessage()", 5000)
    }
    </head>
    <body>
    ...
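
To illustrate why this can happen, here is a minimal sketch. It assumes the sanitising step removes the wrapper tags that DOMDocument adds by plain string replacement. The function `sanitizeSketch` and its tag list are hypothetical stand-ins, not the library's actual code.

```php
<?php
// Hypothetical stand-in for the sanitising step (an assumption, not the library's code).
function sanitizeSketch(string $markdown): string
{
    // Decode entities so escaped markup inside code blocks becomes literal text.
    $markdown = html_entity_decode($markdown, ENT_QUOTES, 'UTF-8');

    // Assumed list of wrapper tags that DOMDocument adds around an HTML fragment.
    $unwanted = ['<html>', '</html>', '<head>', '</head>', '<body>', '</body>'];

    // Plain string replacement cannot tell wrapper tags apart from
    // identical strings that are part of the user's own content.
    return str_replace($unwanted, '', $markdown);
}

// Literal tags from a code sample are removed along with the wrappers.
echo sanitizeSketch("    }\n    </head>\n    <body>\n    ...");
```

Under these assumptions, the literal `</head>` and `<body>` lines of the code sample are emptied, which matches the actual output reported above.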

colinodell commented 8 years ago

Fixed via #101