lexborisov / myhtml

Fast C/C++ HTML 5 Parser. Using threads.
GNU Lesser General Public License v2.1

CDATA hang #156

Closed by weasel009 6 years ago

weasel009 commented 6 years ago

Hi Alexander,

It appears that the tokenizer hangs when <![CDATA[ shows up in HTML. The relevant code in tokenizer.c:

    // CDATA sections can only be used in foreign content (MathML or SVG)
    if(strncmp(tagname, "[CDATA[", 7) == 0) {
        if(tree->current_qnode->prev && tree->current_qnode->prev->args)
        {
            myhtml_tree_wait_for_last_done_token(tree, tree->current_qnode->prev->args);
            myhtml_tree_node_t *adjusted_current_node = myhtml_tree_adjusted_current_node(tree);
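
For readers unfamiliar with the check on the first line of that excerpt, here is a minimal standalone sketch of the same strncmp comparison. It is not myhtml code: the helper name starts_cdata_section is hypothetical, and it assumes the pointer refers to the text immediately following "<!". It illustrates only the marker detection, not the cause of the hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of myhtml.
 * Reports whether the text that follows "<!" begins a CDATA section,
 * using the same strncmp comparison as the tokenizer excerpt. */
bool starts_cdata_section(const char *after_bang)
{
    assert(after_bang != NULL);

    /* strncmp stops at the first NUL byte, so inputs shorter than
     * seven characters are compared safely and simply do not match. */
    return strncmp(after_bang, "[CDATA[", 7) == 0;
}
```

Under these assumptions, starts_cdata_section("[CDATA[x]]>") is true, while starts_cdata_section("DOCTYPE html>") and starts_cdata_section("") are false.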

I have attached a sample file that reproduces the problem: fourmilab-ch-babbage-pascal.txt

Regards!

lexborisov commented 6 years ago

Hello @weasel009! I understand what the problem is, and I'll try to fix it tomorrow. Thanks!

lexborisov commented 6 years ago

@weasel009 Fixed in commit c97bfba.