lexborisov / myhtml

Fast C/C++ HTML 5 Parser. Using threads.
GNU Lesser General Public License v2.1

Ownership of pointers #165

Closed PopFlamingo closed 5 years ago

PopFlamingo commented 5 years ago

Hello, I just wanted to make sure that the only pointers that I need to deallocate are the ones I created explicitly through a call to a myhtml_*_create function?

Thank you

PopFlamingo commented 5 years ago

For instance, do you ever need to free a pointer returned by one of the other functions (that is, not a create function)?

lexborisov commented 5 years ago

Hi @adtrevor. Yes, every myhtml_*_create function has a matching myhtml_*_destroy function.

lexborisov commented 5 years ago

@adtrevor In lexbor (my new project) you can read how to work with objects: https://lexbor.com/docs/lexbor/#objects (The principle is the same in MyHTML.)

PopFlamingo commented 5 years ago

Thank you! I did notice from the examples that an HTML collection needs to be manually deallocated, even when you didn't init it yourself. Is this the only type returned by MyHTML functions that you need to deallocate yourself?

lexborisov commented 5 years ago

@adtrevor This is the only case when you need to destroy an object yourself.
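
The ownership rule described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: every `demo_*` name and the `live_objects` counter are hypothetical stand-ins, not the MyHTML API. The sketch shows a tree that owns its nodes, and a query function that returns a collection the caller has to destroy even though the caller never created it explicitly.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only; NOT the MyHTML API. */
typedef struct demo_node { int tag_id; } demo_node;

typedef struct demo_tree {
    demo_node *nodes;   /* nodes are owned by the tree */
    size_t     count;
} demo_tree;

typedef struct demo_collection {
    demo_node **list;   /* borrowed pointers into the tree */
    size_t      length;
} demo_collection;

int live_objects = 0;   /* objects created but not yet destroyed */

/* Created explicitly by the caller, so the caller destroys it. */
demo_tree *demo_tree_create(const int *tag_ids, size_t count)
{
    demo_tree *tree = malloc(sizeof *tree);
    if (tree == NULL) return NULL;
    tree->nodes = malloc((count ? count : 1) * sizeof *tree->nodes);
    if (tree->nodes == NULL) { free(tree); return NULL; }
    for (size_t i = 0; i < count; i++) tree->nodes[i].tag_id = tag_ids[i];
    tree->count = count;
    live_objects++;
    return tree;
}

/* Frees the tree and every node it owns. */
void demo_tree_destroy(demo_tree *tree)
{
    if (tree == NULL) return;
    free(tree->nodes);
    free(tree);
    live_objects--;
}

/* Returns a NEW collection: the caller must destroy it, although the
   caller never called a create function for it. */
demo_collection *demo_get_nodes_by_tag_id(demo_tree *tree, int tag_id)
{
    assert(tree != NULL);
    demo_collection *c = malloc(sizeof *c);
    if (c == NULL) return NULL;
    c->list = malloc((tree->count ? tree->count : 1) * sizeof *c->list);
    if (c->list == NULL) { free(c); return NULL; }
    c->length = 0;
    for (size_t i = 0; i < tree->count; i++)
        if (tree->nodes[i].tag_id == tag_id)
            c->list[c->length++] = &tree->nodes[i];
    live_objects++;
    return c;
}

/* Frees the pointer array only; the nodes still belong to the tree. */
void demo_collection_destroy(demo_collection *c)
{
    if (c == NULL) return;
    free(c->list);
    free(c);
    live_objects--;
}
```

In this sketch the node pointers stored in the collection are never freed individually: destroying the collection releases only the list, and destroying the tree releases the nodes.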

PopFlamingo commented 5 years ago

@lexborisov Thank you! One last question: is there a way to compare the identity of two nodes? Can I simply compare their memory addresses at any point in time? Or is there a dedicated function for that? Thanks! :)

lexborisov commented 5 years ago

@adtrevor

if(first_node->tag_id == two_node->tag_id && first_node->ns == two_node->ns) {
    /*  */
}

Comparing the addresses tells you only one thing: whether it is the same object or not.

PopFlamingo commented 5 years ago

@lexborisov Thank you!

PopFlamingo commented 5 years ago

@lexborisov Looks like I was unable to compare node identity with:

if(first_node->tag_id == two_node->tag_id && first_node->ns == two_node->ns) {
    /*  */
}

Wouldn't this rather compare equality with respect to tag name + namespace? I actually wanted to do an identity comparison, for instance by getting a unique ID for each node. Is there a way to do this?

Thank you! :)

PopFlamingo commented 5 years ago

For instance, in:

<!DOCTYPE html>
    <html>
        <head>
            <title>Foo</title>
            <meta charset="utf-8">
        </head>
        <body>
            <p class="foo bar">Hey</p>
            <p>Cools</p>
            <p></p>
            <div></div>
            <div></div>
        </body>
    </html>

I would like to be able to differentiate between the two div nodes.

PopFlamingo commented 5 years ago

After some tests it looks like using the pointer address does work for what I intended to do, but is this documented behaviour or an implementation detail? In other words, can I always safely use the pointer address for identity comparison?

Azq2 commented 5 years ago

Yes, you can use this safely. I use the same approach in my module: https://github.com/Azq2/perl-html5-dom/blob/master/DOM.xs#L2185

All nodes in a tree have a unique address in memory. This address does not change after the node is edited.

But what is the purpose? If you want a check like isSameNode, compare pointers. If you want a check like isEqualNode, you have to compare the nodes' contents.

Or, a slower but simpler way: serialize the two nodes to text using myhtml_serialization_tree_callback and compare the results with strcmp.
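
The serialize-and-compare idea could look roughly like the sketch below. It assumes the serializer delivers text in chunks to a callback of the shape `(const char *chunk, size_t size, void *ctx)`; that signature, the `demo_buffer` type, and the `demo_*` functions are assumptions made for illustration and should be checked against the MyHTML headers before being passed to myhtml_serialization_tree_callback.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Growable, NUL-terminated text buffer filled by the callback. */
typedef struct demo_buffer {
    char  *data;
    size_t length;
    size_t capacity;
} demo_buffer;

/* Hypothetical chunk callback: appends `size` bytes to the buffer
   passed in `ctx`. Returns 0 on success, 1 on allocation failure. */
unsigned int demo_append_chunk(const char *chunk, size_t size, void *ctx)
{
    demo_buffer *buf = ctx;
    assert(buf != NULL);

    size_t needed = buf->length + size + 1;   /* +1 for the NUL */
    if (needed > buf->capacity) {
        size_t new_cap = buf->capacity ? buf->capacity : 64;
        while (new_cap < needed) new_cap *= 2;
        char *grown = realloc(buf->data, new_cap);
        if (grown == NULL) return 1;
        buf->data = grown;
        buf->capacity = new_cap;
    }
    if (size > 0) memcpy(buf->data + buf->length, chunk, size);
    buf->length += size;
    buf->data[buf->length] = '\0';
    return 0;
}

/* Nonzero when both serializations are identical text. */
int demo_serialized_equal(const demo_buffer *a, const demo_buffer *b)
{
    if (a->data == NULL || b->data == NULL)
        return a->length == 0 && b->length == 0;
    return strcmp(a->data, b->data) == 0;
}
```

Each node would be serialized into its own zero-initialized `demo_buffer`, the two buffers compared, and both `data` pointers released with `free` afterwards. This compares markup text only, so it is slower than comparing pointers and answers a different question (equal content rather than same node).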

PopFlamingo commented 5 years ago

@Azq2 Thanks a lot! I had been searching for something like isSameNode indeed! Great!