ndmitchell / hexml

A bad XML parser
Other
19 stars 4 forks source link

Addition of unsafe calls #10

Closed dbeacham closed 7 years ago

dbeacham commented 7 years ago

When processing a large XML document with many attributeBy and children calls I see a massive speed improvement if I import the FFI calls as unsafe.

I don't know if these changes would want to be made directly to the existing FFI imports or if unsafe variants should be introduced but it would be a useful addition to the library.

ndmitchell commented 7 years ago

What invariants do unsafe FFI calls have to obey? Reading around, I think all the functions could be marked unsafe as I never call back into Haskell - is that your understanding? If so, I'd welcome a patch to add unsafe to the standard ones (or will do it myself given a bit more time).

ndmitchell commented 7 years ago

Also, when you say "massive" are we talking 10% or 10x? I think it's a good idea even if it's only 1%, but curious for my own knowledge.

dbeacham commented 7 years ago

It'll be very dependent on the XML you're running over but for my case I'm seeing

I think the basic guidance for adding unsafe is to make sure that they are small, pure, non-blocking functions.

This post and this github issue give a quick overview of where there can be complications but I don't think there are any such issues here.

I'll send over a pull request.

ndmitchell commented 7 years ago

My guess is we should add unsafe on:

hexml_document_error
hexml_document_node
hexml_node_children
hexml_node_attributes
hexml_node_child
hexml_node_attribute

But not:

hexml_document_parse
hexml_document_free
hexml_node_render

What do you think?

dbeacham commented 7 years ago

I agree. Although I suspect

hexml_document_error
hexml_document_node

won't see much benefit as they're only called once per document and so the overheads of safe calls don't add up.

ndmitchell commented 7 years ago

Yep, sounds right.

ndmitchell commented 7 years ago

Thanks! I'll make a release tonight.

ndmitchell commented 7 years ago

Released as 0.3.2. Thanks once again!