JuliaCloud / XMLDict.jl

XMLDict implements a simple Associative interface for XML documents.

Free xml doc #1

Closed (texasrocks closed this issue 8 years ago)

texasrocks commented 8 years ago

For large documents, it may be worth calling free(xml). I've been experimenting with this and noticed that memory usage steadily increases with larger files.

samoconnor commented 8 years ago

Thanks for bringing that to my attention @texasrocks.

I'm somewhat surprised. The LightXML.jl doc says, under the heading "Create an XML Document", that "When you create XML documents and elements directly you need to take care not to leak memory". However, under the heading "Read an XML file" there is no mention of memory management. XMLDict.jl only calls LightXML.parse_string, so it does not "create XML documents and elements directly".

I just found this PR, https://github.com/JuliaLang/LightXML.jl/pull/19, where @robertfeldt suggests that every new XMLDocument should be given a finalizer: finalizer(xmldoc, free).

Would you mind trying the following change with your larger XML files and letting me know if it helps? ...

Replace XMLDict.jl line 54 with:

function parse_xml(xml::AbstractString)
    doc = LightXML.parse_string(xml)
    # Free the underlying libxml2 document once `doc` is garbage-collected.
    finalizer(doc, LightXML.free)
    return wrap(doc)
end
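
Note for readers on Julia 1.0 or later: the snippet above uses the `finalizer(obj, f)` argument order that was current when this thread was written; Julia 1.x expects `finalizer(f, obj)`. The pattern itself can be sketched without LightXML, as below. `FakeDoc`, `release!`, `open_doc`, and `FREED` are hypothetical names made up for this illustration; they stand in for a wrapper around a natively allocated document and are not part of LightXML or XMLDict.

```julia
# Hypothetical stand-in for a wrapper around natively allocated memory.
# This is NOT LightXML's actual document type.
mutable struct FakeDoc
    ptr::Ptr{Cvoid}
end

# Counts how many times the native buffer has actually been released.
const FREED = Ref(0)

# Stand-in for a `free` function: releases the buffer at most once,
# so calling it by hand and again from the finalizer is harmless.
function release!(doc::FakeDoc)
    if doc.ptr != C_NULL
        Libc.free(doc.ptr)
        doc.ptr = C_NULL
        FREED[] += 1
    end
    return nothing
end

# Allocate a buffer and attach the finalizer before handing the object out.
function open_doc(nbytes::Integer)
    doc = FakeDoc(Libc.malloc(nbytes))
    # Julia 1.x argument order: finalizer(f, obj).
    finalizer(release!, doc)
    return doc
end

doc = open_doc(1024)
# Run the registered finalizer now instead of waiting for the garbage collector.
finalize(doc)
```

Making the release function idempotent is a design choice in this sketch: it lets a caller free a large document early while the finalizer remains as a safety net.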
samoconnor commented 8 years ago

@texasrocks, I've pushed a change that adds a finaliser to the result of LightXML.parse_string: https://github.com/samoconnor/XMLDict.jl/commit/146af78e7c28e1cbbbd79c1a8794a6c0d4b7fd18

Can you let me know if this helps?

See also https://github.com/JuliaLang/LightXML.jl/issues/45

texasrocks commented 8 years ago

That did the trick! I just parsed a 6 GB file and the memory usage was constant. Thank you - this is great.

Reading the LightXML issue though, I'm surprised as well. I'm just reading XML files, not creating them. However, I used LightXML a while back for another project and had a similar issue. Calling free(xdoc) solved the problem though. That's why I suggested it here. Regardless, it seems to be resolved.

Thanks again.

Brandon