Memory optimization of call stack while selecting nodes using Xpath

GoogleCodeExporter commented 9 years ago

I use pugixml on a realtime target system and it works great...

Except the issue that when I use Xpath to select particular nodes from the 
loaded DOM a huge portion of the stack is utilized.

(I think its due to extensive use of recursive function calls and auto-declared 
objects in the xpath - method "select_nodes"

Since the system has only a limited stack resource its impossible to process 
complex files.

Here is a sample usage:

xpath_node_set RootComIf;
RootComIf = pArDom_->select_nodes("/IOCFG/Module[(@Hardware = 
'EPL-V1')or(@Hardware = 'EPL-V2')or((@Hardware = 'X2X') and 
(substring-after(substring-after(@ID,'IF'),'IF')=''))]");

pArDom_ (refers to a DOM that is already loaded with xml data)

Only for processing the above line the system stack usage peaked to 5412 bytes.

Is there a way to optimize it? 

P.S: The memory optimisation options for DOM objects 
PUGIXML_MEMORY_PAGE_SIZE
PUGIXML_MEMORY_XPATH_PAGE_SIZE
were really useful in keeping the heap memory consumption to minimum.

And I would like to know the use of PUGIXML_MEMORY_OUTPUT_STACK (will it help 
in reducing the stack usage while processing a xpath query?)

Thank you very much - this is really a wonderful parser that I would use in all 
my new applications!

Original issue reported on code.google.com by srilux...@gmail.com on 15 Nov 2012 at 6:22

GoogleCodeExporter commented 9 years ago

PUGIXML_MEMORY_OUTPUT_STACK is only used for node printing (i.e. 
xml_node::print() and xml_document::save*). XPath queries are not affected - 
they don't use printing.

As for XPath stack usage, I'll look into it. XPath evaluator does use recursive 
calls extensively; making it stackless (and instead allocating the expression 
stack on heap) is doable, but will require significant restructuring. Maybe 
there are easier ways...

Original comment by arseny.k...@gmail.com on 15 Nov 2012 at 6:34

Changed state: Accepted

GoogleCodeExporter commented 9 years ago

Quick followup questions:
1. What is the value of PUGIXML_MEMORY_XPATH_PAGE_SIZE for your setup?
2. Do you use PUGIXML_NO_EXCEPTIONS? If yes, what is `sizeof(jmp_buf)' on your 
platform?

Original comment by arseny.k...@gmail.com on 15 Nov 2012 at 6:42

GoogleCodeExporter commented 9 years ago

Answers:
1.I tried with different values for PUGIXML_MEMORY_XPATH_PAGE_SIZE (ranging 
from 256 to 32768) but there was no difference in stack usage. But the peak 
heap memory consumed during the execution of the mentioned line was less, when 
the page size was less - as it allocated multiple times smaller memory chunks 
instead of onetime huge memory)...

2.Yes I use PUGIXML_NO_EXCEPTIONS (As I dont want to disable exceptions from 
pugi) and the sizeof(jmp_buf) of my target was 36 bytes

Original comment by srilux...@gmail.com on 16 Nov 2012 at 8:49

GoogleCodeExporter commented 9 years ago

Correction:

Sorry my report that "different values for PUGIXML_MEMORY_XPATH_PAGE_SIZE 
didn't make difference in stack usage" is dead wrong!!!

Actually there was a small bug? in the "pugiconfig.hpp" that popped up when 
using the configuration header-only version - in which I tested the application.

// Uncomment this to switch to header-only version
 #define PUGIXML_HEADER_ONLY
 #include "pugixml.cpp"

// Tune these constants to adjust memory-related behavior
 #define PUGIXML_MEMORY_PAGE_SIZE 1024
 #define PUGIXML_MEMORY_OUTPUT_STACK 10240
 #define PUGIXML_MEMORY_XPATH_PAGE_SIZE 256

the above code in "pugiconfig.hpp"  resulted in negligence of the definitions 
memory tuner constants - as the pugixml.cpp is included before the definition. 
(in header only mode) - u could fix this.

Now with properly set value for the PUGIXML_MEMORY_XPATH_PAGE_SIZE define I 
observe the following:

*with value of 4091 (default) - the stack consumption for the execution of the 
above mentioned line of code was - 14194

*with minimum recommended value 256 - the stack consumption was 6564

so there is a considerable influence of the PUGIXML_MEMORY_XPATH_PAGE_SIZE on 
the stack usage.

Awaiting for any helpful suggestion...Thanks!

Original comment by srilux...@gmail.com on 16 Nov 2012 at 12:00

GoogleCodeExporter commented 9 years ago

You're right, the header include order is wrong. I've fixed this in trunk - 
thanks for the report!

The stack size is very much platform-dependent; in order to optimize further, 
I'll need to know where are the problematic spots for your platform... Is the 
compiler on your platform gcc? If yes, does your version of gcc support 
-fstack-usage? If yes, can you compile with this option and attach the stack 
size report?

I'm seeing different behavir with different versions of gcc and compilation 
options - for example, for some versions and compilation settings changing this:

    PUGI__FN xpath_string convert_number_to_string(double value, xpath_allocator* alloc)

to this:

    static PUGI__NO_INLINE xpath_string convert_number_to_string(double value, xpath_allocator* alloc)

reduces the stack usage by ~2k, and some versions do not decide to inline this 
function so it does not matter.

I fear that this might be the only easy fix, but the stack usage report might 
uncover something interesting.

Original comment by arseny.k...@gmail.com on 17 Nov 2012 at 5:32

GoogleCodeExporter commented 9 years ago

Quick update.
For gcc, xpath_stack_allocator_capture (which is I think pretty much the only 
type with a destructor in XPath evaluation code) seems to prevent stack space 
sharing between different branches of if and switch statements. Which is 
unexpected to say the least...

A quick test rewrite reduces the peak evaluation stack from 4796 bytes to 2796 
bytes; however, the peak query compilation stack is 3616 bytes, which is the 
new bottleneck. I've slightly tweaked the parsing code to bring it down to 3488 
bytes.

I'm attaching the modified version for now. Here's what I want to know:

1. What are your stack constraints?
2. Given that pugixml supports precompiled xpath queries (i.e. you can compile 
xpath_query object anywhere in your program and later use it), can you 
precompile the query at a point in the program where stack space is a lesser 
concern? (although there are a few shortcuts in the recursive descent that I 
probably should implement to get the stack usage down anyway...)

Original comment by arseny.k...@gmail.com on 17 Nov 2012 at 6:22

Attachments:

pugixml-1.2-xpath-opt1.zip

GoogleCodeExporter commented 9 years ago

The xpath_allocator_capture object turns out to only be a problem if you don't 
disable exceptions explicitly during code generation via -fno-exceptions flag.
With -fno-exceptions this problem does not exist.

I've considerably optimized the stack consumption of query parser - for the 
query above, compiling the query now takes less than 2k of stack size (it's 
1400-1800 bytes depending on the compilation options.

Query compilation & evaluation now takes 2.7 kb stack space for me with the 
current trunk and with -fno-exceptions compilation option (and of course with 
MEMORY_XPATH_PAGE_SIZE set to 256). I'm looking into whether I can reduce that 
to be below 2k - of course, different compiler options and platform can change 
things, so -fstack-usage report will still help.

Original comment by arseny.k...@gmail.com on 18 Nov 2012 at 5:15

GoogleCodeExporter commented 9 years ago

I think that the existing optimizations are enough for now; with exceptions 
being disabled and an appropriate MEMORY_XPATH_PAGE_SIZE setup, the stack 
consumption is relatively modest; it's hard to optimize it more without a 
considerable rewrite of XPath implementation. Therefore I'm marking this Fixed.

Original comment by arseny.k...@gmail.com on 10 Feb 2013 at 7:50

Changed state: Fixed

amorlzu / pugixml

Memory optimization of call stack while selecting nodes using Xpath #187