supaphon / ticpp

Automatically exported from code.google.com/p/ticpp

Stack overflow with large XML Documents in VS2008 #12

Closed. GoogleCodeExporter closed this issue 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Create a ticpp::Document with ~30000 elements, each with subelements
2. Wait for the destructor to call DeleteSpawnedWrappers()

What is the expected output? What do you see instead?
expected: no problems
instead:
stack overflow; apparently DeleteSpawnedWrappers is called recursively?

What version of the product are you using? On what operating system?
SVN commit 81. Visual Studio 2008

Please provide any additional information below.
My XML document is composed of several thousand elements (~20k), each with
subnodes (the problem exists with both subelements and attributes). The
resulting XML document is 6 MB in size after being dumped to disk using
ostream. Reading the same file with istream results in 80 MB of memory usage
(a real stream-based node reader would be appreciated). The document is still
read properly, but on destruction VS2008 crashes with a stack overflow.

Original issue reported on code.google.com by tornh...@gmail.com on 5 Jan 2008 at 7:59

GoogleCodeExporter commented 8 years ago
We have exactly the same problem with revision 86 and VS2003

A reasonably large XML file that is flat (three levels); the second level has
several thousand elements. It looks like this:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<DRUID-INGREDIENTS version="1.0">
    <Header copyright="">
        <Application version="1.0.0" name="" />
    </Header>
    <AppTemplates>
        <DruidElement name="TDomain" id="1001" parent_id="1000" template_id="8" />
[...snip...]
        <DruidElement name="Name" id="638176" parent_id="1001" template_id="10" />
    </AppTemplates>
</DRUID-INGREDIENTS>

Here is my test code:
std::auto_ptr<ticpp::Document> load(const char *filename)
{
  std::auto_ptr<ticpp::Document> xmlDruidStructure(new ticpp::Document(filename));
  xmlDruidStructure->LoadFile();
  return xmlDruidStructure;
}

void test()
{
  std::auto_ptr<ticpp::Document> xmlDruidStructure = load("myfile.xml");
  ticpp::Iterator<ticpp::Element> elementIngredient =
      xmlDruidStructure->FirstChildElement("DRUID-INGREDIENTS");
  ticpp::Iterator<ticpp::Node> elementAppTemplates =
      elementIngredient->FirstChildElement("AppTemplates");
  ticpp::Iterator<ticpp::Node> elementDruidNode;
  for (elementDruidNode = elementAppTemplates->FirstChild(false);
       elementDruidNode != elementDruidNode.end();
       ++elementDruidNode)
  {
  }

  xmlDruidStructure->SaveFile("x.xml");
}

For us the problem has a *high* priority as I need to delete the structure to
find any memory errors before the release.

Thanks

Original comment by duncan-g...@linuxowl.com on 20 Feb 2008 at 2:17

GoogleCodeExporter commented 8 years ago
I have similar problems, and I think I have localized the issue. I use

for (ticpp::Element* elts = doc.FirstChildElement(false); elts != 0;
     elts = elts->NextSiblingElement(false))
{
   ...
}

to traverse the XML file. This way, the siblings chain up, and on destruction
they are deleted recursively via DeleteSpawnedWrappers.

The problem is that NextSiblingElement creates its helpers as spawns of the
current element instead of as spawns of the root element.
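
To make the chaining concrete, here is a minimal, self-contained sketch that does not use ticpp at all. `Wrapper`, `NextSibling` and `MaxTeardownDepth` are hypothetical stand-ins invented for illustration; the only thing modelled is the ownership pattern described above, where each wrapper owns the wrapper it spawned. Under that assumption, the destructor nesting grows by one level per sibling visited.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a wrapper object; NOT the real ticpp class.
struct Wrapper {
    static std::size_t depth;     // current destructor nesting level
    static std::size_t maxDepth;  // deepest nesting level observed

    // Wrappers spawned by this wrapper, owned by it.
    std::vector<std::unique_ptr<Wrapper>> spawned;

    // Models the reported pattern: the new wrapper is registered with the
    // *current* wrapper, not with a root object.
    Wrapper* NextSibling() {
        spawned.push_back(std::unique_ptr<Wrapper>(new Wrapper()));
        return spawned.back().get();
    }

    ~Wrapper() {
        ++depth;
        maxDepth = std::max(maxDepth, depth);
        spawned.clear();  // destroys children, which destroy theirs, and so on
        --depth;
    }
};

std::size_t Wrapper::depth = 0;
std::size_t Wrapper::maxDepth = 0;

// Walks `count` siblings pointer-style, destroys the root, and reports how
// deeply the destructors nested.
std::size_t MaxTeardownDepth(std::size_t count) {
    Wrapper::depth = 0;
    Wrapper::maxDepth = 0;
    {
        Wrapper root;
        Wrapper* cur = &root;
        for (std::size_t i = 0; i < count; ++i) {
            cur = cur->NextSibling();
        }
    }  // root goes out of scope here and tears down the whole chain
    return Wrapper::maxDepth;
}
```

In this model, calling `MaxTeardownDepth(n)` yields a nesting depth of `n + 1`, so a walk over tens of thousands of siblings needs tens of thousands of nested destructor frames, which is consistent with the stack overflow reported here.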

Element* Node::NextSiblingElement( const char* value, bool throwIfNoSiblings ) const
{
    TiXmlElement* sibling;
    if ( 0 == strlen( value ) )
    {
        sibling = GetTiXmlPointer()->NextSiblingElement();
    }
    else
    {
        sibling = GetTiXmlPointer()->NextSiblingElement( value );
    }

    if ( 0 == sibling )
    {
        if ( throwIfNoSiblings )
        {
            TICPPTHROW( "No Element Siblings found with value, '" << value << "', After this Node (" << Value() << ")" )
        }
        else
        {
            return 0;
        }
    }

    Element* temp = new Element( sibling );
    m_spawnedWrappers.push_back( temp ); //this is the problem

    return temp;
}
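
For comparison, a sketch of the alternative hinted at above, where every spawned wrapper is registered with a single root-level owner. This is only an illustration under assumed names (`Registry`, `FlatWrapper`, `Spawn`, `WalkSiblings` are all hypothetical and not part of ticpp); it shows why flat ownership keeps teardown depth constant, not how the library should be patched.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Registry;  // hypothetical root-level owner of all wrappers

// Hypothetical wrapper that owns nothing; it only knows its registry.
struct FlatWrapper {
    Registry* owner;
    explicit FlatWrapper(Registry* r) : owner(r) {}
    FlatWrapper* NextSibling();
};

struct Registry {
    // Every wrapper ever spawned lives in this one flat list.
    std::vector<std::unique_ptr<FlatWrapper>> wrappers;

    FlatWrapper* Spawn() {
        wrappers.push_back(std::unique_ptr<FlatWrapper>(new FlatWrapper(this)));
        return wrappers.back().get();
    }
};

// The sibling is registered with the root owner, not with `this`.
inline FlatWrapper* FlatWrapper::NextSibling() {
    return owner->Spawn();
}

// Walks `count` siblings and returns how many wrappers the registry holds.
std::size_t WalkSiblings(Registry& reg, std::size_t count) {
    FlatWrapper* cur = reg.Spawn();
    for (std::size_t i = 0; i < count; ++i) {
        cur = cur->NextSibling();
    }
    return reg.wrappers.size();
}
```

When the `Registry` is destroyed, the vector releases its elements in a plain loop and no wrapper destructor calls another, so the stack depth no longer depends on how many siblings were visited. The trade-off in this sketch is that wrappers live until the registry dies, so memory use grows with the number of nodes visited.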

I changed my for block to

ticpp::Iterator< ticpp::Element >e;
for ( e = e.begin( &doc); e != e.end(); e++ )
{ ... }

(taken from the tutorial), and that doesn't work either. The library's own
Iterator wrapper uses a similar construction. Argh.

Original comment by xrxi...@newmail.ru on 16 May 2008 at 8:14

GoogleCodeExporter commented 8 years ago
Okay, I patched the Iterator<T> class to manage the lifespan of the wrapper
objects itself and ran it on my huge files in the debugger. It seems to work;
at least there are no major leaks (after 50 repeated save/clear/load cycles of
the file, no measurable increase in memory use). These hacks are rather dirty,
although they should be robust. The pointer variant

for (ticpp::Element* elts = doc.FirstChildElement(false); elts != 0;
     elts = elts->NextSiblingElement(false))
{
   ...
}

will still cause trouble, but at least it no longer crashes, since I changed
the recursion in DeleteSpawned... to an iteration. (For huge files, this will
still be very slow!)
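
The attached patch is not reproduced in this thread, so the following is only a generic sketch of how a recursive teardown can be turned into an iteration with an explicit worklist. `Node`, `BuildChain` and `DeleteSpawnedIteratively` are hypothetical names, and the sketch makes no claim about how the actual patch is written or how fast it is.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical node with raw owning pointers; NOT the real ticpp class.
// It has no destructor of its own, so `delete` never recurses.
struct Node {
    std::vector<Node*> spawned;
};

// Builds the chained ownership that a pointer-style sibling walk produces.
void BuildChain(Node& root, std::size_t count) {
    Node* cur = &root;
    for (std::size_t i = 0; i < count; ++i) {
        Node* next = new Node();
        cur->spawned.push_back(next);
        cur = next;
    }
}

// Deletes every node reachable from root.spawned without recursion and
// returns the number of nodes deleted.
std::size_t DeleteSpawnedIteratively(Node& root) {
    std::vector<Node*> pending;
    pending.swap(root.spawned);  // take over root's children; root is left empty

    std::size_t deleted = 0;
    while (!pending.empty()) {
        Node* n = pending.back();
        pending.pop_back();
        // Queue the node's own children before freeing it.
        pending.insert(pending.end(), n->spawned.begin(), n->spawned.end());
        n->spawned.clear();
        delete n;
        ++deleted;
    }
    return deleted;
}
```

The worklist lives on the heap, so the call stack stays at a fixed depth no matter how long the chain is. Each node is pushed and popped once in this sketch.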

Original comment by xrxi...@newmail.ru on 17 May 2008 at 8:54

Attachments:

GoogleCodeExporter commented 8 years ago
Thank you for your effort; I have not had the time to investigate this.
I will try to analyze your patch soon.

Original comment by rjmy...@gmail.com on 17 May 2008 at 5:29

GoogleCodeExporter commented 8 years ago

Original comment by rjmy...@gmail.com on 21 May 2008 at 12:37

GoogleCodeExporter commented 8 years ago
See r94.
Sorry this took so long.
I would appreciate testing from all interested parties.
I attempted to duplicate the test posted by duncan-g...@linuxowl.com, with good results.

Original comment by rjmy...@gmail.com on 17 Jul 2008 at 3:03