certik / yaml-cpp

Automatically exported from code.google.com/p/yaml-cpp
MIT License

Performance issue using Auto string formatting #172

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

1. Emit large quantities of YAML using the default emitter:
 YAML::Emitter emitter;
 // emitter.SetStringFormat( YAML::Auto ); // this is the default setting
 // tons of emitter output

2. Emit large quantities of YAML using an emitter set to DoubleQuoted strings:
 YAML::Emitter emitter;
 emitter.SetStringFormat( YAML::DoubleQuoted );
 // tons of emitter output

3. Time the two runs and note that setting DoubleQuoted improves performance by 
about a factor of 10.

What is the expected output? What do you see instead?
We expect the performance of the two modes to be competitive.

What version of the product are you using? On what operating system?
yaml-cpp version 0.3.0. Tested on MSVC11 (optimized and debug builds).

Please provide any additional information below.
When outputting a 10,000-line file, the YAML::Auto string format takes 
approximately 1000 milliseconds, while YAML::DoubleQuoted takes only 100 milliseconds.
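
For anyone who wants to reproduce the timing, here is a minimal sketch of a harness. The time_ms helper is hypothetical (it is not part of yaml-cpp), and the emitter usage in the comment is only an outline of the two cases from the steps above, not the exact test that produced the numbers reported here.

```cpp
#include <chrono>

// Hypothetical helper (not part of yaml-cpp): runs `work` once and returns
// the elapsed wall-clock time in milliseconds.
template <typename Work>
double time_ms(Work work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Intended use with yaml-cpp (left as a comment because it needs the library):
//
//   const double auto_ms = time_ms([] {
//       YAML::Emitter emitter;              // YAML::Auto is the default
//       emitter << YAML::BeginSeq;
//       for (int i = 0; i < 10000; ++i) emitter << "some value";
//       emitter << YAML::EndSeq;
//   });
//
//   const double quoted_ms = time_ms([] {
//       YAML::Emitter emitter;
//       emitter.SetStringFormat(YAML::DoubleQuoted);
//       emitter << YAML::BeginSeq;
//       for (int i = 0; i < 10000; ++i) emitter << "some value";
//       emitter << YAML::EndSeq;
//   });
```

Running each case several times and comparing the fastest runs should give steadier numbers than a single measurement.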

A quick dig points to the creation of the 'disallowed' RegEx within 
emitterutils.cpp:IsValidPlainScalar(). Currently the RegEx is rebuilt on every 
call, and building it hits the heap repeatedly due to operator ||'s internal 
use of std::vector. To speed this up I changed the code to:

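// Built once, on the first call, and reused on every later call.
static const RegEx& disallowed_flow = Exp::EndScalarInFlow()
    || (Exp::BlankOrBreak() + Exp::Comment())
    || Exp::NotPrintable()
    || Exp::Utf8_ByteOrderMark()
    || Exp::Break()
    || Exp::Tab();

// Same set of alternatives, but using the block-context end-of-scalar rule.
static const RegEx& disallowed_block = Exp::EndScalar()
    || (Exp::BlankOrBreak() + Exp::Comment())
    || Exp::NotPrintable()
    || Exp::Utf8_ByteOrderMark()
    || Exp::Break()
    || Exp::Tab();

// Pick the prebuilt expression for the current context; nothing is allocated here.
const RegEx& disallowed = inFlow ? disallowed_flow : disallowed_block;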

After applying this change, the speed difference drops to about 50%.
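
To illustrate the idea outside of yaml-cpp, here is a small self-contained sketch of the same pattern. CharSet, is_plain_rebuild and is_plain_cached are hypothetical stand-ins, not yaml-cpp types or functions, and the character lists are made up for the example. The construction counter just makes it visible how often the vector-backed object gets built in each variant.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for an expensive-to-build matcher (not yaml-cpp's RegEx).
struct CharSet {
    static int constructions;  // how many times a CharSet was built from a string
    std::vector<char> chars;   // heap-backed storage, allocated on construction

    explicit CharSet(const std::string& s) : chars(s.begin(), s.end()) {
        ++constructions;
    }

    bool contains(char c) const {
        return std::find(chars.begin(), chars.end(), c) != chars.end();
    }
};
int CharSet::constructions = 0;

// Returns true if no character of `text` is in `disallowed`.
static bool none_disallowed(const std::string& text, const CharSet& disallowed) {
    for (char c : text) {
        if (disallowed.contains(c)) return false;
    }
    return true;
}

// Variant 1: rebuilds the disallowed set (and allocates) on every call.
bool is_plain_rebuild(const std::string& text, bool in_flow) {
    const CharSet disallowed(in_flow ? ",[]{}#\t\n" : "#\t\n");
    return none_disallowed(text, disallowed);
}

// Variant 2: builds each set once, on the first call, then only selects one.
bool is_plain_cached(const std::string& text, bool in_flow) {
    static const CharSet disallowed_flow(",[]{}#\t\n");
    static const CharSet disallowed_block("#\t\n");
    const CharSet& disallowed = in_flow ? disallowed_flow : disallowed_block;
    return none_disallowed(text, disallowed);
}
```

One caveat: initialization of function-local statics is only guaranteed to be thread-safe on compilers that implement the C++11 rule for it, so if the emitter is used from several threads it is worth checking whether the compiler in use (MSVC11 in my case) provides that guarantee.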

Original issue reported on code.google.com by hoe...@gmail.com on 2 Oct 2012 at 12:49

GoogleCodeExporter commented 9 years ago
Done, ra87d35a23be90e17f0b6ef55e56f54cf2bc6a690.

Original comment by jbe...@gmail.com on 24 Jan 2015 at 10:31