What steps will reproduce the problem?
1. Emit large quantities of YAML using the default emitter:
    YAML::Emitter emitter;
    // emitter.SetStringFormat( YAML::Auto ); // this is the default setting
    // tons of emitter output
2. Emit large quantities of YAML using an emitter set to DoubleQuoted strings:
    YAML::Emitter emitter;
    emitter.SetStringFormat( YAML::DoubleQuoted );
    // tons of emitter output
3. Time both runs and note that setting DoubleQuoted improves performance by
a factor of 10.
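For step 3, a minimal timing sketch could look like the following. ElapsedMs is a hypothetical helper written for this report, not part of yaml-cpp; it only assumes that the emitter output for each case can be wrapped in a callable.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper (not part of yaml-cpp): runs `work` once and returns
// the elapsed wall-clock time in milliseconds.
template <typename Work>
double ElapsedMs(Work&& work) {
    // steady_clock is monotonic, so the difference cannot go negative.
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Wrapping the emitter output from step 1 and step 2 in a lambda each and passing it to ElapsedMs should give the two numbers to compare. Repeating each run a few times and taking the smallest value would probably reduce noise.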
What is the expected output? What do you see instead?
We expect the performance of the two modes to be competitive.
What version of the product are you using? On what operating system?
yaml-cpp version 0.3.0. Tested on MSVC11 (optimized and debug builds).
Please provide any additional information below.
When outputting a 10000-line file, the YAML::Auto string format takes approximately
1000 milliseconds, while YAML::DoubleQuoted takes only 100 milliseconds.
A quick dig points to the creation of the 'disallowed' RegEx within
emitterutils.cpp:IsValidPlainScalar(). Currently the RegEx is rebuilt on each
call and hits the heap repeatedly, because operator|| uses std::vector
internally. To speed this up I changed the code to:
    static const RegEx& disallowed_flow = Exp::EndScalarInFlow()
        || (Exp::BlankOrBreak() + Exp::Comment())
        || Exp::NotPrintable()
        || Exp::Utf8_ByteOrderMark()
        || Exp::Break()
        || Exp::Tab();
    static const RegEx& disallowed_block = Exp::EndScalar()
        || (Exp::BlankOrBreak() + Exp::Comment())
        || Exp::NotPrintable()
        || Exp::Utf8_ByteOrderMark()
        || Exp::Break()
        || Exp::Tab();
    const RegEx& disallowed = inFlow ? disallowed_flow : disallowed_block;
After applying this change, the speed difference is reduced to about 50%.
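To illustrate why the change helps, here is a self-contained sketch of the same caching pattern that does not depend on yaml-cpp. Matcher is a hypothetical stand-in for an expensive-to-build RegEx, and the character sets are illustrative only; they are not the real YAML plain-scalar rules. IsPlainRebuilt and IsPlainCached are likewise made-up names for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an expensive-to-build matcher such as a RegEx
// composed with operator||. This is NOT yaml-cpp's RegEx class.
struct Matcher {
    static int constructions;      // counts how often a Matcher is built
    std::vector<char> disallowed;  // heap-allocated, like the real RegEx

    explicit Matcher(const std::string& chars)
        : disallowed(chars.begin(), chars.end()) {
        ++constructions;
    }

    bool Matches(char c) const {
        for (char d : disallowed) {
            if (d == c) return true;
        }
        return false;
    }
};
int Matcher::constructions = 0;

// Builds a new matcher (and allocates) on every call.
bool IsPlainRebuilt(const std::string& s, bool inFlow) {
    const Matcher disallowed(inFlow ? ",[]{}#\t\n" : "#\t\n");
    for (char c : s) {
        if (disallowed.Matches(c)) return false;
    }
    return true;
}

// Builds both matchers once, on the first call, then reuses them.
bool IsPlainCached(const std::string& s, bool inFlow) {
    static const Matcher disallowed_flow(",[]{}#\t\n");
    static const Matcher disallowed_block("#\t\n");
    const Matcher& disallowed = inFlow ? disallowed_flow : disallowed_block;
    for (char c : s) {
        if (disallowed.Matches(c)) return false;
    }
    return true;
}
```

Calling IsPlainRebuilt 100 times constructs 100 matchers, while any number of calls to IsPlainCached constructs only two. One caveat: C++11 guarantees thread-safe initialization of function-local statics, but MSVC11 does not implement that guarantee, so the cached variant may need extra care if the emitter is used from several threads.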
Original issue reported on code.google.com by hoe...@gmail.com on 2 Oct 2012 at 12:49