majintao0131 / yaml-cpp

Automatically exported from code.google.com/p/yaml-cpp
MIT License

YAML::Emitter fails to quote scalars ending in colon correctly #85

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Q: What steps will reproduce the problem?
Try emitting any scalars ending in colon ':'.  For example:

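    // Test A: a top-level scalar ending in a colon
    YAML::Emitter emitterA;
    emitterA << "a:";
    std::cout << emitterA.c_str();

    // Test B: map values that consist only of a colon
    // (separate emitter name so both tests compile in one scope)
    YAML::Emitter emitterB;
    emitterB << YAML::BeginMap
             << YAML::Key << "apple"
             << YAML::Value << ":"
             << YAML::Key << "banana"
             << YAML::Value << ":"
             << YAML::EndMap;
    std::cout << emitterB.c_str();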

Q: What is the expected output? What do you see instead?
You should expect the scalar in Test A and the values in Test B to be quoted:

  --- "a:"

  ---
  apple: ":"
  banana: ":"

Instead, the colons don't get quoted:

  --- a:

  ---
  apple: :
  banana: :

which then causes the Parser to treat the colon values as map key-value
delimiters and throw YAML::ParserException.

Q: What version of the product are you using? On what operating system?
0.2.5 on Debian Lenny

Q: Please provide any additional information below.
We debugged this and came up with the following root cause and fix.  Please
let us know if this is the correct solution:

In IsValidPlainScalar() in emitterutils.cpp, the disallowed regex includes
Exp::EndScalar(), which is supposed to match a colon at the end of the string
so that the function returns false (and the string gets quoted).

            bool IsValidPlainScalar(const std::string& str, bool inFlow, bool allowOnlyAscii) {

                ...

                // then check until something is disallowed
                const RegEx& disallowed = (inFlow ? Exp::EndScalarInFlow() : Exp::EndScalar())
                                          || (Exp::BlankOrBreak() + Exp::Comment())
                                          || Exp::NotPrintable()
                                          || Exp::Utf8_ByteOrderMark()
                                          || Exp::Break()
                                          || Exp::Tab();
                StringCharSource buffer(str.c_str(), str.size());
                while(buffer) {
                    if(disallowed.Matches(buffer))
                        return false;
                    if(allowOnlyAscii && (0x7F < static_cast<unsigned char>(buffer[0])))
                        return false;
                    ++buffer;
                }

                return true;
            }

Specifically, EndScalar() is constructed to match a colon followed by a blank
or line break (":<space>"), or a colon at the end of the input (just ":"):

        inline const RegEx& EndScalar() {
            static const RegEx e = RegEx(':') + (BlankOrBreak() || RegEx());
            return e;
        }
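
To make the intended behaviour concrete, here is a minimal standalone sketch of what EndScalar() is meant to detect. It does not use yaml-cpp's RegEx class; the helper name containsColonTerminator is hypothetical, and the set of blank/break characters (space, tab, CR, LF) is an assumption about what BlankOrBreak() covers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of yaml-cpp: returns true if `s` contains a
// colon that is followed by a blank, a line break, or the end of the string.
bool containsColonTerminator(const std::string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] != ':')
            continue;
        if (i + 1 == s.size())
            return true;  // colon is the last character
        const char next = s[i + 1];
        if (next == ' ' || next == '\t' || next == '\n' || next == '\r')
            return true;  // colon followed by a blank or a break
    }
    return false;
}

// Small self-check of the intended behaviour.
void checkColonTerminator() {
    assert(containsColonTerminator("a:"));    // Test A from this report
    assert(containsColonTerminator(":"));     // values from Test B
    assert(containsColonTerminator("a: b"));  // colon followed by a space
    assert(!containsColonTerminator("a:b"));  // colon inside a plain word
    assert(!containsColonTerminator("plain"));
}
```

Under this reading, "a:" and ":" should both be rejected as plain scalars, which is the case the current emitter misses.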

The problem is that the left side of the sequence matches the colon correctly,
but the right side, which is a REGEX_OR container, is rejected by the
RegEx::IsValidSource() function because the source is at the end of the string:

    template<>
    inline bool RegEx::IsValidSource<StringCharSource>(const StringCharSource&source) const
    {
        return source || m_op == REGEX_EMPTY;
    }

In other words, after the colon is consumed, the match fails because the
engine thinks that the input is no longer valid, even though it is "valid"
enough to match the "empty" (aka end of string) regex.

The fix for this is to change IsValidSource() to the following:

    template<>
    inline bool RegEx::IsValidSource<StringCharSource>(const StringCharSource&source) const
    {
        switch(m_op) {
            case REGEX_MATCH:
            case REGEX_RANGE:
                return source;
            default:
                return true;
        }
    }

This makes sure that the source (input) is always valid for operator regexes
(OR/AND/NOT/SEQ) and for the empty regex, and that a source is only invalid if
it is at end of string and the regex is trying to match an actual character.
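
The difference between the two validity checks can be reproduced in a small toy model. This is only a sketch of the reasoning above, not yaml-cpp's code: ToyRegEx, the OP_* names, and every function below are hypothetical, and only the EMPTY, MATCH, OR and SEQ operators are modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for the regex engine (hypothetical, not yaml-cpp's RegEx).
enum Op { OP_EMPTY, OP_MATCH, OP_OR, OP_SEQ };

struct ToyRegEx {
    Op op;
    char ch;                       // used by OP_MATCH only
    std::vector<ToyRegEx> params;  // used by OP_OR and OP_SEQ
};

ToyRegEx Empty() { return ToyRegEx{OP_EMPTY, 0, {}}; }
ToyRegEx Char(char c) { return ToyRegEx{OP_MATCH, c, {}}; }
ToyRegEx Or(ToyRegEx a, ToyRegEx b) { return ToyRegEx{OP_OR, 0, {a, b}}; }
ToyRegEx Seq(ToyRegEx a, ToyRegEx b) { return ToyRegEx{OP_SEQ, 0, {a, b}}; }

// patched == false mirrors the original check, patched == true the proposed one.
bool isValidSource(const ToyRegEx& re, bool atEnd, bool patched) {
    if (!patched)
        return !atEnd || re.op == OP_EMPTY;
    return re.op == OP_MATCH ? !atEnd : true;
}

// Returns the number of characters matched at `pos`, or -1 on failure.
int match(const ToyRegEx& re, const std::string& s, std::size_t pos, bool patched) {
    const bool atEnd = pos >= s.size();
    if (!isValidSource(re, atEnd, patched))
        return -1;
    switch (re.op) {
        case OP_EMPTY:
            return atEnd ? 0 : -1;
        case OP_MATCH:
            return s[pos] == re.ch ? 1 : -1;
        case OP_OR:
            for (const ToyRegEx& p : re.params) {
                const int n = match(p, s, pos, patched);
                if (n >= 0)
                    return n;
            }
            return -1;
        case OP_SEQ: {
            int total = 0;
            for (const ToyRegEx& p : re.params) {
                const int n = match(p, s, pos + total, patched);
                if (n < 0)
                    return -1;
                total += n;
            }
            return total;
        }
    }
    return -1;
}

// ':' followed by a space, a newline, or end of input (simplified EndScalar).
ToyRegEx toyEndScalar() {
    return Seq(Char(':'), Or(Or(Char(' '), Char('\n')), Empty()));
}

// Scans the string the way IsValidPlainScalar() does, one position at a time.
bool toyContains(const ToyRegEx& re, const std::string& s, bool patched) {
    for (std::size_t pos = 0; pos < s.size(); ++pos)
        if (match(re, s, pos, patched) >= 0)
            return true;
    return false;
}

void checkToyModel() {
    const ToyRegEx e = toyEndScalar();
    assert(!toyContains(e, "a:", false));  // original check misses the trailing colon
    assert(toyContains(e, "a:", true));    // patched check finds it
    assert(toyContains(e, "a: b", false)); // colon + space works either way
    assert(!toyContains(e, "a:b", true));  // colon inside a word is still allowed
}
```

In this model the original check rejects the OR node as soon as the source is exhausted, so the empty alternative inside it is never tried; the patched check lets the OR node run and only rejects the character matches.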

Original issue reported on code.google.com by atoms...@gmail.com on 2 Dec 2010 at 10:55

GoogleCodeExporter commented 9 years ago
I accidentally treated the dupe as the actual case, so I'm closing this one as 
a dupe :)

Original comment by jbe...@gmail.com on 3 Dec 2010 at 9:54