jbeder / yaml-cpp

A YAML parser and emitter in C++
MIT License
5.04k stars 1.81k forks source link

Maintaining order of Nodes on saving #169

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
AML::Node config = YAML::LoadFile("config.yaml");
config["lastLogin"] = getCurrentDateTime();
std::ofstream fout("config.yaml");
fout << config;

What is the expected output? What do you see instead?
It would be nice if we could maintain the order of Nodes in 
saving/output-streaming.

What version of the product are you using? On what operating system?
0.3.0 on Ubuntu 11.x

Please provide any additional information below.
While we save/output-stream the order of Nodes in original files were not 
maintained. It would be really nice if we could maintained original order 
because most of the time we also do manual edit on yaml files and grouping some 
variables together makes more sense.

Original issue reported on code.google.com by rudrapou...@gmail.com on 26 Jul 2012 at 11:33

GoogleCodeExporter commented 9 years ago
Issue 187 has been merged into this issue.

Original comment by jbe...@gmail.com on 29 Jan 2013 at 2:22

GoogleCodeExporter commented 9 years ago
This is sorta intentional. YAML doesn't specify the ordering of key/value pairs 
in a map, so you're not supposed to rely on it.

However, I do see some value to it, so I will consider it. I'm just not 
convinced yet :)

Original comment by jbe...@gmail.com on 29 Jan 2013 at 2:23

GoogleCodeExporter commented 9 years ago
The way I see it, yaml-cpp is a tool for parsing files. A file is a (ordered) 
sequence of information. This is especially true with YAML as it is meant to be 
streamable. Therefore a YAML parser should somehow keep this information.

Of course there is the issue of creating a non-standard behavior people might 
start relying on, and the issue of implementation overhead to keep the ordering 
while still allowing fast search for keys.

Your suggestion to turn it into a sequence of single-element maps is not a bad 
one. I'll consider it.

Original comment by oster.t...@gmail.com on 29 Jan 2013 at 7:04

GoogleCodeExporter commented 9 years ago
I appreciate that it's a potentially useful thing to do; but yaml-cpp is a tool 
to parse YAML, which does not specify the ordering of key/value pairs in a map. 
(A related situation: yaml-cpp also does not preserve stray whitespace. It is 
*not* a goal of the library to be able to write a file exactly as it read it.)

The fact that YAML is streamable is a red herring - the YAML spec *explicitly* 
says that key order is an implementation detail 
(http://www.yaml.org/spec/1.2/spec.html#id2765608).

Again, though, this is one of those things that I do see some value in, so I'll 
think about it. One problem is that some people want to maintain order and 
others want it to be ordered alphabetically, and I'm not a huge fan of 
proliferation of configuration details.

Original comment by jbe...@gmail.com on 29 Jan 2013 at 9:08

GoogleCodeExporter commented 9 years ago
It would be great to have order preserved, especially in configuration files. 
So one could provide some common options to be changed from UI and other leave 
to edit manually in configuration file.

Original comment by perch...@gmail.com on 24 Jan 2014 at 6:18

GoogleCodeExporter commented 9 years ago
Currently you can:
1. Save whole doc using Node::Clone
2. During save, check Node::GetMark().line

Original comment by end...@gmail.com on 24 Apr 2014 at 1:47

GoogleCodeExporter commented 9 years ago
Maybe you could offer a non-standard extension of some sort? That would make it 
clear that it's not part of the standard, yet allow you to provide useful 
behavior for those who want it.

Original comment by jwbdec...@gmail.com on 9 Jul 2014 at 7:45

GoogleCodeExporter commented 9 years ago
> Currently you can:
> 1. Save whole doc using Node::Clone
> 2. During save, check Node::GetMark().line

Could you please explain this solution in detail?

Original comment by gim6...@gmail.com on 19 Jul 2014 at 11:19

GoogleCodeExporter commented 9 years ago
There's already a struct Mark that tells you a location in a file.

You can change 
NodeBuilder::OnScalar/NodeBuilder::OnSequenceStart/NodeBuilder::OnMapStart by 
uncommenting "mark" from the parameters and telling "node" to save the "mark".

To tell the node to save the mark, you will need to add a new member variable 
"mark" to node_data and implement getters/setters up the chain (node_ref, 
detail::node, Node).

If you want Clone() to work, you will also need to change NodeEvents::Emit to 
pass in the actual mark instead of constructing an empty mark.

Original comment by henear...@gmail.com on 23 Jul 2014 at 5:19

GoogleCodeExporter commented 9 years ago
Does anyone know how it reorders the nodes? I can't seem to figure out a 
pattern.

Original comment by justin.j...@gmail.com on 19 Feb 2015 at 9:47

GoogleCodeExporter commented 9 years ago
For save nodes in determining order I use Yaml::Emitter. For example:

void Author::customToYaml(YAML::Emitter &out) const
{
    out << YAML::Key << "companyName" << YAML::Value << companyName();
    out << YAML::Key << "companySite" << YAML::Value << companySite();
}

Original comment by gil.il...@gmail.com on 19 Feb 2015 at 9:55

GoogleCodeExporter commented 9 years ago
@justin, #10: it's arbitrary-ish, based on pointer keys.

Original comment by jbe...@gmail.com on 19 Feb 2015 at 3:38

GoogleCodeExporter commented 9 years ago
I also would like more ordered output. I fully get that YAML is using unordered 
maps, but it's very weird when hand-editting YAML files, and each object is in 
a different order.

Example:

YAML::Node root;
for(const auto &pair : this->Details.MaterialDetails)
{
    Engine::MaterialID materialID = pair.first;
    const Engine::MaterialDetails &materialDetails = pair.second;

    YAML::Node material;
    material["MaterialID"] = materialID;
    material["DisplayName"] = materialDetails.displayName;
    material["SoundID"] = materialDetails.soundID;

    root["Materials"].push_back(material);
}

I find it really odd that iterating over elements in my map, each element of 
the map will have its members outputted arbitrarily from one another.

Materials:
  - DisplayName: Fallthrough
    SoundID: 0
    MaterialID: 2
  - SoundID: 0
    DisplayName: Default
    MaterialID: 1
  - MaterialID: 0
    DisplayName: None
    SoundID: 0

Each element is in a different order. I don't expect the elements in any 
particular order, but a consistent order would be nice. Maybe you can output 
mapped values by alphabetical (or numerical) order of the key, or some other 
arbitrary-but-consistent order?

Original comment by JaminThe...@gmail.com on 26 Feb 2015 at 9:35

GoogleCodeExporter commented 9 years ago
It's on the table to output a consistent ordering. It's not on the table to 
iterate in a consistent ordering.

Original comment by jbe...@gmail.com on 26 Feb 2015 at 9:49

GoogleCodeExporter commented 9 years ago
Yes, consistent output is what I'm asking for, not consistent iteration. The 
iteration of 'MaterialDetails' in my example above is an std::unordered_map not 
yaml-cpp, sorry for the confusion.

I was trying to demonstrate that outputting while in any kind of loop still 
produces unordered (but correct) output, despite using the exact same functions 
in the exact same order. Which you're aware of.

Original comment by JaminThe...@gmail.com on 27 Feb 2015 at 6:38

GoogleCodeExporter commented 9 years ago
Hmm, so I tried using the emitter to emit the key and value one by one to get 
the correct order. My code looks like this:

for (YAML::const_iterator it = baseNode.begin(); it != baseNode.end(); ++it) {
        YAML::Node key = it->first;
        YAML::Node value = it->second;

        if (value.IsScalar()) {

            emitter << YAML::Key << key.as< std::string >() << YAML::Value
                    << value.as< std::string >();

        } else {

            // Otherwise a map
            emitter << YAML::Key << key.as< std::string >();
            emitter << YAML::Value << YAML::BeginMap;
            SaveBaseNode(value, emitter);
            emitter << YAML::EndMap;

        }

    }

The output is still not ordered. I'm guessing there is no solution to this...?

Original comment by justin.j...@gmail.com on 2 Mar 2015 at 9:05

GoogleCodeExporter commented 9 years ago
I would be interested in this as well, since it has forced me to use the "old" 
Emitter API. In my mind, YAML should be easily readable, and arbitrary or 
inconsistent ordering hinders that.

That said, the old Emitter API does the trick.

Original comment by Skimm...@gmail.com on 11 Mar 2015 at 5:45

hersh commented 8 years ago

This is a high priority for me. I will have to figure out a workaround before I can proceed.

The config files read and written by yaml-cpp are committed to a git repo, and our company does code reviews of all changes. Because yaml-cpp writes maps in an unpredictable order, every time the config file is written, then entire file appears to change even if no content is different. This will make reviews unfeasible.

For my use case, I just need repeatable ordering, not controllable ordering. I'm imagining a new Emitter manipulator called SortedMaps or something.

hersh commented 8 years ago

Here is my workaround, in case anyone is interested. I call writeYamlOrderedMaps(out, node) instead of out << node.

#include <algorithm>
#include <string>
#include <vector>
#include <yaml-cpp/yaml.h>

// Recursive helper function that does all the work
void writeNode(const YAML::Node& node, YAML::Emitter& emitter)
{
  switch (node.Type())
  {
    case YAML::NodeType::Sequence:
    {
      emitter << YAML::BeginSeq;
      for (size_t i = 0; i < node.size(); i++)
      {
        writeNode(node[i], emitter);
      }
      emitter << YAML::EndSeq;
      break;
    }
    case YAML::NodeType::Map:
    {
      emitter << YAML::BeginMap;

      // First collect all the keys
      std::vector<std::string> keys(node.size());
      int key_it = 0;
      for (YAML::const_iterator it = node.begin(); it != node.end(); ++it)
      {
        keys[key_it++] = it->first.as<std::string>();
      }

      // Then sort them
      std::sort(keys.begin(), keys.end());

      // Then emit all the entries in sorted order.
      for(size_t i = 0; i < keys.size(); i++)
      {
        emitter << YAML::Key;
        emitter << keys[i];
        emitter << YAML::Value;
        writeNode(node[keys[i]], emitter);
      }
      emitter << YAML::EndMap;
      break;
    }
    default:
      emitter << node;
      break;
  }
}

// Main function that emits a yaml node to an output stream.
void writeYamlOrderedMaps(std::ostream& out, const YAML::Node& node)
{
  YAML::Emitter emitter;
  writeNode(node, emitter);
  out << emitter.c_str() << std::endl;
}
ivannaz commented 7 years ago

Keeping the order in the same sensible way we originally write them would be nice.

For instance, from this nice example in http://www.yaml.org/spec/1.2/spec.html

--- !<tag:clarkevans.com,2002:invoice> invoice: 34843 date : 2001-01-23 bill-to: &id001 given : Chris family : Dumars address: lines: | 458 Walkman Dr. Suite #292 city : Royal Oak state : MI postal : 48046 ship-to: *id001 product:

We know that after out << node, what we will really get is something like:

tax : 251.42 date : 2001-01-23 ship-to: *id001 product:

From the YAML page: "YAML is a human friendly data serialization".

This unnatural order removes that "human friendly" part of the data serialization process. In my config file I have some 20 main map entries, each one being itself a map of 5 to 20 entries. The resulting file could be manually edited much faster if order is preserved, just because one knows where to look for the items and the mapping order makes human sense.

Yes, "YAML doesn't specify the ordering of key/value pairs in a map, so you're not supposed to rely on it". Neither C++ ISO/IEC 14882:2011 specify some form of code indentation, yet we rely heavely on it for code readability.

ivannaz commented 7 years ago

ops. Seems the spaces in my last comment were eaten away somehow. Sorry for that.

ibaned commented 7 years ago

does #386 and in particular f0b15cd have any effect on this ?

c4pQ commented 5 years ago

668 introduces the desired behaviour.