jbeder / yaml-cpp

A YAML parser and emitter in C++
MIT License
4.91k stars 1.77k forks source link

[Question&Proposition] What is the best way to edit nested structures in place YAML::Node ? #1266

Closed goutnet closed 4 months ago

goutnet commented 5 months ago

Hi,

I am trying to implement a merge method to allow several yaml files to be merged as overlays. It works perfect if the structure is only on depth 1, but when I do the nested version, I realize that the operator[] returns a copy of the node, rather than a reference… therefore it can't be modified in place.

For now, I do a full copy on each merge, which isn't really ideal… but works.

I would like to propose a new YAML::Node::GetNode method to get a rw reference to underlying object, allowing modifying in place:

Node& Node::GetNode(const Key& key)

(and all derivative depending on Key value type)

Would this PR be accepted? or is there a better way to edit nested structures inplace?

jbeder commented 5 months ago

Nodes are reference type objects, even though copies are returned. What code are you trying to do that doesn't work?

goutnet commented 5 months ago

Here is the code that unfortunately does not work (no edit in place is happening):

void merge(YAML::Node base, const YAML::Node& node)
{
    for (auto it = node.begin(); it != node.end(); ++it)
    {
        const std::string& key = it->first.as<std::string>();

        if (base[key].IsDefined())
        {
            if (base[key].IsMap() && it->second.IsMap())
            {
                merge(base[key], it->second);
            }
            else
            {
                base[key] = it->second;
            }
        }
        else
        {
            base[key] = it->second;
        }

    }
}

Did I miss anything?

goutnet commented 5 months ago

to be specific:

base is copied (as object), therefore, the base passed to merge is a new object … not a ref one… and all merge happens on this one…

but that's only a local …

If I try to pass a YAML::Node& base instead ,then the recursive call will fail, because base[key] will return a temporary object (not a ref).

jbeder commented 5 months ago

Interesting. My instinct is not to add this behavior, since I'm not certain it wouldn't have other negative effects. Honestly, I would have expected your code to work, since you can write foo[a][b] = z - but it's been a long time since I've really looked at this code carefully.

Have you looked at #41? I've rejected that PR but you might get some ideas from them.

goutnet commented 5 months ago

IMO, just adding a reference accessor would allow C++ merge.

41 would be a different thing, as they discuss merge from the YAML language point of view, which is a totally different can of worm ^^;

My question would be merge from the C++ API.

For anybody stumbling with the same issue out there, my current workaround is to re-create the YAML node recursively every time … which works but is horribly slow and memory consuming … (at least it gets the job done for now).

@jbeder would you be ok for me to post a PR just adding the reference accessor like mentioned above?

Node& Node::GetNode(const Key& key)

(please suggest any name you would rather have if that is of any concern). The side effect to the current code base would likely be None… as it is just an additional API…

thanks for your help!

goutnet commented 4 months ago

bump?

jbeder commented 4 months ago

So I just tried your code, and it doesn't seem to do what it says:

YAML::Node s = YAML::Load("a: {u: 1, v: 2}\nb: {x: 14, y: 15}\n");
YAML::Node t = YAML::Load("a: {w: 3}\nb: {y: 15, z: 16}\n");    
merge(s, t);  // defined in https://github.com/jbeder/yaml-cpp/issues/1266#issuecomment-1913634860 
std::cout << s << "\n";

This prints:

a: {u: 1, v: 2, w: 3}
b: {x: 14, y: 15, z: 16}

Isn't this what you want? If not, please write down what you want.

goutnet commented 4 months ago

ah, I think I found my problem… it actually does work in the end ^^;

With that said, would you then be open to a PR to add the merge function as part of YAML::Node ?

jbeder commented 4 months ago

The merge function in this thread? Why?

goutnet commented 4 months ago

I am probably missing something, but think merging 2 trees is more of a common operation.

Right now, one can use the above code to allow it, but I feel like having this as a normal operation supported by the YAML::Node would make more sense.

(I would obviously transform this as a member function of YAML::Node and change the signature, such as)

enum MergePolicy {
   MergeMaps = 0,
   ReplaceMapsContent = 1<<1,
   MergeSequences = 0,
   ReplaceSequenceContent = 1<<2,
};
YAML::Node& YAML::Node::merge(const YAML::Node& src, enum MergePolicy policy=MergeMaps|ReplaceSequencesContent);

Would you be interested in a PR like so?

(The reason I prefer discussing a PR first, is I can see ~45 pending PR on this project, just trying to make sure if merging is an option ;) )

jbeder commented 4 months ago

It's such a small amount of code and for a somewhat niche case, I'd just as soon leave it out.

goutnet commented 4 months ago

roger that.

thanks for the information then.