graehl / yaml-cpp-old

Automatically exported from code.google.com/p/yaml-cpp
MIT License
0 stars 0 forks source link

Conversion operators #150

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. This is a feature request;
2. $ ack operator (in the root directory)
3. Look what operators exists

It would be really handy to define conversion operators for some usual cases 
(int, std::string mainly). It might even be possible to template

Thank you for the great work. Love the new API

Best regards,
Jonatan Olofsson

Original issue reported on code.google.com by jonatan....@gmail.com on 23 Jan 2012 at 9:23

GoogleCodeExporter commented 9 years ago
Something like this

Original comment by jonatan....@gmail.com on 23 Jan 2012 at 10:54

Attachments:

GoogleCodeExporter commented 9 years ago
I'd definitely love to have some conversion operators, but there's a silly 
problem with the way that C++ does user-defined conversions. I want

{{{
if(doc["key"]) {
  // do something
} else {
  // key doesn't exist
}
}}}

to be correct - that is, a node in a boolean context should evaluate to whether 
the node exists. Unfortunately, this means that

1) we can't have a boolean conversion (that would turn "true" into true and 
"false" into false), since that would conflict with the existence conversion

2) even an integer conversion would confuse the compiler, so it would turn a 
node with the integer 0 into one that doesn't exist.

3) if there is a templated conversion function, then it would confuse the 
compiler with objects with multiple single-parameter constructors, e.g. 
std::string (since you can construct it with either a std::string or a const 
char *).

That said, it might be nice to have just an std::string conversion. What do you 
think about that?

Original comment by jbe...@gmail.com on 23 Jan 2012 at 6:13

GoogleCodeExporter commented 9 years ago
That might be a possible trade-off. I personally think that it is far more 
reasonable that if(doc["key"]) evaluates to the variable's boolean 
representation of its value, as I believe most people would expect, and an 
existence check is performed elsewise, as something like doc.has("key").

Your point of multiple constructors are of course valid on variable assignment 
The fix, if problem arises, would be to prefix with (type)doc["key"] or use the 
.as<T>() member function i suppose. My guess is that most cases would "just 
work" however.

We should perhaps make a few tests to make sure that a templated conversion 
operator would yield logical results if applied.

One could perhaps also consider implementing operator() and use that in some 
way to separate the two variants if(doc("key")) and if(doc["key"]) but the 
syntax difference may be to subtle. Better to use doc.has("key") then perhaps..

Original comment by jonatan....@gmail.com on 23 Jan 2012 at 6:53

GoogleCodeExporter commented 9 years ago
Perhaps; the fundamental problem, though, is that YAML::Nodes can exist or not 
exist (and it's not always through the operator[] that this happens).

For example,

YAML::Node doc = YAML::Load("");

will give an "empty" document. How would you test whether the document is valid 
other than

if(doc) { ... }

?

Original comment by jbe...@gmail.com on 23 Jan 2012 at 8:59

GoogleCodeExporter commented 9 years ago
I agree that in that case, if(doc) might be expected to be false after a faulty 
load. However, for an invalid load you already throw an exception. Also, since 
if(doc) implicates an empty document, it would be logical to evaluate it to 
false, since it in nearly all cases is not what we wanted - an empty document 
is so to say invalid. if(doc) is thus correct to yield false in both the 
"invalid" case and the "empty" case.

In my opinion, neither if(doc) or if(doc["key"]) need not differ whether doc is 
"false" or not specified (corresponds to a default value of false), while the 
validity check is left to 1) exceptions and 2) member functions (.has("key"), 
and possibly .valid())

Original comment by jonatan....@gmail.com on 24 Jan 2012 at 7:54

GoogleCodeExporter commented 9 years ago
So you're saying

1) that doc["key"] should throw an exception if the node doesn't have the valid 
key? But then how would

doc["key"] = 5;

work?

2) that doc["key"] should evaluate to false when its either a) nonexistent or 
b) a boolean value of "false"?

Original comment by jbe...@gmail.com on 24 Jan 2012 at 6:22

GoogleCodeExporter commented 9 years ago
Hi,

Sorry for not beeing clear enough in my last post =)

1) Basically the only time i think an exception should be thrown is if there's 
some problem with reading the specified file, e.g. it doesn't exist.

2) Yes, I do find that practical in the syntax context where it belongs. This 
would affect mainly the cases of if(var) and if(doc["var"]) and leave the case 
doc["key"] = 5 to the constructors.

The problem with the templated conversion operator here is, as you say, when 
the element does not previously exist, as it would have to return an element 
constructed with the default constuctor which may or may not exist and may or 
may not evaluate to false. 
The standard case is however to convert to bool in if(doc) and thus it is 
perfectly reasonable to return false if the key does not exist.

What i'm saying is that the if(doc) syntax may very well be used for the 
general case you describe in 2), as long as there is another way to separate 
the two cases, namely the suggested member functions i mentioned.

I have the code i posted above in my branch, but I haven't been able to run any 
tests on it yet, so there may be some unexpected results that I'm not aware of. 
I will get back with the results when I have them.

Best regards,
Jonatan Olofsson

Original comment by jonatan....@gmail.com on 27 Jan 2012 at 5:00

GoogleCodeExporter commented 9 years ago
I'm beginning to think that you're right. I'll think more about this, but I'm 
leaning your way :)

Original comment by jbe...@gmail.com on 18 May 2012 at 6:32

GoogleCodeExporter commented 9 years ago
I used my patch in my development code for a while, and it worked fine. I did, 
once, run into one of the problems that you nailed in your first post (i think 
it was a problem with booleans, unfortunately I can't remember. Keeping .as<>() 
as an alternative/fallback in very special cases might be a solution there).

That said, thank you again for an excellent library. I'm currently finishing up 
my Master's Thesis with great help from this, and have had plenty of other 
projects with it just because I like it so much =)

Original comment by jonatan....@gmail.com on 18 May 2012 at 7:38

GoogleCodeExporter commented 9 years ago

Original comment by jbe...@gmail.com on 19 May 2012 at 9:08

GoogleCodeExporter commented 9 years ago
Hmm, reading the above presence issues, perhaps Node's should act more stl like 
(ie a map<any, any>, list<any>, any).  Existence should be probed with find 
member return an iterator.. this is close to FindValue in the old api but the 
stl chose it so I think it wins points on consistency front... it'll prevent 
the need for complicated things like cached lookups too.  Not too bad to write 
in if statements as long as map::operator bool() is defined to test against 
map::end... operator[] should then deference the node as per stl guidelines but 
you get back an any container, not a value.

I also prefer implicit/explicit cast operations or the operator>> to the .as 
member as I explain below. I think sufficient information is given in 
declaration of types such that one should not be required to use .as/casts 
except in edge cases.  One reason why you don't really want this is if you 
change the declaration type later, you have to remember to update the .as 
member call as well or you could truncate off floating point number for 
instance, but this is a silent error.   So I believe this actually increases 
labor and chance at peculiar error.

Still not sure how to deal with pluggable usertype conversion system. It can't 
handle base types of any sort when used with specializations (annoying with 
eigen fixed size vectors/matrices - much of the things I store in yaml files 
for configuration).  It also doesn't like my own code's mixin system for much 
the same reasons.  So for all that I hoped I'd used the new api for... I simply 
don't now because it's too many conversion definitions for too infrequent of 
use... or in other words it ends up being more verbose unless you're storing 
homogeneous collections.  I store a hodge-podge matrix/vector types that are 
fixed size and fixed type... vector(2|3)(d|f|i) are the most common but I have 
more unfortunately and explicitly maintaining conversion operators for each is 
troublesome and nothing but repeats of the same code... and the only way to 
work without explicit repeats is through the use of macros ... and perhaps 
inheritance with template helpers.  Eventually I'll probably give in and use 
the new api but for now I don't think I'll use the new conversion system.

Original comment by nev...@gmail.com on 2 Jan 2013 at 11:17

GoogleCodeExporter commented 9 years ago
@nevion, thanks for commenting! A couple quick clarifications:

1. I'm not sure exactly what you mean by "Nodes should act more stl-like". The 
main difference, as far as I can see, is that

node["foo"]

doesn't throw an exception (whereas the same call on an STL map does). But not 
throwing an exception is necessary if we want to modify nodes:

node["foo"] = "bar";

should add a key/value pair.

Can you clarify what you mean with some example syntax?

2. Another problem with implicit conversions is that you sometimes get 
unexpected ambiguity errors. For example, suppose that our Node had a templated 
implicit conversion:

class Node {
   template<typename T>
   operator T() const { ... }
};

Which then delegates to some sort of conversion functions like we have now.

This would allow us to do something like:

int x = node["x"];
std::string y = node["y"];

But check out what happens if you try this:

std::string z;
z = node["z"];  // error! std::string::operator= can take either
                // an std::string or a const char *, so the conversion is ambiguous

Obviously this isn't what we intended! I'm not sure a way around problems like 
this.

3. What do you mean by "it can't handle base types of any sort when used with 
specializations"? Can you give some examples with code?

Thanks!

Original comment by jbe...@gmail.com on 2 Jan 2013 at 6:38

GoogleCodeExporter commented 9 years ago
Re 1: http://www.cplusplus.com/reference/map/map/operator%5B%5D/ so

node["foo"] = "bar" will add key/pair foo,bar if it didn't exist or update the 
existing node.  Did I miss something?

As for more stl like... I mean implementing most of the same concepts in a 
compatible way.  Example from old api is lack of find though you had FindValue. 
 Use of first-class iterators instead of pointers is another thing.

Re 2... that is within my expectations, sort of.  It works fine if you say z = 
(std::string) node["z"]. I believe this is how libconfig++ works too (not 
saying that it's right but it is a sample point).  But it still leaves a 
problem for Configuration objects which decouple yaml in code to POD types.  I 
admit still presents me with the same ongoing annoyance.  Is this just a 
question of adding a templated operator=?  Where does fall short if it was 
visited before?  How about operator>> again?  What about the presence both 
operator>> and implicit conversions - one to handle declaration assignments and 
the other for previously declared?  Also I like casting more than "as" even so 
:-)

3) I may have over generalized but reading the definitions of the templates you 
declare a default constructed T on the stack, pass it by reference to 
conversion functions which isn't going to fly for generalizing conversions for 
an inheritance hierarchy.  For Eigen, every type is completely different thanks 
to templates but there is some inheritance going on still which is used 
religiously within the library to make things work without much headache.  If 
you could somehow add a templated conversion operator that itself is a template 
specialization for MatrixBase<Derived> where Derived is what you are 
converting, well I don't think this will fly though... see eigentest.cpp  Also 
sometimes vectors/matrices are fixed size and sometimes they're dynamic.  
Perfectly detectable at compilation time and this can be written once to handle 
all cases...   Plugging the solution into the conversion system is still a head 
scratcher though.

An more general example shown by my mixins objects:
//Pure virtual
class CamInfoBase{
int sensorId;

virtual Vector2d distort(Vector2d p) { return p; };
virtual void _load(const Node &p){ sensorId = p["sensorId"]};
virtual void load(const Node &p){ CamInfoBase::_load(p); };
};

//sometimes this is all you need, like pure simulation
class PolyCalibCamInfo{
typedef Eigen::Matrix<double, 10, 1> vec_10d;
vec_10d x, y;
virtual Vector2d distort(Vector2d p) { return p; };
   //complicated polynomial math ommited;
   return f(x,y,p);
}

virtual void _load(const Node &p){ /* loops to load x and y from p just like 
vec3*/};
virtual void load(const Node &p){ CamInfoBase::_load(p);  
PolyCalibCamInfo::_load(p); };
};

};

class ASCamInfo : public CamInfoBase, public PolyCalibCamInfo{
int board;
int channel;

virtual void _load(const Node &p){ board = p["board"]; channel = p["channel"]; }
virtual void load(const Node &p){ CamInfoBase::_load(p); ASCam::_load(p); };
};

//The following is insufficient to allow use of CamInfo with the Node.as 
conversions... it'd be nice if it was

namespace YAML {
    template<>
        struct convert<CamInfoBase> {
            static Node encode(const CamInfoBase &rhs) {
                Node node;
                camInfo.load(node);
                return node;
            }
            static bool decode(const Node& node, const CamInfoBase& rhs) {
                return camInfoBase.store(node);
            }   
        };  
}      

Perhaps a solution to this is to have an overloaded as member that takes an 
object by reference (I'm starting to get reminded of boost::program_options 
which uses a value wrapper and it's own pluggable conversion system, it also 
uses the .as convention.. but things tend to be simpler there than the 
expectations of yaml-cpp). The Mixins are a detail but it's the inheritance 
that gets you in trouble.

Original comment by nev...@gmail.com on 3 Jan 2013 at 12:38

Attachments:

GoogleCodeExporter commented 9 years ago
Ah, I misspoke for #1. node["foo"] doesn't throw an exception in the STL, but 
it does always creates a key/value pair. In the new API, if you just call 
node["foo"] (e.g., when querying if the value exists), it doesn't create a 
key/value pair.

Sorry for the mistake. In any case, you're right that a find() method that 
returns an iterator would be useful. I made a new feature request, Issue 181.

2. A templated operator= wouldn't work because it would have to be defined on 
the receiving type, not Node. What about, as a compromise, something like this:

int x = foo["x"]();                     // either
string s = foo["s"].as<std::string>();  // works
string t;
t = foo["s"](); // still an ambiguity error, since we can't get around it

In other words, calling operator() on a Node turns it into a type with a 
templated cast operator?

3. It looks like the problem is inheritance. yaml-cpp is designed with value 
types in mind, which don't really play well with inheritance. I don't really 
want to encourage something like

camInfo.load(node)

although you certainly could do that if you wanted. Is your complaint that you 
have to explicitly define a separate convert<> specialization for ASCamInfo? If 
so, I would use a macro (#undef it when you're done to avoid name pollution).

Basically, I think of YAML as data, which is copyable and built via 
composition. If you want to use inheritance in your program, I have no problem 
with that, but yaml-cpp's not going to encourage you :)

Original comment by jbe...@gmail.com on 3 Jan 2013 at 8:33

GoogleCodeExporter commented 9 years ago
Hmm re 1 & 2 I don't think node["foo"] for querying is worth it.  find is good 
enough and the consistent thing to do in c++. "find/has" are the defacto things 
to use with maps in any language so it feels weird to deviate. For "any" types 
(Nodes) they could also have a empty member (again a defacto for these kinds 
containers). With the existing operator (bool) feature gone you can then define 
a templated cast operator for Node so declarations are less awkward.

Really sucks there's no good strategy for an existing declaration. I separate 
configuration io via POD types for decoupling purposes (camInfo's are a more 
complicated technically non POD type - violating POD because I'm trying to keep 
the repeats low in io routines by using inheritance to override the load 
member).  On these truly POD types though I'm going to end up repeating field 
types every line which feels like a regression in capability of yaml-cpp.  Is 
it possible to resurrect operator>> for built in types or use pluggable 
configuration?  This seems to me to solve the declaration/assignment issue in a 
reasonable way with the cast operators.

For 3 ok I can see why you don't want to redirect to load types but it's still 
a crying shame for Eigen types... but I'll spend some time trying it out and 
see how things go.

Original comment by nev...@gmail.com on 4 Jan 2013 at 6:04

GoogleCodeExporter commented 9 years ago
I think I'm coming around to *some* sort of inferred-type function. Maybe 
operator>>, or maybe a named function. Which would you prefer:

int x;
node >> x; // 1
node.load(x); // 2
node.read(x); // 3
node.update(x); // 4

or something else?

In this new incarnation, users wouldn't specialize on this - it would still 
work with the new convert<> templates. Basically, it would be defined like this:

template<typename T>
void operator >> (const Node& node, T& rhs) {
   rhs = node.as<T>();
}

What do you think?

Original comment by jbe...@gmail.com on 5 Jan 2013 at 6:55

GoogleCodeExporter commented 9 years ago
Yes that's basically what I had in mind for existing declarations.  I prefer 
(1) node>>x; 

You save a line per variable (for local variables) if you also allow cast 
operators which take care of new declaration assignments on a single statement. 
 For consistent operation you'd have to drop the existing operator (bool) 
though.  The next best thing would be having *node return something that allows 
these cast operators.   operator() doesn't fit in my opinion.

Original comment by nev...@gmail.com on 5 Jan 2013 at 9:38

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
I have implemented the insertion operator solution as a templated free function:
template <typename T>
inline const Node& operator>>(const Node& lhs, T& rhs) {
  rhs = lhs.as<T>();
  return lhs;
}

This allows using the operator to assign booleans, while still facilitating 
checks for non-existence.  For example:
bool flag;
if(node["flag"]) { // existence check
  node["flag"] >> flag;  // boolean assignment
}

This change, along with updates to the relevant tests is contained in the 
attached patch.

Original comment by hacker.r...@gmail.com on 14 Jul 2014 at 12:12

Attachments:

GoogleCodeExporter commented 9 years ago
Issue 249 has been merged into this issue.

Original comment by jbe...@gmail.com on 24 Jan 2015 at 11:26