Open simonask opened 8 years ago
When it comes to the macro based approach, I am not sure the drawbacks are entirely correct:
Table/row metaphor (rather than "object" oriented).
That is just leftovers of the old terminology. There should be no reason that it could not be made as fully "object" oriented as any other method.
No high-level features (such as what's provided by Object Store).
I am also pretty sure that we could find ways to support features like links, primary keys and other annotations within the framework of macros.
For me the drawbacks are more the unfamiliar syntax (and lack of auto-completion to help you remember it), and the horrible experience of stepping into the macros when debugging (but the same could potentially be said about template based approaches).
Another huge drawback with the macro based approach is that it does not allow users to add their own methods to the objects, forcing them into an Anemic Domain Model :-(
@astigsen Good points — I will admit that I'm subconsciously almost writing off the macro approach at this point, and perhaps I'm therefore not doing justice to its merits. It still seems to me that it's hard to achieve an "object-oriented" model where users can do things they might want (such as define methods on Realm objects, as you mention).
The Java binding has suffered from the "Anemic Domain Model" for a long while (great name by the way!), and it has been a huge source of frustration for their users, so that should be taken into consideration as well.
I think that the reason I dislike macros is a very simple feeling of "now, my code would not be processable by a tool". Of course, a very sophisticated tool would still be able to process it, but knowing that I am not relying on the pure syntax of the language gives me that feeling. Given that, the Property<T> approach looks most promising to me.
Regarding anemic models, I have to say I was quite astonished to see Fowler discouraging the pattern, because I feel that there is a trend towards it, at least in enterprise .NET but also more generally as it looks more like what you would do in a functional language. Rich objects tend to be hard to test and in general look like little programs with global variables. I wonder if he would still agree with his assertions today. I realize, though, that a lot of people work this way and expect to be allowed to do so, so blocking them is a problem.
I wonder if C++ developers are so opinionated about their architecture that they would be happier with a library rather than a framework -- a bunch of functions that they could call from whatever architecture they already have in place, rather than being forced to inherit from something when they already have all of their classes derive from some Actor class or whatever? I know that it would be less magic than the other bindings, but then, C++ developers are a special breed anyway...
I tackled all these issues in OOFILE with similar debates (the arguments were identical 20 years ago). I wrote up a bit of a comparison on our wiki; it is possibly worth looking at, at least the model bit.
My solution to the property issue was to use my own persistent base classes (eg: dbInt) so that I could use operator overloading. That's not as obnoxious to people as you might think if they are very lightweight classes seen as just an API to the storage - people don't necessarily expect real integers when they are persistent.
From observation of other products, yes, I think having a parsing tool like Qt can be a nightmare for maintenance. In particular, parsing binary formats is a massive overhead (I think Poet did this and was always annoying people by being behind compilers).
I think @kristiandupont makes a good argument about people maybe being happier with a library than a framework. One of the things which worked really well for some of the diverse OOFILE users was exactly that - you could use generic calls to get, set, search etc. without using the convenience of the class declarations which gave you type-checked operations.
Some various low-boilerplate options that don't require reflection:
// tuple-style with unnamed fields
class MyObject : public realm::Object<int, float> {
};
MyObject obj = ...;
get<0>(obj) = 5;
float value = get<float>(obj); // would be a compile error if there are multiple float properties
// named tuple using externally-defined property types
REALM_PROPERTY(int, foo);
REALM_PROPERTY(float, bar);
class MyObject : public realm::Object<foo, bar> {
};
MyObject obj = ...;
obj[foo()] = 5;
float value = obj[bar()];
// named tuple using constexpr string hashing
class MyObject : public realm::Object<prop(int, "foo"), prop("bar") = 1.5f> {
};
MyObject obj = ...;
int value = obj[prop("foo")];
// macro to generate the properties rather than the entire class
class MyObject : public realm::Object {
REALM_PROPERTIES(
int, foo,
float, bar)
};
MyObject obj = ...;
int value = obj.get_foo();
Great to see some new suggestions!
@tgoyne As far as I can tell, your second suggestion can also support a statically checked interface, along these lines:
REALM_PROPERTY(int, foo);
class MyObject: public realm::Object<foo> {
};
MyObject obj = ...;
get<foo>(obj) = 5;
I also think something like aliasing a field (assigning different field names in code and in Realm) will be a very common thing for people to do in C++, because most companies use some kind of prefix or suffix for data members that they will not want to put inside the Realm, particularly if they're syncing it to other bindings. I haven't really seen any constexpr string handling that worked across all major compilers without very awkward syntax, but my knowledge could be outdated - do you have any info on this? I suppose at least C++14 is required.
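For reference, a minimal C++14 sketch of compile-time string hashing, in case it helps the discussion. It uses FNV-1a; `fnv1a`, `PropTag`, and `PROP` are made-up names for illustration and not part of any Realm API. It still needs a macro to turn a string literal into a distinct type, and two names could in principle collide on the same hash.

```cpp
#include <cassert>
#include <cstdint>

// FNV-1a over a null-terminated string. The loop needs C++14 relaxed constexpr.
constexpr std::uint64_t fnv1a(const char* s) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    while (*s != '\0') {
        h ^= static_cast<unsigned char>(*s++);
        h *= 1099511628211ULL;                   // FNV-1a 64-bit prime
    }
    return h;
}

// Hypothetical property tag: the type is identified by the hash of its name.
template <std::uint64_t NameHash>
struct PropTag {
    static constexpr std::uint64_t hash = NameHash;
};

// In C++14 a macro is still needed to turn a string literal into a type.
#define PROP(name) PropTag<fnv1a(name)>

// Both checks are evaluated entirely at compile time.
static_assert(PROP("foo")::hash == fnv1a("foo"), "same name, same tag");
static_assert(PROP("foo")::hash != PROP("bar")::hash, "different names, different tags");
```

Because the tag type only carries a hash, the original string is not recoverable from it, so a column name for aliasing would have to be stored separately.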
All of the ideas I threw out would give a fully statically typed interface. In the case of the second, there'd be a different overload of operator[] for each of the properties, and the actual passed-in value wouldn't be used for anything.
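A rough sketch of how tag-dispatched `operator[]` could stay statically typed. This is only one way to do it: it uses a single templated `operator[]` rather than hand-written overloads, and it keeps the values in an in-memory `std::tuple` as a stand-in for real Realm storage. `PropertyTag`, `IndexOf`, and this `Object` template are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>

// Hypothetical property tag: carries the value type and holds no data.
template <typename T>
struct PropertyTag { using value_type = T; };

struct foo : PropertyTag<int> {};
struct bar : PropertyTag<float> {};

// Position of Tag within Tags...; using an unlisted tag fails to compile
// because the primary template is declared but never defined.
template <typename Tag, typename... Tags>
struct IndexOf;
template <typename Tag, typename... Rest>
struct IndexOf<Tag, Tag, Rest...> : std::integral_constant<std::size_t, 0> {};
template <typename Tag, typename Head, typename... Rest>
struct IndexOf<Tag, Head, Rest...>
    : std::integral_constant<std::size_t, 1 + IndexOf<Tag, Rest...>::value> {};

template <typename... Tags>
class Object {
    // Stand-in storage; a real binding would read and write through a row accessor.
    std::tuple<typename Tags::value_type...> values_{};
public:
    // One instantiation per tag type; the tag argument's value is never read.
    template <typename Tag>
    typename Tag::value_type& operator[](Tag) {
        return std::get<IndexOf<Tag, Tags...>::value>(values_);
    }
};

class MyObject : public Object<foo, bar> {};
```

With this, `obj[foo()] = 5;` compiles and yields an `int&`, while `obj[baz()]` for a tag that is not in the list is rejected at compile time.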
Just to clarify how the parent property approach that I use in OOFILE works, and how the pattern can be used here: class declarations look like simple class declarations with our own types instead of native ones, so you don't have to use property syntax to access them. This uses pure compile-time logic that works on any compiler with very conservative C++ (even though we can mandate at least C++11).
The secret is to use (pre-thread) registration by constructor - at the time of constructing a rInt we are inside the constructor of a realm::Object so can connect the two to build up the schema.
// properties using special Realm classes
class MyObject : public realm::Object {
rInt foo;
rFloat bar;
};
MyObject obj = ...;
int value = obj.foo;
The secret is to use (pre-thread) registration by constructor - at the time of constructing a rInt we are inside the constructor of a realm::Object so can connect the two to build up the schema.
That is actually an interesting technique to collect the information needed for introspection. It does seem very timing dependent though. Are we sure that it won't be possible to end up with some corrupted state?
Even if it works, you still need the property names. Maybe it would be an idea to combine it with a simple macro:
class MyObject : public realm::Object {
REALM_PROPERTY(int, foo)
REALM_PROPERTY(float, bar)
};
Looks ugly compared to the above method of using special property types, but might be hard to get that information otherwise?
If you are stashing the information with thread-local stuff it cannot be timing dependent because your property construction is guaranteed to be occurring immediately after that particular class's base realm::Object constructor. Base and member constructors can't be interleaved in a thread.
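To make the ordering argument concrete, here is a small sketch of registration by constructor using a thread-local pointer. It is not OOFILE's or Realm's actual code: `g_current`, `PropertyInfo`, `Property`, and the per-instance `schema` vector are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct PropertyInfo {
    std::string name;
    std::ptrdiff_t offset;  // byte distance of the member from the Object base
};

class Object;

// The object whose members are currently being constructed on this thread.
thread_local Object* g_current = nullptr;

class Object {
public:
    // The base constructor runs before any member of a derived class.
    Object() { g_current = this; }
    std::vector<PropertyInfo> schema;  // kept per instance here, for simplicity
};

template <typename T>
class Property {
public:
    explicit Property(std::string name, T initial = T()) : value_(initial) {
        assert(g_current != nullptr);
        // Register with the object that was most recently entered on this thread.
        std::ptrdiff_t offset = reinterpret_cast<const char*>(this) -
                                reinterpret_cast<const char*>(g_current);
        g_current->schema.push_back({std::move(name), offset});
    }
    operator T() const { return value_; }
    Property& operator=(T v) { value_ = v; return *this; }
private:
    T value_;
};

using rInt = Property<int>;
using rFloat = Property<float>;

class MyObject : public Object {
public:
    rInt foo{"Foo", 0};
    rFloat bar{"Bar", 42.0f};
};
```

One limitation of this sketch: nothing resets `g_current` once the derived object is finished, so a `Property` constructed outside of any `Object` would register itself with a stale pointer.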
The current OOFILE doesn't use thread-local storage but this was never a problem in extremely broad use (19 different compilers, most countries, hundreds of thousands of end users).
You don't need property names ever if you never have arguments needing string names as you can manufacture column names. That's how OOFILE supports people generating dynamic schemas. However yes sometimes it's useful to pass in a name or other settings such as indexed attributes. (Also if you have an old-fashioned backing store you may need character field widths!).
Because these are now classes rather than raw ints you can easily have C++ init lists on them. I need to test what modern syntax would allow but I think something like this would work (qualified answer with too much C# for months):
// properties using special Realm classes
class MyObject : public realm::Object {
rInt foo {"Foo", 0};
rFloat bar {"Bar", 42.0};
};
Also keep in mind that we need to match column names in cross-binding usage scenarios (such as sync). :-)
This also means that things like the table name must be customizable in some way.
Putting the macro inside the class, like
class MyObject: public realm::Object {
REALM_PROPERTY(int, foo);
};
doesn't leave us any opportunity to enumerate the properties of an object before having an actual instance of the object. Perhaps that might be good enough.
@AndyDentFree I'm curious what the exact technique is that you used to connect the field to the object in the member constructor. Do you let each constructor modify global/thread-local information? How do you know when all members have been enumerated?
Perhaps an approach like what V8 does can be employed, where adding a new property causes a new "type" to be generated, and each type contains a map of which new properties lead to which new types. This is particularly well suited for prototype-based languages like JavaScript.
To expand, the way this would work is that every time a realm::Object is instantiated it starts with the "unit type". When a property is detected because a realm::Property member is being constructed, the offset and type of that property are looked up in an internal mapping inside the current type info of the object to see if it already knows about a different type that corresponds exactly to its own members plus the one that is being added. If it doesn't exist, that new type is created. If it does exist, the object has its type changed. And so forth. (In addition to members, other distinguishing features like the table name and the primary key column would have to have the same effect.)
This would need to be completely thread-safe, unless we are fine with objects created on different threads having duplicate runtime type information.
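A minimal sketch of the transition idea described above, assuming a tree of type nodes keyed by (offset, type). `TypeInfo`, `with_property`, and `unit_type` are hypothetical names, and the sketch makes no attempt at thread safety.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <vector>

// One node per distinct sequence of (offset, type) properties seen so far.
struct TypeInfo {
    using Key = std::pair<std::size_t, std::type_index>;

    std::vector<Key> properties;                           // layout discovered so far
    std::map<Key, std::unique_ptr<TypeInfo>> transitions;  // added property -> next type

    // Returns the type reached by adding one property, creating it on first use.
    TypeInfo* with_property(std::size_t offset, std::type_index type) {
        Key key{offset, type};
        auto it = transitions.find(key);
        if (it == transitions.end()) {
            std::unique_ptr<TypeInfo> next(new TypeInfo);
            next->properties = properties;
            next->properties.push_back(key);
            it = transitions.emplace(key, std::move(next)).first;
        }
        return it->second.get();
    }
};

// The "unit type" that every object starts from. Not thread-safe in this sketch.
inline TypeInfo& unit_type() {
    static TypeInfo root;
    return root;
}
```

Two objects that add the same properties in the same order end up pointing at the same `TypeInfo` node, so after the first instance of a class the lookups only follow existing transitions.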
@AndyDentFree The technique of using thread local storage to have a pointer to the base class so that it can be obtained secretly by the constructor for a property may break down, when the initialization of that property can in itself imply initialization of further objects. Something I guess could be a reasonable pattern if the property was a form of link. Or did I misunderstand your approach?
@finnschiermer
may break down, when the initialization of that property can in itself imply initialization of further objects.
OOFILE doesn't support nesting objects which I think you're pointing out would be a failure of this paradigm. (I suspect the paradigm would still work with some kind of stack and level count but would be a lot more fragile.)
It manages relationships with special properties for managing links similar to Realm. That included ownership so we had cascading deletes with an opt-out mechanism.
I wrote up a small ~100 line proof-of-concept in C++14. Could you read through it and let me know if it more or less matches what you had in mind? (see line 121 for an example of what the object definition would look like to the user) @AndyDentFree
Gist here: https://gist.github.com/simonask/2f22c00437f1fd5161dc
The relevant part for people just watching:
class MyObject: public realm::Object {
realm::Property<int64_t> m_integer = property("my_integer");
realm::Property<std::string> m_string;
};
I have to say, it looks amazing and very "magical" -- with zero macros! I like that! However, it also imposes some runtime overhead on object creation, and I'm wondering how much we could do to eliminate that. My gist makes no attempt to reduce runtime overhead, and has one lookup in an std::map per property per object instance. Observing that 99.9% of all objects will follow identical type transitions, this can probably be optimized to be almost unnoticeable.
A few perspective thoughts rather than thinking about syntax:
note: I don't have time (very tight dotnet guideline) to be too distracted by this stuff so would rather think about it more and reply next week. Lack of more replies does not denote lack of interest! I want to go play with some template ideas when we have the RC featureset for dotnet delivered.
This is a very fascinating discussion! 💯 :woman_technologist:
There are a couple of considerations that need to be taken into account, some of which apply to other bindings as well, and some of which are unique to the C++ binding.
The problem we need to solve is this: How do we give users type-safe and convenient access to objects stored in Realm?
To answer that question, there are several subproblems that need to be answered:
These decisions should be informed by the imagined use cases for Realm C++, which are slightly more diverse given the nature of the apps in which people choose to make use of C++. Realm C++ might have to coexist with existing, diverging implementations of things like reflection (for example, game engines generally have this in some form), and users of Realm C++ are very likely to already have strong opinions about how their memory should be managed.
In the following, I am mostly concerned with the user experience that we provide, and not so much about what our internal reflection APIs would look like. For the relevant scenarios, presume an API that looks similar to C++'s own type_info set of functions, but with added information about struct fields.
Reflection Proposal 1: Macro-based structs
A nearly-sufficient implementation of this already exists in Core in src/realm/table_macros.hpp.
Pros:
However, this approach has several severe drawbacks:
Example:
Reflection Proposal 2: Qt-like Meta Object Compiler
This would go in a different direction, where users would be able to define their classes as usual, and then add a minimal amount of annotation. We would then ship a tool that parses the header files defining these objects and outputs some C++ code providing the guts of the reflection machinery.
A base class realm::Object containing the row accessor is presumed.
Pros:
Cons:
Example:
Reflection Proposal 3: Pure Template Based
It is possible to achieve something fairly elegant using only standard C++ primitives (no macros), but it does require a small amount of boilerplate. The upside is that it makes it easy to map existing classes and objects to Realm.
Pros:
Cons:
Example:
Proposal: Achieving Zero-Copy Semantics
In general, the problem is that we can't reliably intercept get/set operations on fields in an unobtrusive way, since C++ has nothing like the "properties" found in other languages. We could introduce a special Property<T> type that can be used to solve this.
The class Property<T> would be defined as such:
When getting and setting the value of the property, we would use the offset_in_object to find the beginning of the encapsulating object ((realm::Object*)((char*)this - offset_in_object)), which would allow us to find both the (dynamic) type of the encapsulating object as well as which property is being accessed, so that we can issue the appropriate function calls into Core.
The realm::Link property implicitly already has the same functionality, so doesn't need to be wrapped in realm::Property.
(This technique would also work in the MOC proposal.)
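As an illustration of the offset trick only, here is a sketch in which the property recovers its enclosing object from its own `this` pointer. It rests on several assumptions: the `Object` stand-in, its `row` map (standing in for calls into Core), and passing `this` to the property's constructor are all hypothetical, and the sketch handles integral types only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <type_traits>

// Stand-in for realm::Object; the map stands in for calls into Core.
class Object {
public:
    std::map<std::ptrdiff_t, std::int64_t> row;  // hypothetical storage, keyed by offset
};

template <typename T>
class Property {
    static_assert(std::is_integral<T>::value, "this sketch handles integral types only");
public:
    // `owner` is used once to compute the offset; the pointer itself is not stored.
    explicit Property(Object* owner)
        : offset_in_object_(reinterpret_cast<char*>(this) -
                            reinterpret_cast<char*>(owner)) {}

    Property& operator=(T v) {
        object()->row[offset_in_object_] = static_cast<std::int64_t>(v);
        return *this;
    }
    operator T() const { return static_cast<T>(object()->row[offset_in_object_]); }

private:
    // Walk back from this member to the start of the enclosing Object.
    Object* object() const {
        return reinterpret_cast<Object*>(
            reinterpret_cast<char*>(const_cast<Property*>(this)) - offset_in_object_);
    }
    std::ptrdiff_t offset_in_object_;
};

class MyObject : public Object {
public:
    Property<std::int64_t> m_integer{this};
    Property<std::int64_t> m_other{this};
};
```

Storing an offset rather than a pointer means a copied or moved object's properties still resolve to their own enclosing object, since the offset is relative to wherever the property lives.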
Pros:
this pointer without any manual step to put the data into Realm.
Cons:
Property members needs to be thought through -- it may be the case that a bit of trickery is necessary to support the expected behavior in user-defined constructors.
Example:
Proposal: Multiple Inheritance
I propose that we do not support it. :-)