Closed N-Coder closed 2 years ago
Yes, storage somewhere is needed. The only alternative that I can think of, is to use an "extended" info, same as done for CPPInstance
. Problem is how to update the outstanding pointers, otherwise both versions (with and without) would be around for a while. One solution would be to capture other little-used or on-demand constructable pieces (e.g. fName
and fDescription
) and put them together with fDoc
in an on-demand extension. Note that if only one is ever used, the pointer to the extension can be the pointer to the current use. Only if both need constructing is an extension needed.
CPPInstance
actually has a __dict__
and CPPExcInstance
would be ephemeral enough not to need __doc__
. In both cases, the better place would be in CPPScope
. CustomInstanceMethod
would likewise have the problem of being ephemeral (CPPOverload
and TemplateProxy
differ here: their innards are not only shared for reduced memory usage, but also for settings to survive if assigned on a temporary, e.g. bound, copy). TypedefPointerToClass
is fine to grow a __dict__
: it's an uncommon beast.
My use-case usually doesn't involve memory pressure, so I'd be happy with the simple solution, but I guess a more involved implementation can always be added later on. :sweat_smile:
Do you have an example where TypedefPointerToClass
occurs? Then I could add and test that.
I just created a small test case and found some oddities (scroll up in the gist to see the output):
CPPDataMember
'field' is listed and accessible in __dict__
, but not accessible via inst.field
or getattr(inst, "field")
TemplateProxy
methods start out with the docstring "cppyy template proxy (internal)" instead of the signatures of the represented methodsTemplateProxy
s seem to use their __doc__
string as output for __repr__
instead of something like <cppyy.TemplateProxy object at 0x7f1d57a15f00>
(see the dumped __dict__
s at the very begining of log, you need to scroll horizontally to find templ
)help()
prints the __doc__
of TemplateProxy
s twice, but that might be due to the previous point:
| templ = Description of the templ.
| Description of the templ.
Should I try to fix those points, too?
Do you have an example where
TypedefPointerToClass
occurs? Then I could add and test that.
Here's an example (it's a common one for opaque handles):
import cppyy
cppyy.cppdef("""\
namespace doctest {
struct A{};
struct B {
typedef A* A_t;
}; }""")
ns = cppyy.gbl.doctest
a_ptr = ns.B.A_t(0x2a)
assert cppyy.addressof(a_ptr) == 42
And yes, TemplateProxy
is a mess, but that's b/c it can grow (as more methods are instantiated) and it also represents non-templated overloads, specializations, etc. None of these need to have the same signature. And yes __repr__
calls tpp_doc
. Same reason: not much better to print in repr()
as there isn't really a reconstructable state.
Okay, I implemented that and updated the testing example. There are two things I'd still like to see fixed, afterwards this PR should be good to go. Have a look at the example __dict__
:
<class cppyy.gbl.Test at 0x55b79cb89e70> {'__module__': '__main__',
'__dict__': <attribute '__dict__' of 'Test' objects>,
'__weakref__': <attribute '__weakref__' of 'Test' objects>,
'__doc__': None,
'templ': cppyy template proxy (internal),
'__assign__': <cppyy.CPPOverload object at 0x7f7673be1580>,
'__init__': <cppyy.CPPOverload object at 0x7f7673bda840>,
'method': <cppyy.CPPOverload object at 0x7f7673bda8c0>,
'overload': <cppyy.CPPOverload object at 0x7f7673bdd800>,
'field': <cppyy.CPPDataMember object at 0x7f7673bdd830>}
Here, all reprs say <cppyy.Something object at 0x123456789abcdef>
, except for TemplateProxy
, which prints its __doc__
.
I'd unify this so that TemplateProxy
uses the same repr as all the other objects.
As a second point, field
is listed in the __dict__
output, but calling cppyy.gbl.Test.field
will throw a not-found exception. To access the attribute, you need to use cppyy.gbl.Test.__dict__["field"]
, which is weird. I'm not quite sure how to fix this, but it's probably somewhere in meta_getattro
.
The TemplateProxy.__doc__
issue should be fixed now. I couldn't find an easy fix for making the descriptor Test.field
accessible, the full error for that is:
<class cppyy.gbl.Test at 0x55e9b851c020> has no attribute 'field'. Full details:
attribute access requires an instance
'Test::field' is not a known C++ class
'field' is not a known C++ template
'field' is not a known C++ enum
The reason for this is CPPDataMember::GetAddress
returning a nullpointer when no object is given, the stack trace in this case is:
CPyCppyy::CPPDataMember::GetAddress CPPDataMember.cxx:336
CPyCppyy::pp_get CPPDataMember.cxx:51
type_getattro typeobject.c:3380
CPyCppyy::meta_getattro CPPScope.cxx:309
cfunction_vectorcall_FASTCALL methodobject.c:430
Unfortunately, I don't understand the logic here. @wlav could you change it so that the following code works? Otherwise the PR should be ready.
import cppyy
cppyy.cppdef("""
class Test {
public:
std::vector<int> field;
};
""")
assert cppyy.gbl.Test.field is cppyy.gbl.Test.__dict__["field"]
Unfortunately, I don't understand the logic here. @wlav could you change it so that the following code works?
Yes, I actually ran into the same issue when trying to add a property to fields to set the encoding of char*
and std::string
fields. Other than removing that check, I saw no solution. The check is useful in principle in that it prevents lookups of instance data members on the class (there is a test for that), but I note that there is no such equivalent check of looking up static data members on the metaclass.
Since the other code isn't remotely finished (lots of other changes to converters, which need to know the encoding in the end), I recommend to simply remove the check for now to make progress. I'll figure out my local changes and the changes here later.
I guess removing the check won't fix things easily, as the following code would then have troubles with the object instance being null. For me, if you plan on cleaning that up some time later that's just as well, as the workaround via the __dict__
works in the meantime and I just wanted to make sure that the behaviour is consistent in the end (i.e. whenever you get to merge your converter changes).
From my side, these changes should then be good to go now. But you might want to give them another closer look as I'm not 100% confident I did the CPython refcounting right.
Another thing we might want to think about next to the __doc__
stuff are __annotations__
used for type checking. It should be easy to know the types coming from C++ and this would be a nice mechanism to expose them to type checkers and IDEs. But maybe I should open another issue for this...
From my side, these changes should then be good to go now. But you might want to give them another closer look as I'm not 100% confident I did the CPython refcounting right.
will do as the week opens up later, now that the big conference is done.
As for __annotations__
... I tried, but couldn't figure out a good way to handle overloads. There are two issues that I could not find a solution for:
double, double
and int, int
does not mean that double, int
is allowed)double
v.s. double, double
; really a variation of 1., but actually worse if an IDE would insist on an extra parameter).Any idea how to tackle these?
As-is, the only place where __annotations__
is used, is in template resolution. If a templated function takes a templated function (std::function
or pointer) as argument, e.g. like so:
template<typename R, typename... U, typename... A>
R callT(R (*f)(U...), A&&... args) {
return f(args...);
}
and the actual argument given is a Python function, e.g. like so:
def ann_f2(arg1: 'int', arg2: 'int') -> 'int':
return 3*arg1*arg2
cppyy.gbl.callT(ann_f2, 42, 37)
then the types from the __annotations__
are used to instantiate the template. (Although it would be theoretically possible to figure out from the function body that the arg pack A
is equal to U
, there is truly no way to derive the return type R
otherwise.)
Python unfortunately has no built-in support for function overloading. There is only a @singledispatch
decorator that differentiates based on the type of the first argument, but this decorator doesn't provide any __annotations__
or other runtime typing information about the overloaded implementations. The @overload
decorator can unfortunately only be used in type stub files to tell type-checkers using non-__annotations__
based analysis (i.e. directly parsing the stub file, not via live runtime object inspection) that a function accepts different pairs of argument types. I also couldn't find any other example where this kind of information was exposed during runtime via __annotations__
.
So I see two (mutually non-exclusive) ways of going forward:
__annotations__
in a best-effort kind of way, e.g. throwing overloaded arguments into a Union
and marking arguments not present in all overloads a Optional
. This will be far from perfect, but should already help a lot in the most common cases and where runtime type inspection is used, e.g. in ipython shells and jupyter notebooks.But maybe this is something for another issue and we should try to get the __doc__
changes in first.
Yes, I'm thinking the same: the only problem that my pop up is confusion, so perhaps limiting "best effort" to cases that are clear maybe enough.
I'm going to merge this PR later today and fix up the remaining issue (data members) directly in the code. I find that easier than discussing on a PR. :) I'm really hoping to get my Windows VMWare up and running again so that i can cut a release (last one was from July ...).
See my initial issue here. Regarding your comment:
...I took the easy path and did exactly that to
CPPDataMember
(where I could add the member field directly to the class),TemplateProxy
, andCPPOverload
(where I added the member field to the persistentCPPOverload::MethodInfo_t
/TemplateInfo
object and adapted the__doc__
getter of / added a setter to the enclosing object). By default, the getters return the auto-generated docstring, unless the user overwrote it with a custom value. I guess you need to store the string value somewhere, unless you want to give the objects their whole, customizable__dict__
withtp_dictoffset
, but you'd also need to store that somewhere. I'm unsure whether something similar needs to be done toCustomInstanceMethod
,TypedefPointerToClass
,CPPInstance
, andCPPExcInstance
, as I didn't have a class at hand that wraps those. At least the current changes in this PR seem to cover my use-case from the Bitbucket issue.Regarding the hook for
CPPScope.__getattr__
discussed here, I guess I would also add a PyCallable member toCPPScope
that gets called here in the native implementation ofmeta_getattro
to provide a replacement value if no other alternative could be found.