Closed GoogleCodeExporter closed 9 years ago
This is only my opinion but I dont think we should be proxying integers as
offsets for Pointers and CTypes. This is very confusing as its contrary to the
usual C semantics. Please consider pointer arithmetic for example:
char *a = 0
a + 5 will point to offset 5 but
uint32_t *a=0
a + 5 will point at offset 20!
same goes for structs. If you are going to overload __add__ methods for
pointers, then do it properly and implement the correct pointer semantics.
Equality is also not simple. For example this is undefined according to the C
standard:
char *a=0;
uint32_t *b=0;
a==b -> Undefined!
And same goes for CTypes:
struct foo a=0;
string bar b=0;
struct baz c=0;
a==b -> Undefined;
a==c compares memory content i think.
IMHO We should stick as closely to C semantics as possible.
Original comment by scude...@gmail.com
on 25 May 2012 at 11:06
Yep, that sounds reasonable, so the confusing thing is that Pointers are
NativeTypes, and actually we only dereference them in __getattr__ which
shouldn't be applied to functions like __add__, etc. So I need to figure out
why (long & Pointer) didn't just work. Maybe we haven't defined __rand__?
As to Pointer == CType, we currently don't do anything clever, the CType
doesn't proxy anything and we end up with an exception that doesn't tell anyone
what went wrong. Since people are already trying to use this (for example if
cur.Next == cur), we should look to provide an appropriate exception if we
can...
Original comment by mike.auty@gmail.com
on 26 May 2012 at 7:31
Ok, looking into it further, I've tried recreating the situation John found in
issue 148 and can't reproduce it. Pointers are NativeTypes, which means they
can handle __and__ and __rand__, so I can't figure out how long & Pointer could
ever produce the exception that was reported (I wasn't able to cause the
exception creating my own pointers and trying different operations on them).
As such, I'm not intending to overload the __add__ methods for pointers, so no
change to the existing semantics required there.
As for CTypes, we still need to figure out what to do about handling them.
Equality is currently inherited from BaseObject, so either the offset (since
that's what .v() returns for CTypes) is equal to other, or the class, offset
and vm match. So should we change equality, treat a ctype more as an offset
until its attributes are accessed, or leave it as is and try to make exceptions
easier to understand? Given the way volatility/BaseObject's currently wired,
I'd tend to go with treating it more fully as an offset until you access its
members, but I'd really like to hear everybody else's thoughts on this...
Original comment by mike.auty@gmail.com
on 26 May 2012 at 9:27
Right, so in summary we've currently got two instances of people using cygwin
or windows, on r1794 or r1791 (neither of which had signifcant object model
changes made afterwards) that seem to hit upon a Pointer that for some reason
can't handle being anded with a long (which I think means it doesn't contain a
__rand__ function).
MHL, you seem to have direct contact with someone who can reproduce this, can
you get details of the plugin they're running, whether you can get access to
their image for further analysis and whether you can get them in pdb (probably
with -d -d -d) to gather further information about the pointer comparison
that's being carried out and what's causing it to go wrong?
Possibilities include, I suppose, something that says it's a Pointer object but
for some reason isn't, or a Pointer that's had __rand__ overridden somehow, or
one that didn't properly inherit from NativeType. None of those really explain
what's going on very appropriately... 5:\
Original comment by mike.auty@gmail.com
on 29 May 2012 at 8:15
So this turns out to be a very subtle/sticky issue indeed. Here's how it goes:
0xffffffffffff & pointer
first off tries:
long.__and__(pointer)
which fails (unsurprisingly). It then attempts the following:
pointer.__rand__(long)
pointer.__rand__ has been setup to forward to self.v().__rand__, so we end up
with:
int.__rand__(long)
Which raises NotImplemented (since that would never get called since
long.__and__(int) would work fine, similarly int.__and__(long) would jump to
long.__rand__(int) which again would work fine). This doesn't exhibit itself
on x64 systems because 0xffffffffffff isn't a long, it's just an int.
So that gives us a few options. We can mask the whole problem by casting
pointer to int explicitly (in which case we do hit long.__and__(int) which
works fine, but allows us to compare pointers to strings and find them
equivalent). We could also flip the ordering (so pointer.__and__(long) becomes
int.__and__(long) which fails and calls long.__rand__(int) which succeeds).
That only fixes it for this particular function, not in general. Finally, we
could ensure that proxied types are always converted to longs if they're ints,
so that we never end up in our reverse test with something that may not
succeed. It's a little ugly, and I'd genuinely like opinions as to whether we
should just take the one-off simple fix of swapping the comparators in this
instance, or the longer term improvement of always returning longs when
proxying a number?
The last fix is the one I've attached as a patch, since it's a little more
involved. Please let me know how it goes/what you think, and we can go from
there. Note, this probably won't change the pointer == Ctype issue, so either
way, this won't solve this bug on its own...
Original comment by mike.auty@gmail.com
on 29 May 2012 at 9:46
Attachments:
I think this patch is very subtle and I can guarantee that in 6 months someone
will look at this code and say - hey why do we enforce this to be a long() its
the same as int() anyway (or gets promoted to it).
I think for readability sake we should have the original line as
0xfffff & int(other)
This tells the reader that we dont care what other is, we are going to treat it
as an integer for this operation. Its basically enforcing an interface here.
The point that Ikelos makes re the promotion of a string to an int is valid
(although IMHO not a big deal), which is why maybe the solution is:
0xfffff & other.__int__()
which solves the two issues - its clear what we are going here, and we dont
have a miscomparison to a string.
Original comment by scude...@gmail.com
on 29 May 2012 at 10:12
It also only fixes this issue for this particular comparison, and I'm not sure
that documenting to developers that every time they happen to have a numeric
value and then an operation that they need to call a.__int__() (but not int(a))
because otherwise they may hit a bug which their system may never exhibit
because their maxint is much larger than their test values. It will be much
easier to explain in a comment to "see issue 265" above the long casts.
Developers may never even see it, they may just consume the output values
without a further thought.
What I'd like to think over is whether there really isn't any further thought
required. I know that outputting directly will give an "L" immediately after
the number, I need to check whether our use of "{}".format() saves us from
seeing that. Can anyone think of any other negative implications of all
proxied values becoming longs?
This problem goes away in python 3 because ints no longer have bounds on them,
so this feels like we're headed in the right direction trying to emulate that...
Original comment by mike.auty@gmail.com
on 30 May 2012 at 7:00
Ok, so per the 2.1 release meeting, we're going to close this out as working
for pointer equality, and pointer == CType equality is going to remain
undefined as to whether it works or not, at least for the forseeable future.
Workarounds include directly verifying the pointer's value against the CType's
obj_offset, or dereferencing the pointer directly before attempting an equality
test...
Original comment by mike.auty@gmail.com
on 6 Jun 2012 at 8:28
Original issue reported on code.google.com by
mike.auty@gmail.com
on 25 May 2012 at 10:20