Open gacholio opened 4 years ago
@dmitripivkine @amicic @mstoodle @youngar @rwy0717
I think this change is important to avoid accidental pointer arithmetic, a mistake I have made in the past.
I personally am not a big fan of defining pointer types like fomrobjectptr_t
, when you can just use fomrobject_t *
. I'm not hung up on this, and we already define types like omrobjectptr_t
.
Is there anything stopping us to jumping right to step 3? If we fix everything in OMR, OpenJ9 can do:
-#if defined(OMR_GC_FULL_POINTERS)
-typedef UDATA fomrobject_t;
-typedef UDATA fomrarray_t;
-#else /* OMR_GC_FULL_POINTERS */
-typedef U_32 fomrobject_t;
-typedef U_32 fomrarray_t;
-#endif /* OMR_GC_FULL_POINTERS */
+struct fomrobject_t;
+struct fomrarray_t;
I believe there are still places that declare and use fomrobject_t
directly, so the type has to remain defined as an integer type - it certainly can't be opaque.
Yeah, I'm seeing that now :(. I'm going to take a quick look to see how hard (or impossible) it is to fix these places in OMR and OpenJ9.
It would mean replacing the non-pointer fomrobject_t
with uintptr_t
I think (which would be fine with me).
That is my conclusion as well, it seems like a clean solution. I think I like this better.
This makes sense to me, and seems like a good change.
Personally, I've always thought the name fomrobject_t
was confusing. Adding a type called fomrobjectptr_t
just adds to the confusion. It's too similarly named to omrobjectptr_t
. In my head, they both sound like pointers to objects. I understand it's beyond the scope of what you're doing, but, it would be nice if we could choose clearer names.
I'll just throw out a suggestion:
fomrobject_t
=> omr_ptr_field
omr_opqaue_fomrobject_t
=> omr_abstract_field
fomrobjectptr_t
=> omr_field_addr
Long term, does it make sense that we would stop using fomrobject_t
directly? I don't see much point in holding on to an encoded field; you would always want to load+decode simultaneously, right? Maybe omr_ptr_field
could be used internally by GC_SlotObject
, but the majority of the GC would work with omrobjectptr_t
or omr_field_addr
's instead.
@rwy0717 There might be a couple places where you don't have to decode a field, for example checking if two pointers are equivalent (OpenJ9's ObjectAccessBarrier.cpp).
Edit: We can still just cast the value in a field to uintptr_t. Maybe we could add a type for documentation:
typedef uintptr_t omrobjecttoken;
Are there in fact any downstream GC implementations other than OpenJ9?
I don't know for sure, perhaps there is a version of SOMpp that is still being maintained, but I'm not aware of any active users.
EDIT: I've been reminded by others (thanks!): there is also t4, and pypy integration in development.
I started to remove instances of fomrobject_t
from OMR, here is what I have so far: https://github.com/eclipse/omr/pull/5033
In order to get this to work with runtime compressed references, I will have to replace sizeof(Slot)
and sizeof(ObjectHeader)
with something runtime. This will probably be a larger refactoring.
Yeah, I haven't updated the example GC for mixed yet.
I have not put much thought on this, but so far i'm 1) in favor of the original proposal 2) not in favor of replacing specific types like fomrobject_t with generic types like uintptr_t
3) somewhat in favor of renaming fomrobject_t (numerous times I stopped for a few seconds asking myself what 'f' stands for)
How about this:
MM_SlotPtr
. This will be a typedef to void *
. You cannot dereference or do arithmetic on void *
values.fomrobject_t *
with MM_SlotPtr
.MM_PtrToken
. This will be a typedef to one of:
uint32_t
(compressed mode)uintptr_t
(full mode)uint64_t
(mixed mode)fomrobject_t
. Local fomrobject_t
values will be replaced with MM_PtrToken
values.You cannot dereference or do arithmetic on `void *` values.
I was under the impression that gcc allowed arithmetic on void*
, so this may result in errors slipping in since we almost always use gcc for PR testing.
Why not leave it as opaque*
?
* `uint64_t` (mixed mode)
uintptr_t
will do for mixed as well.
I was under the impression that gcc allowed arithmetic on void*, so this may result in errors slipping in since we almost always use gcc for PR testing.
Oh, you’re right, it’s only a warning on GCC 5 +.
Why not leave it as opaque*?
Mostly, because void*
is a simpler solution. I’m more than happy to introduce an opaque type, since it will help catch more errors.
uintptr_t
will do for mixed as well.
Yes, uintptr_t
is the same type as uint64_t
in mixed mode. I thought it might be more obvious to explicitly map each mode to a type. If you wrote the definition like this...
#if defined(OMR_FULL_POINTERS)
typedef uintptr_t MM_PtrToken;
#else
typedef uint32_t MM_PtrToken;
#endif
...I’d have to think about it for 1/4 of a second. It's fine, though.
So, with that in mind, new plan:
MM_Slot
. This will be an opaque type.MM_SlotPtr
. This will be typedef'd to MM_Slot *
.fomrobject_t *
with MM_SlotPtr
.MM_PtrToken
. This will be a typedef to one of:
uint32_t
(compressed mode)uintptr_t
(full and mixed mode)fomrobject_t
. Local fomrobject_t
values will be replaced with MM_PtrToken
values.I'm not sure we really need MM_SlotPtr
, since we can just use MM_Slot *
, which is obviously a slot ptr.
MM_
prefix is normally for Memory Managment (GC) code local types. These types would be visible across VM code.
I'm not sure we really need
MM_SlotPtr
, since we can just useMM_Slot *
, which is obviously a slot ptr.
Agreed.
OK, maybe omrobjectslot_t
and omrobjecttoken_t
, to match omrobjectptr_t
?
I personally like omrobjectslot_t over fomrobject_t. 'f' (what stands for 'field'?) at the front moved toward the end and replaced with 'slot'.
I'm not quite following what omrobjecttoken_t would be: is it omrobjectslot_t *, but we want to replace it with a dedicated type?
Then, I prefer omrobjectslotptr_t, effectively what @gacholio suggested, (but like above, 'f' at the front moved toward the end and got replaced with 'slot').
This potential renaming is optional and orthogonal to @gacholio changes. He should proceed with his and then we may or may not decide to rename to (or somehow differently): fomrobject_t -> omrobjectslot_t fomrobjectptr_t -> omrobjectslotptr_t
This potential renaming is optional and orthogonal to @gacholio changes.
Yes I agree.
I'm not quite following what omrobjecttoken_t would be: is it omrobjectslot_t *, but we want to replace it with a dedicated type?
It would be a type thats wide enough to hold an object pointer that has been loaded, but not yet decoded. In places where we stack-allocate a j9object_t, we would instead use this type. After looking at the barrier code in J9, I very much doubt it would be useful.
For now, is it enough to simply change the example's definition of fomrobject_t
to an incomplete struct?
So far I am using omrobjecttoken_t
for one place in the GC, and I could probably get around the issue. There are probably more places that might be appropriate, but we tend to use uintptr_t
for this.
omrobjecttoken_t fixupSlot = GC_SlotObject::convertTokenFromPointer(omrVM, reverseForwardedHeader.getReverseForwardedPointer());
forwardedHeader->restoreDestroyedOverlap((uint32_t)fixupSlot);
https://github.com/eclipse/omr/pull/5033/files#diff-6316fd673edb008a72e33e15e6b18065R211
Was there any discussion of this at the tagged meeting?
No discussion really. Filip mentioned that he thinks there are (many?) instances of fj9object_t
in the Java compiler.
If there's an interest in downstream projects working in mixed mode, I think this (or something) is essential - far better to find problems at compile time than debug the runtime for days.
As the current proposal is a (relatively) simple search and replace, I can't see why having lots of instances would matter.
Would omr_opqaue_fomrobject_t
and fomrobjectptr_t
be glue types (defined by the application), like fomrobject_t
and omrobjectptr_t
currently are? Maybe fomrobjectptr_t
should be defined by the OMR GC.
I think OMR should define them, not the application.
And please tell me you're not considering those horrid names - I already hate the _t
convention - don't make me type 3 _
in a type name...
Background
fomrobject_t
represents the contents of an object-to-object reference slot. In the current single-mode builds, it is typedeffed to eitheruint32_t
(compressed pointers) oruintptr_t
(full pointers).Prior to the runtime compressed refs work, it was legal to have pointers to these slots, perform arithmetic on those pointers, and load from or store to the pointer. It is now no longer legal to perform these operations directly -
GC_SlotObject
provides an interface to abstract these operations.Proposal
Replace
fomrobject_t*
withfomrobjectptr_t
, which will (eventually) be a pointer to an opaque type, which will disallow pointer arithmetic and dereferencing without an explicit cast to the correctly-sized type.This will prevent any new errors being introduced that will only show up in mixed mode.
Step 1
At this point, the types are equivalent, so downstream projects will compile with or without making any changes.
Step 2
Replace all instances of
fomrobject_t *
withfomrobjectptr_t
in OMR and downstream projects.Step 3
At this point, any downstream project that has not made the transition to
fomrobjectptr_t
will no longer compile.