Closed asfimport closed 12 years ago
Dawid Weiss (@dweiss) (migrated from JIRA)
Geez Dawid, you took away all the reasons I originally opened the issue for
This is by no means wasted time. I think the improvements are clear?
Die, GIT, die!
I disagree here – git is a great tool, even if the learning curve may be steep at first. git-svn is a whole different story (it's a great hack but just a hack).
Uwe Schindler (@uschindler) (migrated from JIRA)
I disagree here
Calm down, was just my well-known standard answer :-)
Dawid Weiss (@dweiss) (migrated from JIRA)
Oh, I am calm, I just know people do hate git (and I used to as well, until I started using it frequently). Robert has a strong opinion about git, for example.
Besides, there's nothing wrong in having a strong opinion – it's great people can choose what they like and still collaborate via patches (and this seems to be the common ground between all vcs's).
Shai Erera (@shaie) (migrated from JIRA)
This is by no means wasted time. I think the improvements are clear?
Yes, yes. It was a joke.
Ok so can I proceed with the commit, or does someone intend to review the patch later?
Uwe Schindler (@uschindler) (migrated from JIRA)
With unsafe we also get all those information like size of array header we have hardcoded. Should we not try to get these in the same way like I did for bitness and reference size - using Unsafe.theUnsafe.arrayBaseOffset()? And fallback to our hardcoded defaults?
Dawid Weiss (@dweiss) (migrated from JIRA)
using Unsafe.theUnsafe.arrayBaseOffset()? And fallback to our hardcoded defaults?
+1.
I will also try on OpenJDK with various jits but I'll do it in the evening.
Yes, yes. It was a joke.
Joke or no joke the truth is I did complain a lot. :)
Dawid Weiss (@dweiss) (migrated from JIRA)
I just peeked at OpenJDK sources and addressSize() is defined as this:
// See comment at file start about UNSAFE_LEAF
//UNSAFE_LEAF(jint, Unsafe_AddressSize())
UNSAFE_ENTRY(jint, Unsafe_AddressSize(JNIEnv *env, jobject unsafe))
UnsafeWrapper("Unsafe_AddressSize");
return sizeof(void*);
UNSAFE_END
In this light this switch:
switch (addressSize) {
case 4:
is64Bit = Boolean.FALSE;
break;
case 8:
is64Bit = Boolean.TRUE;
break;
}
Becomes interesting. Do you know of any architecture with pointers different than 4 or 8 bytes? :)
Dawid Weiss (@dweiss) (migrated from JIRA)
A few more exotic jits from OpenJDK (all seem to be using explicit 8 byte ref size on 64-bit:
> ant test-core -Dtestcase=TestRam* -Dtests.verbose=true "-Dargs=-jamvm"
[junit] JVM: OpenJDK Runtime Environment, JamVM, Robert Lougher, 1.6.0-devel, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_23, Sun Microsystems Inc., null,
[junit] NOTE: This JVM is 64bit: true
[junit] NOTE: Reference size in this JVM: 8
> ant test-core -Dtestcase=TestRam* -Dtests.verbose=true "-Dargs=-jamvm -XX:+UseCompressedOops"
[junit] JVM: OpenJDK Runtime Environment, JamVM, Robert Lougher, 1.6.0-devel, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_23, Sun Microsystems Inc., null,
[junit] NOTE: This JVM is 64bit: true
[junit] NOTE: Reference size in this JVM: 8
> ant test-core -Dtestcase=TestRam* -Dtests.verbose=true "-Dargs=-cacao"
[junit] JVM: OpenJDK Runtime Environment, CACAO, CACAOVM - Verein zur Foerderung der freien virtuellen Maschine CACAO, 1.1.0pre2, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_23, Sun Microsystems Inc., null,
[junit] NOTE: This JVM is 64bit: true
[junit] NOTE: Reference size in this JVM: 8
> ant test-core -Dtestcase=TestRam* -Dtests.verbose=true "-Dargs=-server"
[junit] JVM: OpenJDK Runtime Environment, OpenJDK 64-Bit Server VM, Sun Microsystems Inc., 20.0-b11, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_23, Sun Microsystems Inc., null,
[junit] NOTE: This JVM is 64bit: true
[junit] NOTE: Reference size in this JVM: 4
> ant test-core -Dtestcase=TestRam* -Dtests.verbose=true "-Dargs=-server -XX:-UseCompressedOops"
[junit] JVM: OpenJDK Runtime Environment, OpenJDK 64-Bit Server VM, Sun Microsystems Inc., 20.0-b11, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_23, Sun Microsystems Inc., null,
[junit] NOTE: This JVM is 64bit: true
[junit] NOTE: Reference size in this JVM: 8
Dawid Weiss (@dweiss) (migrated from JIRA)
Mac:
> ant test-core -Dtestcase=TestRam* -Dtests.verbose=true
[junit] JVM: Java(TM) SE Runtime Environment, Java HotSpot(TM) 64-Bit Server VM, Apple Inc., 20.4-b02-402, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_29, Apple Inc., null,
[junit] NOTE: This JVM is 64bit: true
[junit] NOTE: Reference size in this JVM: 4
> ant test-core -Dtestcase=TestRam* -Dtests.verbose=true "-Dargs=-server -XX:-UseCompressedOops"
[junit] JVM: Java(TM) SE Runtime Environment, Java HotSpot(TM) 64-Bit Server VM, Apple Inc., 20.4-b02-402, Java Virtual Machine Specification, Sun Microsystems Inc., 1.6.0_29, Apple Inc., null,
[junit] NOTE: This JVM is 64bit: true
[junit] NOTE: Reference size in this JVM: 8
Mark Miller (@markrmiller) (migrated from JIRA)
Nooo!!![ My eyes]( My eyes)!!![ I'm pretty sure my liver has just been virally licensed]( I'm pretty sure my liver has just been virally licensed)
Dawid Weiss (@dweiss) (migrated from JIRA)
Ok, right, sorry, let me scramble for intellectual property protection reasons:
// See cemnmot at flie sratt abuot U_ANEESAFLF
/
/ ULAAFEN_SEF (jnit, UfdAsnerS_zsiaedse ())
UEATERSNFN_Y (jint, UnidsdserSAasfe_ze (JNnEIv * env, jcbjeot unfsae))
UesWrpfapaner (" UdenfsSseAazs_drie ");
rreutn seiozf (void *
;)
UNEF_SNEAD
Uwe Schindler (@uschindler) (migrated from JIRA)
Becomes interesting. Do you know of any architecture with pointers different than 4 or 8 bytes?
When I was writing that code, I was thinking a very long time about: Hm, should I add a "default" case saying:
default:
throw new Error("Lucene does not like architectures with pointer size " + addressSize)
But then I decided: If there is an architecture with a pointer size of 6, does this break Lucene really? Hm, maybe I should have added a comment there:
default:
// this is the philosophical case of Lucene reaching an architecture returning something different here
Uwe Schindler (@uschindler) (migrated from JIRA)
Maybe this for @UweSays
:
default:
throw new Error("Your processor(*) hit me with his " + addressSize + " inch dick");
// (*)Dawid
Dawid Weiss (@dweiss) (migrated from JIRA)
I would throw an exception just so that we can hear about those architectures nobody has ever heard of ;)
Dawid Weiss (@dweiss) (migrated from JIRA)
Uwe Schindler (@uschindler) (migrated from JIRA)
http://en.wikipedia.org/wiki/Quadruple_precision_floating-point_format
Dawid Weiss (@dweiss) (migrated from JIRA)
Yep, but I'm talking about address registers and addressing in general. 48 bit addressing aligning would be inconvenient if you take into account that any index scaling addressing modes would have to do a shift and an addition (*3) instead of just a shift. Interesting stuff.
Uwe Schindler (@uschindler) (migrated from JIRA)
I agree, was just a joke. The comment before was more about suddenly appearing 128 bit architectures. That ones would have an addressSize of 16, still a power of 2 :-)
I will now look into the unsafe array offsets...
Dawid Weiss (@dweiss) (migrated from JIRA)
Nice. All of a sudden you could enumerate all the atoms in the universe :) I love Wolfram Alpha... http://www.wolframalpha.com/input/?i=is+number+of+atoms+in+the+universe+greater+than+2%5E128%3F
Uwe Schindler (@uschindler) (migrated from JIRA)
I played around: Unsafe.arrayBaseOffset always returns 16 on my 64bit JVM, so it seems that NUM_BYTES_ARRAY_HEADER is wrong in our case (we have it as 12). It seems that the JVM aligns the array data to be multiple of 8 bytes on 64 bit machines?
For normal objects, is there a way with unsafe to get the NUM_BYTES_OBJECT_HEADER?
Dawid Weiss (@dweiss) (migrated from JIRA)
For normal objects, is there a way with unsafe to get the NUM_BYTES_OBJECT_HEADER?
I don't know and I don't know if it varies between vendors. As for aligning – I bet this holds for anything, not only arrays. So fields of an object will be reordered and packed on their own boundary but entire themselves will be aligned on machine word boundaries for efficiency. Did you try running with Instrumentation (an agent)? What does it say about object/ array sizes?
Uwe Schindler (@uschindler) (migrated from JIRA)
Interestingly the ARRAY header seems to be much bigger on 64 bit platforms without compact refs, so I have the feeling that somehow thre is still some space needed for an object ref, so the original definition of the size was more correct? https://gist.github.com/2038305
Using the original definition:
public final static int NUM_BYTES_ARRAY_HEADER = NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF;
This looks much more like the above size, aligned to 8 bytes.
Did you try running with Instrumentation (an agent)? What does it say about object/ array sizes?
Have to try out and set this up first.
Uwe Schindler (@uschindler) (migrated from JIRA)
This is also in line with this instumentation page: http://www.javaspecialists.eu/archive/Issue142.html
Which prints:
measureSize(new byte[1000]);
byte[], shallow=1016, deep=1016
measureSize(new boolean[1000]);
boolean[], shallow=1016, deep=1016
Dawid Weiss (@dweiss) (migrated from JIRA)
Array header is still 12 bytes but it is aligned to the next multiple-8 boundary? Looks like it.
Uwe Schindler (@uschindler) (migrated from JIRA)
But how does that explain that with non-compact refs the arrayBaseOffset is 24?
Dawid Weiss (@dweiss) (migrated from JIRA)
Can you check what size does Object[] report vs. for example Integer[]? I think the difference may be because typed arrays need to know the type of their component.
Dawid Weiss (@dweiss) (migrated from JIRA)
We peeked at the forbidden a bit again. The difference 12 vs. 16 bytes is a result of how ordinary object pointers (OOPs) are defined – they are a combination of object header information (oopMark) and class pointer. The class pointer is a compile time union of either a regular pointer or a compact pointer. oopMark is either 4 bytes (32 bit jvms) or 8 bytes (64 bit jvms). So:
64 bit jvm, full oops: 8 + 8 = 16 64 bit jvm, compact oops: 8 + 4 = 12 32 bit jvm: 4 + 4 = 8
Uwe Schindler (@uschindler) (migrated from JIRA)
With the help of Dawid (inspecting forbidden C code g), we checked the actual size and how they are calculated. Based on that I changed the defaults depending on bitness (Object header is 16 on 64 bit without compact refs, array header is 24 on 64 bit).
The attached patch will use the above defaults, but tries to update them using sun.misc.Unsafe. The trick to get the object header from usafe is by declaring a dummy class extending Object with one single field. We are then using unsafe to get the fieldOffset of that field. As Dawid pointed out, the return value is identical to his investigations (8 bytes on 32 bit archs, 16 bytes on 64 bit archs and 12 bytes on compact ref 64bit archs). So RamUsageEstimator was completely wrong in the past for 64 bit architectures.
I also changed the funny switch statement ("the " + adressSize + " inch dick") to assume 64 bits architecture, if addressSize >= 8.
Uwe Schindler (@uschindler) (migrated from JIRA)
I would like to remove the AverageBlabla memotry model. The code inside is simply no longer useful. RamUsageEstimator simply uses the sizes returned by the JVM.
Uwe Schindler (@uschindler) (migrated from JIRA)
Updated patch with the abstract and now useless MemoryModel removed.
Uwe Schindler (@uschindler) (migrated from JIRA)
Robert reminded me that there is also a heavily broken custom memory estimator in MemoryIndex, too. I will look into it, too.
Uwe Schindler (@uschindler) (migrated from JIRA)
Attached is a patch fixing several bugs and more:
What's the reason why this rounding up to 8 bytes was added? I assume this information comes from somewhere, but it was added by Shai without any explanation. Is this not also dependent on the 64bitness if its 8 or 4?
Otherwise patch is ready.
Robert Muir (@rmuir) (migrated from JIRA)
Removed the MemoryIndex VM class and the completely outdated and incorrect estimation there.
thank you!!!
Dawid Weiss (@dweiss) (migrated from JIRA)
Awesome job, Uwe. I think I wasn't right about that alignment of arrays – sizeof(int) should't come up to 8. I will look into this again in the evening, it got me interested. I'll also check out the alignments, so if this patch can wait until tomorrow then we'll be more confident we get the estimates right.
Dawid Weiss (@dweiss) (migrated from JIRA)
This is very interesting indeed.
So, I used the agent hook into a running VM to dump some of the internal diagnostics, including OOP sizes, heap word alignments, etc. Here's a scoop of the results (with client-side indicated sizes on the right):
# 1.7, 64 bit, OOPS compressed (client)
getOopSize: 8 ref size = 4
Address size: 8 array header = 16
Bytes per long: 8 object header = 12
CPU: amd64
HeapOopSize: 4
HeapWordSize: 8
IntSize: 4
getMinObjAlignmentInBytes: 8
getObjectAlignmentInBytes: 8
isCompressedOopsEnabled: true
isLP64: true
# 1.7, 64 bit, full
getOopSize: 8 ref size = 8
Address size: 8 array header = 24
Bytes per long: 8 object header = 16
CPU: amd64
HeapOopSize: 8
HeapWordSize: 8
IntSize: 4
getMinObjAlignmentInBytes: 8
getObjectAlignmentInBytes: 8
isCompressedOopsEnabled: false
isLP64: true
# 1.7, 32 bit
getOopSize: 4 ref size = 4
Address size: 4 array header = 12
Bytes per long: 8 object header = 8
CPU: x86
HeapOopSize: 4
HeapWordSize: 4
IntSize: 4
getMinObjAlignmentInBytes: 8
getObjectAlignmentInBytes: 8
isCompressedOopsEnabled: false
isLP64: false
The question we asked ourselves with Uwe is why an empty array takes 24 bytes without OOP compression (that's object overhead and an int length, so should be 16 + 4 = 20)? The answer seems to be in how base offsets are calculated for arrays – they seem to be enforced on HeapWordSize boundary and this is 8, even with OOP compressed:
// Returns the offset of the first element.
static int base_offset_in_bytes(BasicType type) {
return header_size(type) * HeapWordSize;
}
I'll spare you the detailed code but the rounding to next HeapWordSize multiple seems evident in all cases. What's even more interesting, this "wasted" space is not (and cannot) be used for data so even a single integer pushes the array size to the next available bound:
int[0] = 24
int[1] = 32 ⭐
int[2] = 32
int[3] = 40
Finally, I could not resist to mention that object alignments... are adjustable, at least to 2^n boundaries. So you can also do this:
> java -XX:-UseCompressedOops -XX:ObjectAlignmentInBytes=32 ...
Object = 32
int[0] = 32
int[1] = 32
int[2] = 32
int[3] = 64
Nice, huh? :) I don't think the JVM has been tested heavily for this possibility though because the code hung on me a few times if executed in that mode.
Dawid Weiss (@dweiss) (migrated from JIRA)
He, he, he... this is fun, haven't been playing with Unsafe for a while and forgot how enjoyable this can be.
Unsafe us = ...;
byte [] dummy = {0x11, 0x22, 0x33, 0x44};
int [] dummy2 = {0}; // match length above.
// Change the class of dummy to int[]...
int klazz = us.getInt(dummy2, 8);
us.putInt( dummy, 8, klazz);
// this will be ok.
dummy2 = (int[])(Object) dummy;
// and we can run native int accessors on a byte[] now...
System.out.println("> " + Integer.toHexString(dummy2[0]));
Dawid Weiss (@dweiss) (migrated from JIRA)
I think Yonik once mentioned he wanted a fast hash over byte[] – this could be it (temporarily cast to a long[] and then revert after computations are over). Go for it, Yonik :)
Uwe Schindler (@uschindler) (migrated from JIRA)
Thanks for investigation. The 8 byte object size multiplier is fixed, so the round-up method is fine.
I have been thinking about alignment things. Its a good possibility to get the object size by suming up the field sizes, but it can even be done better.
If unsafe is available and useable, we can simply get the object size (including all headers), by finding the Math.max(field offset + field type length). So the object size is the offset of the last field (with biggest offset) + its size. This value is finally rounded up to multiples of 8.
The attached patch does this.
Dawid Weiss (@dweiss) (migrated from JIRA)
The 8 byte object size multiplier is fixed, so the round-up method is fine.
I don't think it's "fixed" – see the -XX:ObjectAlignmentInBytes=32 above. But the defaults seem to be the same on all systems.
Uwe Schindler (@uschindler) (migrated from JIRA)
I don't think it's "fixed" – see the -XX:ObjectAlignmentInBytes=32 above. But the defaults seem to be the same on all systems.
I would like to have the rounding also dynamic, but this is not possible to find out with Unsafe, at least for this I have no idea :(
Dawid Weiss (@dweiss) (migrated from JIRA)
at least for this I have no idea
The management factory trick mentioned by Kris works for object alignment as well:
package spikes;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.reflect.Method;
import java.util.List;
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
public class ObAlignment
{
private static final String HOTSPOT_BEAN_NAME = "com.sun.management:type=HotSpotDiagnostic";
private static HotSpotDiagnosticMXBean hotspotMBean;
private static HotSpotDiagnosticMXBean getHotSpotMBean() {
if (hotspotMBean == null) {
try {
hotspotMBean = ManagementFactory.newPlatformMXBeanProxy(
ManagementFactory.getPlatformMBeanServer(),
HOTSPOT_BEAN_NAME,
HotSpotDiagnosticMXBean.class);
} catch (IOException e) {
e.printStackTrace();
}
}
return hotspotMBean;
}
public static void main(String [] args)
throws Exception
{
// Just the object alignment.
System.out.println(getHotSpotMBean().getVMOption("ObjectAlignmentInBytes"));
// Everything.
Class<?> fc = Class.forName("sun.management.Flag");
System.out.println(fc);
Method m = fc.getDeclaredMethod("getAllFlags");
m.setAccessible(true);
List<Object> flags = (List<Object>) m.invoke(null);
for (Object f : flags) {
Method dm = f.getClass().getDeclaredMethod("getVMOption");
dm.setAccessible(true);
VMOption option = (VMOption) dm.invoke(f);
System.out.println(option);
}
}
}
I don't think it is of much practical use for now (object alignment seems to be constant everywhere), but we could as well probe it – if it's available why not use it.
I'd also like to add a shallow size method (which wouldn't follow the fields, just return the aligned object size). I'll be able to work on it in the evening though, not sooner.
Uwe Schindler (@uschindler) (migrated from JIRA)
Too funny, I had the same idea while at breakfast and started to implement it when you were writing your comment :-)
I will post patch soon (also with other improvements)!
Uwe Schindler (@uschindler) (migrated from JIRA)
I will add a shallow parameter to the estimate method, we just dont have to dig into, so it's a simple if check.
Uwe Schindler (@uschindler) (migrated from JIRA)
New patch:
Uwe Schindler (@uschindler) (migrated from JIRA)
Minor improvements.
Dawid Weiss (@dweiss) (migrated from JIRA)
The patch looks good. I don't know if decorating IdentityHashMap to be a set adds any overhead... I was also thinking about doing a custom set impl. for this so that we know how much memory we allocate during the checking itself, but it seems to be very specific to what I need, so no worries.
One thing:
(size % NUM_BYTES_OBJECT_ALIGNMENT);
Byte alignment will be a power of 2 (that option to change it even enforces it when you start the JVM) so you can do a bitmask instead of modulo - should be slighly faster.
Uwe Schindler (@uschindler) (migrated from JIRA)
I separated the shallow Object inspection to a static method, which is more cheap (no RamUsageEstimator instance is needed). The static method now only takes a Class<?> parameter and returns the size (an instance is not even needed).
I also added a diagnostic boolean, so you can query RamUsageEstimator, if the used JVM is supported (supports Hotspot diagnostics, sum.misc.Unsafe). If that is not the case, our testcase will print a warning so users cam report back (if they run the tests).
I think this is ready to commit.
Uwe Schindler (@uschindler) (migrated from JIRA)
One thing:
(size % NUM_BYTES_OBJECT_ALIGNMENT);
Byte alignment will be a power of 2 (that option to change it even enforces it when you start the JVM) so you can do a bitmask instead of modulo - should be slighly faster.
I don't think thats really needed here :-) Speed is limited by reflection in most cases and this one calculation should not matter. Also the number is not reported back as power of 2, so I have to calc the log2 first (ok, ntz &co.), but I don't think we should actually limit that to powers of 2. Maybe another vendor has the ultimate answer of 42 for his objects?
Uwe Schindler (@uschindler) (migrated from JIRA)
I don't know if decorating IdentityHashMap to be a set adds any overhead
The whole problem is more that it might happen that the IdentityHashMap takes horrible amounts of memory while inspecting (think of a boxed numbers array like Byte[50000]). Speed is not important, reflection is slow :(
I have no better idea about how to detect duplicates, unfortunately. The old trick from Arrays.deepEquals() to stop when the parameter itsself is seen again, is not enough here.
Uwe Schindler (@uschindler) (migrated from JIRA)
One more improvement: The shallow Class inspection can ignore superclasses, if Unsafe is in use. As additional fields are always added at the end (otherwise casting of classes and later field access would not work inside the JVM), to find the maximum field offset we don't need to go to superclasses.
I want to commit and backport this to 3.x during the weekend.
RamUsageEstimator.NUM_BYTES_ARRAY_HEADER is computed like that: NUM_BYTES_OBJECT_HEADER + NUM_BYTES_INT + NUM_BYTES_OBJECT_REF. The NUM_BYTES_OBJECT_REF part should not be included, at least not according to this page: http://www.javamex.com/tutorials/memory/array_memory_usage.shtml
While on it, I wrote a sizeOf(String) impl, and I wonder how do people feel about including such helper methods in RUE, as static, stateless, methods? It's not perfect, there's some room for improvement I'm sure, here it is:
If people are not against it, I'd like to also add sizeOf(int[] / byte[] / long[] / double[] ... and String[]).
Migrated from LUCENE-3867 by Shai Erera (@shaie), resolved Mar 23 2012 Attachments: LUCENE-3867.patch (versions: 19), LUCENE-3867-3.x.patch, LUCENE-3867-compressedOops.patch