peter-lawrey / Java-Thread-Affinity

Control thread affinity for Java

ArrayIndexOutOfBoundsException during #19

Closed akerbos closed 12 years ago

akerbos commented 12 years ago

Under certain circumstances, AffinityStrategy#matches is called with -1 in one of the parameters. This causes cpuLayout.socketId to fail.

When this occurred, the native library was missing and I did not have ANY as the last strategy. I do not know whether this is of import.

From looking at the code, it seems that -1 is used to encode "no suitable core found". It is then used as the "last assigned core" and compared with every candidate core, which leads to the error.

Possible fix:

peter-lawrey commented 12 years ago

This should be fixed in the latest update (last night). Can you update and try the latest code?

akerbos commented 12 years ago

I keep getting them.

peter-lawrey commented 12 years ago

Can you include a stack trace or the error message you get?

peter-lawrey commented 12 years ago

I have added a test which attempts to create far more locks than you can have.

AffinityLock al = AffinityLock.acquireLock();
List<AffinityLock> locks = new ArrayList<AffinityLock>();
locks.add(al);
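// Request far more locks than there are CPUs; later requests must fall back gracefully.
for (int i = 0; i < 256; i++) {
    locks.add(al = al.acquireLock(AffinityStrategies.DIFFERENT_SOCKET,
            AffinityStrategies.DIFFERENT_CORE,
            AffinityStrategies.SAME_SOCKET,
            AffinityStrategies.ANY));
}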
for (AffinityLock lock : locks) {
    lock.release();
}

akerbos commented 12 years ago

I create as many threads as I have cores (two, one thread each). This occurs (only) once every few hundred (or so) test runs of my code:

Exception in thread "MyClass.Worker-2" java.lang.ArrayIndexOutOfBoundsException: -1
    at java.util.ArrayList.get(ArrayList.java:324)
    at vanilla.java.affinity.impl.VanillaCpuLayout.socketId(VanillaCpuLayout.java:155)
    at my.code.Util$1.matches(Util.java:49)
    at vanilla.java.affinity.AffinityLock.acquireLock(AffinityLock.java:170)
    at vanilla.java.affinity.AffinityLock.acquireLock(AffinityLock.java:318)
    at vanilla.java.affinity.AffinityThreadFactory$1.run(AffinityThreadFactory.java:52)
    at java.lang.Thread.run(Thread.java:662)

This is the factory use:

new AffinityThreadFactory(this + ".Worker", false, Util.SAME_SOCKET_DIFFERENT_CORE, DIFFERENT_CORE, ANY);

where Util.[...] is of course the piece referenced in the stack trace:

public static final AffinityStrategy SAME_SOCKET_DIFFERENT_CORE = new AffinityStrategy() {
  @Override
  public boolean matches(int cpuId, int cpuId2) {
    CpuLayout cpuLayout = AffinityLock.cpuLayout();
    return    cpuLayout.socketId(cpuId) == cpuLayout.socketId(cpuId2) // Exception thrown here
            && cpuLayout.coreId(cpuId) != cpuLayout.coreId(cpuId2);
  }
};

My workaround is to add (cpuId < 0 || cpuId2 < 0) && [...] to matches.
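
One way to read this workaround is as a guard that runs before any layout lookup, so that a negative id never reaches socketId. The sketch below is only an illustration: `Layout`, `Strategy` and `GuardedStrategyDemo` are stand-ins for the library's `CpuLayout` and `AffinityStrategy` so that it compiles without the library, and the four-CPU layout is invented. Treating a negative id as "no match" is an assumption; the thread does not say which result is intended.

```java
import java.util.Arrays;
import java.util.List;

public class GuardedStrategyDemo {
    /** Hypothetical stand-in for the library's CpuLayout. */
    interface Layout {
        int socketId(int cpuId);
        int coreId(int cpuId);
    }

    /** Hypothetical stand-in for the library's AffinityStrategy. */
    interface Strategy {
        boolean matches(int cpuId, int cpuId2);
    }

    /** Invented layout: four CPUs, two sockets with two cores each. */
    static final Layout LAYOUT = new Layout() {
        private final List<Integer> sockets = Arrays.asList(0, 0, 1, 1);
        private final List<Integer> cores = Arrays.asList(0, 1, 0, 1);

        @Override
        public int socketId(int cpuId) {
            return sockets.get(cpuId); // throws for a negative cpuId
        }

        @Override
        public int coreId(int cpuId) {
            return cores.get(cpuId);
        }
    };

    static final Strategy SAME_SOCKET_DIFFERENT_CORE = new Strategy() {
        @Override
        public boolean matches(int cpuId, int cpuId2) {
            // Guard first: a negative id means "no CPU", so skip the lookups entirely.
            if (cpuId < 0 || cpuId2 < 0) {
                return false; // assumption: "no CPU" never matches
            }
            return LAYOUT.socketId(cpuId) == LAYOUT.socketId(cpuId2)
                    && LAYOUT.coreId(cpuId) != LAYOUT.coreId(cpuId2);
        }
    };

    public static void main(String[] args) {
        System.out.println(SAME_SOCKET_DIFFERENT_CORE.matches(0, 1));  // same socket, different core
        System.out.println(SAME_SOCKET_DIFFERENT_CORE.matches(0, 2));  // different socket
        System.out.println(SAME_SOCKET_DIFFERENT_CORE.matches(-1, 1)); // guarded, no exception
    }
}
```

With the guard returning false, a strategy list would simply move on to its next entry when the previous CPU is unknown.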

peter-lawrey commented 12 years ago

I can see you are using the AffinityThreadFactory and you are trying to create more threads than you have reservable cores.

What is happening is that the last lock failed to find a free CPU, and there is a free CPU now.

I have changed the behaviour so any free cpu is chosen after binding on a failed lock.
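
The behaviour described here can be sketched roughly as follows. This is not the library's code: `FallbackPickDemo`, `Strategy` and `pick` are hypothetical names, and representing the free CPUs as a `BitSet` is an assumption made to keep the example short. The idea is that when the previous lock holds no CPU (id below zero), the strategies are skipped and any free CPU is taken.

```java
import java.util.BitSet;

public class FallbackPickDemo {
    /** Hypothetical stand-in for the library's AffinityStrategy. */
    interface Strategy {
        boolean matches(int cpuId, int cpuId2);
    }

    /** Toy strategy for the demo: any CPU other than the previous one. */
    static final Strategy DIFFERENT_CPU = (cpuId, cpuId2) -> cpuId != cpuId2;

    /**
     * Picks a free CPU relative to lastCpu.
     * Returns the chosen CPU id, or -1 if nothing suitable is free.
     */
    static int pick(BitSet free, int lastCpu, Strategy... strategies) {
        if (lastCpu < 0) {
            // The previous lock failed to bind: take any free CPU
            // instead of comparing candidates against an invalid id.
            return free.nextSetBit(0); // -1 when no bit is set
        }
        for (Strategy strategy : strategies) {
            for (int cpu = free.nextSetBit(0); cpu >= 0; cpu = free.nextSetBit(cpu + 1)) {
                if (strategy.matches(cpu, lastCpu)) {
                    return cpu;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        BitSet free = new BitSet();
        free.set(1);
        free.set(3);
        System.out.println(pick(free, -1, DIFFERENT_CPU)); // previous lock unbound
        System.out.println(pick(free, 1, DIFFERENT_CPU));  // previous lock on CPU 1
    }
}
```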

akerbos commented 12 years ago

Hm. This may have been the reason (creating two threads on a two-core machine), but why do hundreds of runs succeed before one fails? Hope you got it fixed now. :)

peter-lawrey commented 12 years ago

Your threads could be very short lived for some reason. Even though you are creating hundreds of tasks this may only result in a couple of threads being created.

akerbos commented 12 years ago

I definitely call AffinityThreadFactory#newThread hundreds of times. I do not use a thread pool in my tests but call it directly so I know that. I never create/bind more threads than there are cores on the machine at any given time, though.

peter-lawrey commented 12 years ago

I suspect calling newThread too often will result in significant overhead. You are better off allocating threads fairly statically.

akerbos commented 12 years ago

It is a unit test, so I do not worry about efficiency. I do worry about erratically occurring AIOOBExceptions, though.

peter-lawrey commented 12 years ago

That should be corrected. If it still happens, can you add a stack trace?

akerbos commented 12 years ago

Cannot reproduce using 1.5.4-SNAPSHOT from March 6th, 11:30 CET.

peter-lawrey commented 12 years ago

Thank you for all your feedback.