beehive-lab / TornadoVM

TornadoVM: A practical and efficient heterogeneous programming framework for managed languages
https://www.tornadovm.org
Apache License 2.0

exceptions when using @Parallel & @Reduce #331

Open gaoyang-li opened 6 months ago

gaoyang-li commented 6 months ago

Describe the bug

Exception in thread "main" java.lang.ArithmeticException: / by zero
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.tasks.ReduceTaskGraph.calculateAcceleratorGroupSize(ReduceTaskGraph.java:168)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.tasks.ReduceTaskGraph.obtainSizeArrayResult(ReduceTaskGraph.java:135)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.tasks.ReduceTaskGraph.scheduleWithReduction(ReduceTaskGraph.java:476)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.rewriteTaskForReduceSkeleton(TornadoTaskGraph.java:1171)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.reduceAnalysis(TornadoTaskGraph.java:1181)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.analyzeSkeletonAndRun(TornadoTaskGraph.java:1191)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.schedule(TornadoTaskGraph.java:1278)
    at tornado.api@1.0.2-dev/uk.ac.manchester.tornado.api.TaskGraph.execute(TaskGraph.java:767)
    at tornado.api@1.0.2-dev/uk.ac.manchester.tornado.api.ImmutableTaskGraph.execute(ImmutableTaskGraph.java:49)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
    at tornado.api@1.0.2-dev/uk.ac.manchester.tornado.api.TornadoExecutionPlan$TornadoExecutor.execute(TornadoExecutionPlan.java:295)
    at tornado.api@1.0.2-dev/uk.ac.manchester.tornado.api.TornadoExecutionPlan.execute(TornadoExecutionPlan.java:97)
    at tornado.examples@1.0.2-dev/uk.ac.manchester.tornado.examples.rodinia.kmeanstest.TestCase.main(TestCase.java:138)

How To Reproduce

public static void parallel4(IntArray input, @Reduce IntArray data){
        for (@Parallel int i = 0; i < input.getSize(); i++){
            data.set(0, data.get(0) + input.get(i));
        }
}
public static void parallel5(IntArray input, @Reduce IntArray data){
        for (@Parallel int i = 0; i < input.getSize(); i++){
            data.set(0, data.get(0) + 1);
        }
}
public static void parallel6(Matrix2DDouble matrix, IntArray input, @Reduce IntArray data){  // divide by zero error
        for (@Parallel int i = 0; i < input.getSize(); i++){
            data.set(0, data.get(0) + input.get(i));
        }
}
public static void main(String[] args){
        IntArray data = new IntArray(10);
        IntArray input = new IntArray(10);
        Matrix2DDouble matrix = new Matrix2DDouble(10, 10);
        data.init(0);
        input.init(0);

        // Declare the device once; redeclaring it for each task graph does not compile.
        TornadoDevice device = TornadoRuntime.getTornadoRuntime().getDefaultDevice();

        // parallel4: plain sum reduction. This case runs without error.
        TaskGraph taskGraph4 = new TaskGraph("s4")
                .transferToDevice(DataTransferMode.EVERY_EXECUTION, input, data)
                .task("t4", TestCase::parallel4, input, data)
                .transferToHost(DataTransferMode.EVERY_EXECUTION, data);
        ImmutableTaskGraph immutableTaskGraph4 = taskGraph4.snapshot();
        TornadoExecutionPlan executor4 = new TornadoExecutionPlan(immutableTaskGraph4)
                .withDevice(device);
        executor4.execute();

        // parallel5: adds a constant. Fails with TornadoInternalError: unimplemented.
        TaskGraph taskGraph5 = new TaskGraph("s5")
                .transferToDevice(DataTransferMode.EVERY_EXECUTION, input, data)
                .task("t5", TestCase::parallel5, input, data)
                .transferToHost(DataTransferMode.EVERY_EXECUTION, data);
        ImmutableTaskGraph immutableTaskGraph5 = taskGraph5.snapshot();
        TornadoExecutionPlan executor5 = new TornadoExecutionPlan(immutableTaskGraph5)
                .withDevice(device);
        executor5.execute();

        // parallel6: extra Matrix2DDouble parameter. Fails with ArithmeticException: / by zero.
        TaskGraph taskGraph6 = new TaskGraph("s6")
                .transferToDevice(DataTransferMode.EVERY_EXECUTION, matrix, input, data)
                .task("t6", TestCase::parallel6, matrix, input, data)
                .transferToHost(DataTransferMode.EVERY_EXECUTION, data);
        ImmutableTaskGraph immutableTaskGraph6 = taskGraph6.snapshot();
        TornadoExecutionPlan executor6 = new TornadoExecutionPlan(immutableTaskGraph6)
                .withDevice(device);
        executor6.execute();
}

Expected behavior

The reduction works when Matrix2DDouble matrix is not a parameter (method parallel4). I want to pass Matrix2DDouble matrix as a parameter so that I can modify it inside the parallel for loop (method parallel6), but doing so raises the ArithmeticException shown under "Describe the bug". I use Matrix2DDouble as a replacement for double[][].
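
For reference, the results the three methods are expected to produce can be sketched as a sequential plain-Java version. This is an illustration only, not TornadoVM code: int[] stands in for IntArray, and the class and method names (ReduceReference, sumInto, countInto) are made up for this sketch.

```java
public class ReduceReference {
    // Sequential equivalent of parallel4 and parallel6:
    // accumulate every element of input into data[0].
    static void sumInto(int[] input, int[] data) {
        for (int i = 0; i < input.length; i++) {
            data[0] += input[i];
        }
    }

    // Sequential equivalent of parallel5:
    // add the constant 1 to data[0] once per element of input.
    static void countInto(int[] input, int[] data) {
        for (int i = 0; i < input.length; i++) {
            data[0] += 1;
        }
    }

    public static void main(String[] args) {
        int[] input = new int[10]; // all zeros, mirrors input.init(0)
        int[] data = new int[10];  // all zeros, mirrors data.init(0)

        sumInto(input, data);
        System.out.println(data[0]); // prints 0

        countInto(input, data);
        System.out.println(data[0]); // prints 10
    }
}
```

With the zero-initialised inputs from the reproducer, the sum stays at 0 and the count reaches 10. An unused extra parameter such as the matrix in parallel6 should not change either result.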

Additional context

A second question: if I want to add a constant (method parallel5), must I create IntArray a = new IntArray(1); a.set(0, 1) and then use a.get(0) in place of the constant? Here is the exception I get when I execute parallel5.

Exception in thread "main" uk.ac.manchester.tornado.api.exceptions.TornadoInternalError: unimplemented
    at tornado.api@1.0.2-dev/uk.ac.manchester.tornado.api.exceptions.TornadoInternalError.unimplemented(TornadoInternalError.java:29)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.graal.compiler.TornadoSnippetReflectionProvider.forBoxed(TornadoSnippetReflectionProvider.java:43)
    at jdk.internal.vm.compiler/org.graalvm.compiler.replacements.SnippetTemplate.forBoxed(SnippetTemplate.java:1711)
    at jdk.internal.vm.compiler/org.graalvm.compiler.replacements.SnippetTemplate.bind(SnippetTemplate.java:1661)
    at jdk.internal.vm.compiler/org.graalvm.compiler.replacements.SnippetTemplate.instantiate(SnippetTemplate.java:2039)
    at jdk.internal.vm.compiler/org.graalvm.compiler.replacements.SnippetTemplate.instantiate(SnippetTemplate.java:2002)
    at tornado.drivers.ptx@1.0.2-dev/uk.ac.manchester.tornado.drivers.ptx.graal.snippets.PTXGPUReduceSnippets$Templates.lower(PTXGPUReduceSnippets.java:1040)
    at tornado.drivers.ptx@1.0.2-dev/uk.ac.manchester.tornado.drivers.ptx.graal.PTXLoweringProvider.lowerReduceSnippets(PTXLoweringProvider.java:380)
    at tornado.drivers.ptx@1.0.2-dev/uk.ac.manchester.tornado.drivers.ptx.graal.PTXLoweringProvider.lower(PTXLoweringProvider.java:251)
    at jdk.internal.vm.compiler/org.graalvm.compiler.nodes.spi.Lowerable.lower(Lowerable.java:40)
    at jdk.internal.vm.compiler/org.graalvm.compiler.phases.common.LoweringPhase.process(LoweringPhase.java:665)
    at jdk.internal.vm.compiler/org.graalvm.compiler.phases.common.LoweringPhase$ProcessFrame.preprocess(LoweringPhase.java:553)
    at jdk.internal.vm.compiler/org.graalvm.compiler.phases.common.LoweringPhase.processBlock(LoweringPhase.java:764)
    at jdk.internal.vm.compiler/org.graalvm.compiler.phases.common.LoweringPhase.lower(LoweringPhase.java:297)
    at jdk.internal.vm.compiler/org.graalvm.compiler.phases.common.LoweringPhase.run(LoweringPhase.java:271)
    at jdk.internal.vm.compiler/org.graalvm.compiler.phases.common.LoweringPhase.run(LoweringPhase.java:113)
    at jdk.internal.vm.compiler/org.graalvm.compiler.phases.BasePhase.apply(BasePhase.java:434)
    at jdk.internal.vm.compiler/org.graalvm.compiler.phases.BasePhase.apply(BasePhase.java:322)
    at jdk.internal.vm.compiler/org.graalvm.compiler.phases.PhaseSuite.run(PhaseSuite.java:390)
    at jdk.internal.vm.compiler/org.graalvm.compiler.phases.BasePhase.apply(BasePhase.java:434)
    at jdk.internal.vm.compiler/org.graalvm.compiler.phases.BasePhase.apply(BasePhase.java:322)
    at tornado.drivers.ptx@1.0.2-dev/uk.ac.manchester.tornado.drivers.ptx.graal.compiler.PTXCompiler.emitFrontEnd(PTXCompiler.java:235)
    at tornado.drivers.ptx@1.0.2-dev/uk.ac.manchester.tornado.drivers.ptx.graal.compiler.PTXCompiler.compile(PTXCompiler.java:113)
    at tornado.drivers.ptx@1.0.2-dev/uk.ac.manchester.tornado.drivers.ptx.graal.compiler.PTXCompiler$PTXCompilationRequest.execute(PTXCompiler.java:519)
    at tornado.drivers.ptx@1.0.2-dev/uk.ac.manchester.tornado.drivers.ptx.graal.compiler.PTXCompiler.compileSketchForDevice(PTXCompiler.java:297)
    at tornado.drivers.ptx@1.0.2-dev/uk.ac.manchester.tornado.drivers.ptx.runtime.PTXTornadoDevice.compileTask(PTXTornadoDevice.java:183)
    at tornado.drivers.ptx@1.0.2-dev/uk.ac.manchester.tornado.drivers.ptx.runtime.PTXTornadoDevice.installCode(PTXTornadoDevice.java:154)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.interpreter.TornadoVMInterpreter.compileTaskFromBytecodeToBinary(TornadoVMInterpreter.java:621)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.interpreter.TornadoVMInterpreter.execute(TornadoVMInterpreter.java:325)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.interpreter.TornadoVMInterpreter.execute(TornadoVMInterpreter.java:856)
    at java.base/java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:1024)
    at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:762)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.TornadoVM.executeSingleThreaded(TornadoVM.java:118)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.TornadoVM.execute(TornadoVM.java:107)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.scheduleInner(TornadoTaskGraph.java:830)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.schedule(TornadoTaskGraph.java:1297)
    at tornado.api@1.0.2-dev/uk.ac.manchester.tornado.api.TaskGraph.execute(TaskGraph.java:767)
    at tornado.api@1.0.2-dev/uk.ac.manchester.tornado.api.ImmutableTaskGraph.execute(ImmutableTaskGraph.java:49)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
    at tornado.api@1.0.2-dev/uk.ac.manchester.tornado.api.TornadoExecutionPlan$TornadoExecutor.execute(TornadoExecutionPlan.java:295)
    at tornado.api@1.0.2-dev/uk.ac.manchester.tornado.api.TornadoExecutionPlan.execute(TornadoExecutionPlan.java:97)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.tasks.ReduceTaskGraph.executeExpression(ReduceTaskGraph.java:651)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.tasks.ReduceTaskGraph.scheduleWithReduction(ReduceTaskGraph.java:601)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.rewriteTaskForReduceSkeleton(TornadoTaskGraph.java:1171)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.reduceAnalysis(TornadoTaskGraph.java:1181)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.analyzeSkeletonAndRun(TornadoTaskGraph.java:1191)
    at tornado.runtime@1.0.2-dev/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.schedule(TornadoTaskGraph.java:1278)
    at tornado.api@1.0.2-dev/uk.ac.manchester.tornado.api.TaskGraph.execute(TaskGraph.java:767)
    at tornado.api@1.0.2-dev/uk.ac.manchester.tornado.api.ImmutableTaskGraph.execute(ImmutableTaskGraph.java:49)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
    at tornado.api@1.0.2-dev/uk.ac.manchester.tornado.api.TornadoExecutionPlan$TornadoExecutor.execute(TornadoExecutionPlan.java:295)
    at tornado.api@1.0.2-dev/uk.ac.manchester.tornado.api.TornadoExecutionPlan.execute(TornadoExecutionPlan.java:97)
    at tornado.examples@1.0.2-dev/uk.ac.manchester.tornado.examples.rodinia.kmeanstest.TestCase.main(TestCase.java:124) 
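
The one-element-array idea raised under "Additional context" has roughly the following shape. This is a plain-Java sketch with int[] standing in for IntArray, and the names (ConstantHolderSketch, addStep, step) are hypothetical. Nothing in this thread confirms whether TornadoVM accepts the equivalent kernel, so it shows only the intended semantics.

```java
public class ConstantHolderSketch {
    // Same loop shape as parallel5, but the increment is read from a
    // one-element array instead of being written as a literal constant.
    static void addStep(int[] input, int[] step, int[] data) {
        for (int i = 0; i < input.length; i++) {
            data[0] += step[0];
        }
    }

    public static void main(String[] args) {
        int[] input = new int[10]; // only the length is used
        int[] step = {1};          // holds the former constant
        int[] data = new int[1];   // accumulator, starts at 0

        addStep(input, step, data);
        System.out.println(data[0]); // prints 10
    }
}
```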

jjfumero commented 6 months ago

Thank you @Gaoyang123456 for the detailed report. This is clearly a bug. We will take a look.