oracle / graaljs

GraalJS – A high-performance, ECMAScript compliant, and embeddable JavaScript runtime for Java
https://www.graalvm.org/javascript/
Universal Permissive License v1.0

Multithreading is not allowed #30

Closed davidpcaldwell closed 5 years ago

davidpcaldwell commented 6 years ago

Graal (1.0.0-rc4) command:

js --js.nashorn-compat=true --jvm thread.js

Script (thread.js):

var _runnable = new Packages.java.lang.Runnable({
    run: function() {
        print("hello");
    }
});

var _thread = new Packages.java.lang.Thread(_runnable);
_thread.start();
_thread.join();

Graal result:

Exception in thread "Thread-3" java.lang.IllegalStateException: Multi threaded access requested by thread Thread[Thread-3,5,main] but is not allowed for language(s) js.
    at com.oracle.truffle.api.vm.PolyglotContextImpl.throwDeniedThreadAccess(PolyglotContextImpl.java:604)
    at com.oracle.truffle.api.vm.PolyglotContextImpl.checkAllThreadAccesses(PolyglotContextImpl.java:522)
    at com.oracle.truffle.api.vm.PolyglotContextImpl.enterThreadChanged(PolyglotContextImpl.java:443)
    at com.oracle.truffle.api.vm.PolyglotContextImpl.enter(PolyglotContextImpl.java:403)
    at com.oracle.truffle.api.vm.PolyglotValue$PolyglotNode.execute(PolyglotValue.java:542)
    at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.callProxy(OptimizedCallTarget.java:262)
    at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.callRoot(OptimizedCallTarget.java:251)
    at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:241)
    at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:226)
    at org.graalvm.compiler.truffle.runtime.GraalTVMCI.callProfiled(GraalTVMCI.java:86)
    at com.oracle.truffle.api.impl.Accessor.callProfiled(Accessor.java:733)
    at com.oracle.truffle.api.vm.VMAccessor.callProfiled(VMAccessor.java:93)
    at com.oracle.truffle.api.vm.PolyglotValue$Interop.executeVoid(PolyglotValue.java:1405)
    at org.graalvm.polyglot.Value.executeVoid(Value.java:331)
    at com.oracle.truffle.js.javaadapters.java.lang.Runnable.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:748)

Expected result (JVM jjs):

hello

wirthi commented 6 years ago

Hi David,

thanks for your request. Not allowing threads in the way you sketch here is an intentional design decision, as it can lead to unmanageable synchronization issues. You can, however, create multiple JavaScript engines concurrently from different Java threads.

Some documentation on multithreading and the difference to Nashorn is in https://github.com/graalvm/graaljs/blob/master/docs/user/NashornMigrationGuide.md#multithreading and it has already been discussed in previous issues like https://github.com/graalvm/graaljs/issues/16

Best, Christian

davidpcaldwell commented 6 years ago

So, it's possible to start threads from JavaScript, but there's no way to guarantee they will ever complete if we can't invoke join(); without that, I assume the VM will just exit when it reaches the end of the launching script.

Does GraalVM have a way to work around this? Do we need to poll using java.lang.Thread methods and enter a busy-wait loop or something?

woess commented 6 years ago

You can start/join threads, but you can't use a JS Context from two threads at the same time.

chumer commented 6 years ago

If we knew that the start is immediately followed by a join, we could temporarily transfer control of the JS context to the started thread and then transfer it back to the main thread after the join. Currently it is not possible to do that when running with the JavaScript launcher (and I don't really know how to support this in a safe way).

This feature is currently only available for embedders and it works like this:

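import java.util.function.Consumer;
import java.util.function.Supplier;

import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Value;

public class Test {

    public static void main(String[] args) {
        try (Context context = Context.create("js")) {
            context.enter();
            // runInThread(fn): hand the context to a new thread, run fn there, then take it back.
            context.getBindings("js").putMember("runInThread", (Consumer<Object>) (a) -> {
                Value toExecute = context.asValue(a);
                // Leave on the calling thread so that the new thread is allowed to enter.
                context.leave();
                try {
                    Thread thread = new Thread(() -> {
                        context.enter();
                        try {
                            toExecute.execute();
                        } finally {
                            context.leave();
                        }
                    });
                    thread.start();
                    try {
                        // The calling thread stays outside the context until the join returns.
                        thread.join();
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                } finally {
                    // Re-enter on the calling thread before returning to JavaScript.
                    context.enter();
                }
            });
            context.getBindings("js").putMember("currentThread", (Supplier<Thread>) () -> Thread.currentThread());
            context.eval("js", "" +
                            "var globalVar = 42;" +
                            "print('start from ' + currentThread().toString());" +
                            "runInThread(function() {" +
                                "print('accessing from ' + currentThread().toString());" +
                                "print('accessing globalVar ' + globalVar);" +
                            "});");
            context.leave();
        }
    }
}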

This prints for me:

start from Thread[main,5,main]
accessing from Thread[Thread-3,5,main]
accessing globalVar 42

davidpcaldwell commented 6 years ago

The use case here is some Rhino/Nashorn-compatible code that forks a process and then processes and streams stdin, stdout, and stderr. What would be a good way to do this from the js launcher, if any?

(The joins are used to ensure that 1. the invoker doesn't exit before the process ends, 2. the invoker doesn't exit before the process's output has been streamed to its destination.)

chumer commented 6 years ago

We don't have any guarantee that there is no code between start and join that accesses shared objects. This makes the pattern generally unsafe, as JavaScript has no defined semantics for accesses from multiple threads at the same time. Nashorn and Rhino ignored this problem by default.

You can implement the Runnable for the thread in Java, put it on the classpath, and run it in the thread instead.

You can also create an inner JavaScript context that consumes stdin, stdout, stderr.

Lastly, we could introduce an unsafe mode where we ignore thread checks.
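
As a rough illustration of the "implement the Runnable in Java" suggestion above, here is a minimal sketch for the stream-forwarding use case. `StreamPump` is a hypothetical name, not something GraalVM provides. Because it is plain Java, the thread that runs it never enters the JS context, so the script only has to create the thread, start it, and join it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper (not part of GraalVM): copies everything from `in` to `out`
// on whatever thread runs it. No JavaScript is executed here.
// To reach it from a script, it would have to be declared public in its own file
// and placed on the classpath.
final class StreamPump implements Runnable {
    private final InputStream in;
    private final OutputStream out;

    StreamPump(InputStream in, OutputStream out) {
        this.in = in;
        this.out = out;
    }

    @Override
    public void run() {
        byte[] buffer = new byte[8192];
        try {
            int read;
            // read() returns -1 once the source stream is exhausted.
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

From a script, one such pump per stream of the forked process could be wrapped in a java.lang.Thread, started, and joined before the script exits.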

provegard commented 6 years ago

Not supporting multithreading may be a problem for me as well. I'm currently trying to migrate a big Nashorn application which basically does:

  1. Parse a set of JS files using a Nashorn ScriptEngine.
  2. Publish/expose functions contained in the JS files.
  3. Allow multiple "requests" to call into said functions.

This is not a problem since there is no shared state across threads.

I haven't come this far in my migration, but now I'm worried that this will be a showstopper.

I suppose I could create a thread pool of N threads and parse the JS files on each separate thread, though that would result in some startup overhead. Is this the recommended solution?

eleinadani commented 6 years ago

Hi Per,

Graal.js indeed does support multi-threading. The only limitation is that we do not allow threads to execute JS code using the same JS context at the same time (i.e., concurrently). Running distinct Graal.js contexts in parallel is perfectly fine. Here you can find a few tests that showcase some basic example applications that use Graal.js from multiple threads. If I understand your question correctly, the scenario that you are describing should be similar to this test.
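
The "distinct contexts in parallel" idea can be sketched in plain Java as follows. The linked tests are not reproduced in this thread, so this is only an assumption about the general shape: `PerThreadEngine` is a hypothetical helper, and the engine type is left generic so the sketch runs without GraalVM. In a real embedding the factory would presumably create a `Context` and evaluate the JS files in it.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical helper: gives every thread its own lazily created engine instance,
// so that no instance is ever used by two threads.
final class PerThreadEngine<E> {
    private final ThreadLocal<E> engines;

    PerThreadEngine(Supplier<E> factory) {
        // The factory runs once per thread, on that thread's first request.
        this.engines = ThreadLocal.withInitial(factory);
    }

    // Runs a request against the calling thread's own engine.
    <R> R call(Function<E, R> request) {
        return request.apply(engines.get());
    }
}
```

Used from a fixed pool of N threads, this creates at most N engines, each paying the script-parsing cost once. The sketch never closes the engines; a real embedding would need to handle that.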

provegard commented 6 years ago

@eleinadani Thanks for the pointers! I think this one is closer to my use case. I will experiment a bit with synchronization and see where I end up.
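
One way such a synchronization experiment might look is sketched below, under the assumption (taken from the comment above) that only simultaneous use of a context is disallowed. `SerializedEngine` is a hypothetical name, and the engine type is generic so that the sketch runs without GraalVM.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Hypothetical helper: lets many threads share one engine, but only one at a time.
final class SerializedEngine<E> {
    private final E engine;
    private final ReentrantLock lock = new ReentrantLock();

    SerializedEngine(E engine) {
        this.engine = engine;
    }

    // Blocks until no other thread is inside call(), then runs the request.
    <R> R call(Function<E, R> request) {
        lock.lock();
        try {
            return request.apply(engine);
        } finally {
            lock.unlock();
        }
    }
}
```

The trade-off is that requests run one after another, so this avoids the per-thread parsing overhead but gives no parallel speedup.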

ArthurStocker commented 5 years ago

Hi,

I just played around a little bit with Java.extend and found a mysterious behavior with it.

var URLStreamHandlerFactory = Java.type('java.net.URLStreamHandlerFactory');
var cURLStreamHandlerFactory = Java.extend(URLStreamHandlerFactory, {
    createURLStreamHandler: function(protocol) {
        if (protocol == 'test') {
            var URLStreamHandler = Java.type('java.net.URLStreamHandler');
            var cURLStreamHandler = Java.extend(URLStreamHandler, {
                openConnection: function(Url) {
                    var protocol = Url.getProtocol();

                    //console.log(protocol);

                    var URLConnection = Java.type('java.net.URLConnection');
                    var cURLConnection = Java.extend(URLConnection);

                    var Connection = new cURLConnection(
                        Url, {
                            connect: function() {
                                //console.log(Connection_super);
                                console.log('Connected!');
                                return true;
                            },
                            getInputStream: function() {
                                console.log('new InputStream');
                            }
                        }
                    );
                    var Connection_super = Java.super(Connection);
                    return Connection;
                }
            });

            return new cURLStreamHandler();
        }
        return null;
    }
});

var URL = Java.type('java.net.URL');
URL.setURLStreamHandlerFactory(new cURLStreamHandlerFactory());

var url = new URL('test://test.host/test.file');
var connection = url.openConnection();
connection.connect();

If I run this code with node and js from rc10, it crashes with an error caused by concurrent access to the same context from multiple threads. node --jvm --polyglot CustomURLStreamHandlerFactory.js is somehow a bit better than js, as it crashes only 1 out of 4 runs, sometimes 1 out of 10. js crashes right away.

java.util.ServiceConfigurationError: org.graalvm.compiler.options.OptionDescriptors: Error reading configuration file
...
Caused by: java.net.MalformedURLException: Multi threaded access requested by thread Thread[JVMCI CompilerThread2,9,system] but is not allowed for language(s) js.
...
Caused by: com.oracle.truffle.polyglot.PolyglotIllegalStateException: Multi threaded access requested by thread Thread[JVMCI CompilerThread2,9,system] but is not allowed for language(s) js.
...
Connected!

I expected Java.extend would run the extension in separate threads, but it looks like it does not.

How can I overcome this? Is it even possible in the current state of development?

psanders commented 5 years ago

If I understand this correctly, one cannot use a library that internally uses multithreading directly from GraalJS? I'm not sure if this is related, but the following script runs fine with Nashorn and not with GraalJS:

Java.type('spark.Spark').get("/hello", function() { return "Hello World" })
java.lang.Thread.currentThread().join()
//java.lang.Thread.sleep(java.lang.Long.MAX_VALUE)

Could this be related?

davidpcaldwell commented 5 years ago

> You can also create an inner JavaScript context that consumes stdin, stdout, stderr.

How would that look? How would it work? The only way I have noticed so far to create an inner context in JS is by upcalling to Java. But then I still can't figure out how to use that Context in parallel -- if I try to exchange values, I get org.graalvm.polyglot.PolyglotException: java.lang.IllegalArgumentException: The value 'com.oracle.truffle.polyglot.PolyglotLanguageBindings@4b45a2f5' cannot be passed from one context to another. If I try to set up the bindings inside the dedicated thread I get the Multi threaded access ... message.

eleinadani commented 5 years ago

@ArthurStocker @psanders @davidpcaldwell thanks for your comments and questions,

in general, Graal.js does not allow JS values to be accessed concurrently by two or more threads using the same JS Context. The policy is enforced by runtime checks that throw PolyglotIllegalStateException when a violation is detected. This happens for example in the code from @ArthurStocker and @psanders: in both cases, JS functions created in one JS context are called from another thread without leaving the context (e.g., from a thread pool in Java space). To enable concurrent execution, independent, share-nothing contexts should be used, as described in our examples.

Similarly, sharing of objects between different contexts is not allowed. This is why @davidpcaldwell you get an IllegalArgumentException. You can however share Java objects. Here we provide an example where two threads share and access a Java queue in parallel.
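
The linked queue example is not reproduced in this thread. As a stand-in, the sketch below shows only the Java side of the idea: two threads exchanging data through a `java.util.concurrent.BlockingQueue`. In an embedding, each thread would presumably own its own context and receive the queue as a Java object; that part is an assumption and is left out so the code runs without GraalVM. `SharedQueueDemo` and the `DONE` marker are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class SharedQueueDemo {
    // Marker telling the consumer that no more items will arrive.
    // Assumes real items never equal this string.
    private static final String DONE = "<done>";

    static List<String> run(List<String> items) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> received = new ArrayList<>();

        // Producer thread: the only writer of items into the shared queue.
        Thread producer = new Thread(() -> {
            for (String item : items) {
                queue.add(item);
            }
            queue.add(DONE);
        });

        // Consumer thread: blocks in take() until the next item is available.
        Thread consumer = new Thread(() -> {
            try {
                for (String item = queue.take(); !DONE.equals(item); item = queue.take()) {
                    received.add(item);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
        // Safe to read here: join() makes the consumer's writes visible.
        return received;
    }
}
```

The queue is the only shared object, and it is thread-safe by design, so neither thread needs access to the other's JS values.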

davidpcaldwell commented 5 years ago

@eleinadani thank you. I am trying to share Java objects -- trying to create a Java object in one JavaScript Context and then use that Java object in another thread, by (within JS) creating a new Context, using that Context's bindings to specify the Java object as an "argument," and then invoking methods on the Java object accordingly.

Is that possible? So far I haven't been able to achieve it.

davidpcaldwell commented 5 years ago

Here's a script that simplifies and demonstrates what I'm trying to do.

var HashMap = Java.type("java.util.HashMap");
var Runnable = Java.type("java.lang.Runnable");
var System = Java.type("java.lang.System");
var Context = Java.type("org.graalvm.polyglot.Context");

var map = new HashMap();
map.put("foo", "bar");

var context = Context.create("js");
context.getBindings("js").putMember("map", map);
context.getBindings("js").putMember("callback", function() {
    System.err.println("map[bar] = " + map.get("bar"));
});

var runnable = new Runnable({
    run: function() {
        context.eval("js", "map.put('bar', 'baz'); callback();");
    }
});

new Thread(runnable).start();

It fails on line 10, with org.graalvm.polyglot.PolyglotException: java.lang.IllegalArgumentException: The value 'com.oracle.truffle.polyglot.PolyglotLanguageBindings@5fbdfdcf' cannot be passed from one context to another..

Can this approach be made to work? What rule is being violated here?

ArthurStocker commented 5 years ago

@eleinadani am I right that in this case I have to implement the URLStreamHandlerFactory completely in Java, as I cannot control the context leave/enter from within node or Java.extend?

@davidpcaldwell if I understood correctly, assigning your HashMap to map converts the map variable to a polyglot value. This means you have to convert it back to a Java object before you can use it in the bindings. If my assumption is right, you may have an issue similar to mine: you cannot control this conversion from within node or js.

eleinadani commented 5 years ago

We recently published this article, where we summarize the main aspects of Graal.js' threading support. I am therefore closing this issue, but feel free to re-open it if you have further questions.

Cormanz commented 3 years ago

I think at the very least, for the Context API, you should make this some sort of experimental option, as well as for sharing objects between contexts.

kran commented 4 months ago

This is the only reason that stops me from migrating from Nashorn to GraalJS.