[Retracted due to excessive complexity] $200 - Make a JS-target implementation of *.vm.Thread using Web Workers

6J7KZg2f commented 5 years ago

Edit: I am retracting this because of the discussion in the comments, which has convinced me that the best likely outcome is an implementation that requires JavaScript's limitations to be considered from the start of a project, rather than a drop-in solution for adding a JS target to existing projects.

~~Sorry for the Readme edit, couldn't make a PR without a diff...~~

~~Background:~~

Haxe currently has inter-compatible threading support on the Java, HashLink, C++, and Neko targets. It is generally possible to compile the same threaded Haxe program for each of these targets, using only conditional imports such as

#if java
import java.vm.Thread;
#elseif cpp
import cpp.vm.Thread;
#elseif neko
import neko.vm.Thread;
#end

~~However, there are other async-supporting platforms, such as Flash and JS, that do not have this *.vm.Thread structure available.~~

~~This bounty is to add a *.vm.Thread-compatible structure to the JavaScript/HTML5 target, implemented using WebWorkers, such that adding~~

#elseif js
import js.vm.Thread;

~~to the above structure in an otherwise-compatible project will work as expected when compiled for JS target.~~

~~Prerequisites:~~ ~~You're going to need a pretty intimate understanding of both Haxe and JavaScript. I have a feeling Haxe macros will be involved to some degree.~~

~~Out of scope:~~

~~No other class in *.vm is in the scope of this bounty. Things like Deque, Lock, Mutex, and so on, while great, are not in scope.~~
~~There is no performance target, within reason.~~

~~Other targets handle, with different degrees of reliability, shared objects with no mutex or locking. For example, you can do the following on Neko:~~

var foo:String = "s";
Thread.create(function(){
    while(true){
        var n:String = Std.string(Std.int(Math.random() * 10));
        foo = foo + n;
        trace("thread: " + n);
        trace("threads: " + foo);
        Sys.sleep(0.5);
    }
});
while(true){
    var n:String = Std.string(Std.int(Math.random() * 10));
    foo = foo + n;
    trace("mainth: " + n);
    trace("mainths: " + foo);
    Sys.sleep(0.7);
}

~~And it will give:~~

thread: 0
threads: s0
mainth: 9
mainths: s09
thread: 1
threads: s091
mainth: 9
mainths: s0919
thread: 8
threads: s09198
mainth: 6
mainths: s091986
thread: 6
threads: s0919866
...etc

This is very well-behaved, but I don't think anyone uses such techniques in production code, and I believe the sandboxed nature of web workers may make it very difficult to implement, so I am tentatively calling it out of scope for this bounty. That is, the above example must compile and run, but the implementation has no responsibility to synchronise foo between threads, and may yield

thread: 0
threads: s0
mainth: 9
mainths: s9
thread: 1
threads: s01
mainth: 9
mainths: s99
thread: 8
threads: s018
mainth: 6
mainths: s996
thread: 6
threads: s0186
...etc

~~without being considered unacceptable.~~

~~Explicitly in scope:~~

~~All public and public static methods of *.vm.Thread must be implemented - sendMessage(), readMessage(), current(), and create().~~
~~readMessage must handle its block:Bool argument correctly.~~
~~send/readMessage must accept primitives, enum instances, typed objects, and dynamics, just as it does on the other targets.~~
~~create() must accept a regular Void->Void, which can be declared inline.~~
~~trace() must trace to the console regardless of which thread it is called from.~~
~~A thread crashing alone must not crash the rest of the program or the browser. Ideally, exceptions should bubble correctly, but no other target does this properly, so concessions can be made.~~
~~In order to maintain parity, a Sys.sleep(timeInSeconds:Float) method needs to be made available for JS. Since there is no performance requirement, this could just be a spinlock.~~
~~The goal is to have this functionality merged into Haxe as part of the core JS target functionality.~~
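For illustration only, the spinlock idea for Sys.sleep could look roughly like the following in plain JavaScript. The function name `sleep` is an assumption made for this sketch; it is not an existing Haxe or browser API. It simply polls the clock until the deadline passes, so it occupies the calling thread for the whole duration.

```javascript
// Minimal busy-wait sleep, assuming performance is not a concern.
// `sleep` is a hypothetical name used only in this sketch.
function sleep(seconds) {
  const end = Date.now() + seconds * 1000;
  while (Date.now() < end) {
    // spin until the deadline has passed
  }
}

const t0 = Date.now();
sleep(0.05);
const elapsedMs = Date.now() - t0;
console.log("slept for at least 50 ms:", elapsedMs >= 50);
```

Nothing else can run on that thread while the loop spins, which is worth keeping in mind for any blocking call built the same way.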

~~Considerations/prior development/gotchas:~~ ~~Creating web workers in Haxe is already possible. There is an example here:~~ ~~https://gist.github.com/cambiata/be0c2dc499da4be71151~~ ~~However it requires the worker to be compiled separately~~

This example from haxe.org: https://code.haxe.org/category/javascript/javascript-inline-workers.html shows a method for creating inline workers, saving some of the effort that would otherwise be spent ensuring all dependencies are included between scripts and so on. ~~However, a naive implementation would limit a project to only spawning one thread, which would not be acceptable.~~

~~Browser compatibility~~ ~~This should work on all major browsers, but trace need only work correctly on Chrome (I understand there are issues logging to the dev console from web workers on other browsers).~~

~~How much can Skerper help?~~ I am happy to test, do code reviews, and help plan/discuss/consult. I can help out with code on the pure Haxe side, but I am not familiar enough with JS or macro target to be helpful on those fronts. I'm quick to respond and can be reached via email, github, discord, or Line.

~~Timeframe:~~ ~~I'm not familiar enough with javascript or haxe macros to know how long this would take. 'ASAP' would be nice :)~~

~~Budget:~~ I have US$200 set aside for this bounty, which I will pay via PayPal to the person or persons that demonstrate their implementation's compatibility with otherwise-compatible projects. In the event that multiple people choose to work together on it, we can discuss how the payment should be split. A partial solution may qualify for a partial payment.

6J7KZg2f commented 5 years ago

It would probably be beneficial to write some test cases for an eventual implementation, though unit testing is not something I have much experience with.

mikedotalmond commented 5 years ago

It's been a few years since I used it, but as far as I remember this worked well for me, and is another example of a worker implementation that could be used as a starting point/reference for someone working on this issue: https://github.com/Rezmason/Golems

elsassph commented 5 years ago

@Skerper that's quite a challenge, as it requires splitting the compiler output to emit the worker JS code separately. The Haxe compiler can't do that at the moment; however, through hxgenjs or Modular there may be ways to do it without running the compiler through macros.

It would be helpful to provide some test cases with example worker logic resembling real world use cases in order to "demonstrate their implementation's compatibility with otherwise-compatible projects".

elsassph commented 5 years ago

I looked again at the respective APIs and I'm afraid the difference in semantics (readMessage is blocking, which would require something like JS's await to achieve) makes it more suitable to implement a worker-like API (e.g. Golems) using Workers and Threads than the opposite.

6J7KZg2f commented 5 years ago

@elsassph

> @Skerper that's quite a challenge, as it requires splitting the compiler output to emit the worker JS code separately.

Indeed, if it were within my current ability to do, I'd be working on it myself. I assumed that was the likely workflow, but I don't really know where to begin with making Haxe do that.

> It would be helpful to provide some test cases with example worker logic resembling real world use cases in order to "demonstrate their implementation's compatibility with otherwise-compatible projects".

You are correct. I will knock up some test cases.

> I looked again at the respective APIs and I'm afraid the difference in semantics (readMessage is blocking, which would require something like JS's await to achieve) makes it more suitable to implement a worker-like API (e.g. Golems) using Workers and Threads than the opposite.

As I said, I don't mind if readMessage(true) just spinlocks the thread while busy-waiting for a reply. I agree that the web worker API is more suited to something like the Golems implementation, but Haxe currently has

java.vm.Thread
neko.vm.Thread
hl.vm.Thread
cpp.vm.Thread

These are all functionally identical, so JS (and Flash, for what it's worth) is the odd one out among threaded targets. I would rather JS be bent to conform to what appears to be becoming the de facto 'haxelike way' of threading, as opposed to having the rest of the targets conform to JS's idiosyncratic workflow.

elsassph commented 5 years ago

You can make a non-blocking API using a blocking one, but not the opposite :) If you just while-true the worker will not receive any events (and browsers likely won't be happy).
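The starvation problem described above can be reproduced without a worker at all. The sketch below is an illustration under assumptions: `setTimeout` stands in for an incoming worker message, and the 50 ms spin stands in for a hypothetical blocking readMessage(true) that polls in a loop. The queued callback cannot run until the loop gives control back to the event loop.

```javascript
// Sketch: a busy-wait starves the event loop, so queued events are not
// delivered while the loop spins. setTimeout stands in for a worker message.
let delivered = false;
setTimeout(() => { delivered = true; }, 0); // "message" queued for delivery

const spinMs = 50; // arbitrary illustrative duration
const end = Date.now() + spinMs;
while (Date.now() < end) {
  // a hypothetical blocking readMessage(true) would poll for `delivered` here
}

// The callback has had no chance to run yet, so the flag is unchanged.
const deliveredDuringSpin = delivered;
console.log("delivered during spin:", deliveredDuringSpin);
```

The same reasoning applies inside a worker: its message handler is only invoked between tasks, so a loop that never returns never sees the message it is waiting for.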

As I said there may be ways to achieve it by using async/await which hopefully would be nicely supported in browsers - otherwise the exercise of rewriting synchronous code into asynchronous is quite a challenge.
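To make the async/await idea concrete, here is a minimal sketch of a promise-based mailbox in plain JavaScript. The names `Mailbox`, `send`, `read`, and `threadBody` are hypothetical and chosen only for this example; none of them is an existing Haxe or browser API. `read()` returns a promise that resolves once a message is available, so the waiting code is suspended rather than spinning.

```javascript
// Sketch of a promise-based mailbox: read() resolves when a message arrives.
class Mailbox {
  constructor() {
    this.queue = [];   // messages that arrived before anyone was reading
    this.waiters = []; // resolve callbacks of pending read() calls
  }
  send(msg) {
    const waiter = this.waiters.shift();
    if (waiter) waiter(msg);
    else this.queue.push(msg);
  }
  read() {
    if (this.queue.length > 0) return Promise.resolve(this.queue.shift());
    return new Promise((resolve) => this.waiters.push(resolve));
  }
}

const box = new Mailbox();
const seen = [];

async function threadBody() {
  seen.push("hello");
  const msg = await box.read(); // suspends here without blocking the event loop
  seen.push("hello again " + msg);
}

const done = threadBody(); // runs synchronously up to the first await
box.send("A");             // hands the message to the pending read()
done.then(() => console.log(seen));
```

The catch is that every function between the thread entry point and the `read()` call must itself become `async`, which is where rewriting existing synchronous code gets difficult.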

6J7KZg2f commented 5 years ago

@elsassph

> You can make a non-blocking API using a blocking one, but not the opposite :) If you just while-true the worker will not receive any events (and browsers likely won't be happy).

> As I said there may be ways to achieve it by using async/await which hopefully would be nicely supported in browsers - otherwise the exercise of rewriting synchronous code into asynchronous is quite a challenge.

This is not really where I expected the sticking point to be. If I allow the caveat that the argument of Thread.readMessage(block:Bool) must be a compile-time constant, could the blocking issue be solved via the following kind of code generation?

The following Haxe:

Thread.create(function(){
    trace("hello");
    var msg = Thread.readMessage(true);
    trace("hello again " + msg);
    msg = Thread.readMessage(true);
    trace("final hello " + msg);
});

Would generate code in the worker along the lines of this: https://jsfiddle.net/nL46mcqu/

I do not believe that restricting block to a compile-time constant would be a major impact, as I don't believe there are many use cases that involve deciding whether to block at runtime.
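As a rough illustration of this kind of code generation (not the contents of the linked fiddle, which is not reproduced here), the thread body can be cut at each blocking read into a list of continuations, with the message handler running the next piece whenever a message arrives. The names `makeWorkerBody`, `start`, and `onMessage` are hypothetical; in a real worker, `onMessage` would be wired to `self.onmessage`.

```javascript
// Hypothetical hand-written equivalent of what a generator might emit for
// the three-trace example: the body is split at every blocking read.
function makeWorkerBody(log) {
  let step = 0; // index of the next continuation to run
  const steps = [
    () => { log("hello"); },                 // code before the first read
    (msg) => { log("hello again " + msg); }, // code between the two reads
    (msg) => { log("final hello " + msg); }, // code after the second read
  ];
  return {
    start() { steps[step++](); },
    // In a real worker this would be assigned to self.onmessage.
    onMessage(msg) {
      if (step < steps.length) steps[step++](msg);
    },
  };
}

const lines = [];
const body = makeWorkerBody((s) => lines.push(s));
body.start();
body.onMessage("A");
body.onMessage("B");
console.log(lines.join(" | "));
```

This works because the example is one flat function. Reads inside loops, conditionals, or nested calls would each need their own state tracking.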

elsassph commented 5 years ago

With your single flat function example yes it wouldn't be too difficult to rewrite the code using macros, but a generic solution is a hard problem: imagine that with deep conditional code, inside a class instance method used in the worker code (and maybe this class is used in non-worker code!).

That's just how it is: JS has extremely strict restrictions on worker execution and data passing - we didn't even start discussing how intensely painful it is to exchange non-trivial serializable data between a worker and its host 😨
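A small sketch of what the data-passing restriction means in practice. It assumes a runtime that exposes the global `structuredClone` (current browsers, Node 17 or later), which applies the same cloning algorithm that `postMessage` uses. The `Point` class is a made-up example type.

```javascript
// Sketch: what survives the structured clone that postMessage performs.
class Point {
  constructor(x, y) { this.x = x; this.y = y; }
  length() { return Math.hypot(this.x, this.y); }
}

const original = new Point(3, 4);
const copy = structuredClone(original);

const keepsData = copy.x === 3 && copy.y === 4;        // plain fields survive
const keepsClass = copy instanceof Point;              // the prototype does not
const keepsMethod = typeof copy.length === "function"; // so methods are gone

// Functions (and therefore closures) cannot be cloned at all.
let functionError = null;
try {
  structuredClone(() => {});
} catch (e) {
  functionError = e;
}

console.log({ keepsData, keepsClass, keepsMethod, functionRejected: functionError !== null });
```

So typed objects arriving on the other side are plain records, and anything carrying behaviour has to be rebuilt by hand or by a serialisation layer.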

Ultimately that's why people use high-level game engines and don't depend on direct opengl code, and that's why I'm asking about your use cases, because a fully compatible Thread implementation is unreasonable, unless maybe someone wants to spend their CS PhD solving this problem.

Aurel300 commented 5 years ago

I agree with @elsassph here. To support the shared memory concerns:

On the Neko and C++ targets, the programmer has a lot of control over the specific allocation of memory. It is possible to access the same exact memory regions from different threads, which is really why various synchronisation constructs (the other neko.vm.* classes) are necessary.

On Java different threads can also access the same memory, and again, if done carelessly without using synchronisation constructs, this leads to issues.

On JavaScript the situation is a bit different. Haxe compiles to JavaScript nicely because a lot of the types map one-to-one (Int and Float to Number, String to String, arrays to arrays, class instances to objects…). JavaScript is garbage collected, supports anonymous functions, etc etc, so relatively little work is needed to adapt Haxe runtime to the target. Web workers then present a serious problem because the API is not at all similar to Haxe threads.

Web workers cannot share arbitrary JavaScript memory (they are more like processes than threads). The runtime objects, dynamic arrays, etc, are all constrained to a single memory space. If you send them over postMessage, they are not referenced, but cloned – i.e. a modification made in the worker code will not be visible to the host. If you google "web workers shared memory", you'll find that there are these things called shared array buffers. These are indeed possible but in the current state of the Haxe JS target useless to us.
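The cloning behaviour can be sketched in a single script, assuming a runtime with the global `structuredClone` and `SharedArrayBuffer` (for example a recent Node). Here `structuredClone` stands in for the copy that `postMessage` would deliver to a worker, and the variable names are illustrative only.

```javascript
// Sketch: posted objects are copies, so each side mutates its own state.
const hostState = { foo: "s" };
const workerState = structuredClone(hostState); // what the worker would receive

workerState.foo += "0"; // the "thread" appends a digit
hostState.foo += "9";   // the "main thread" appends a digit

// A SharedArrayBuffer, by contrast, is genuinely shared, but it only holds
// raw numbers. Two views over one buffer stand in for host and worker here.
const shared = new SharedArrayBuffer(4);
const hostView = new Int32Array(shared);
const workerView = new Int32Array(shared);
Atomics.add(workerView, 0, 1); // the "worker" writes
const hostSees = Atomics.load(hostView, 0);

console.log(hostState.foo, workerState.foo, hostSees);
```

The two strings diverge in the same way as the unsynchronised trace output in the original post, while the shared buffer stays consistent but cannot hold Haxe strings or objects directly.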

If we had a WASM target (something I really want actually), the situation would be slightly different, since on the WASM target we would need to manage memory with the Haxe runtime, similarly to Neko. Then sharing memory between threads would just mean sharing the one array buffer that represents the entire memory space of a WASM module. I haven't tested this so don't quote me on that :D But anyway, WASM is far from an easy thing to target, because it is a very low-level target and also exactly because of the memory management needed.

So for this bounty, I wouldn't expect an implementation fully compatible with the Neko/C++/Java threads. The best you can hope for is probably a library that will work for some specific use cases. It may spawn the worker from a closure or a function reference (maybe), but variables outside the function won't be accessible. The message posting might be wrapped in a more Haxe-like interface, perhaps using Haxe Serialisation for transit.

6J7KZg2f commented 5 years ago

Alright, I'm going to retract this one then, as it seems unlikely to result in the stated goal of generalized drop-in compatibility with threaded targets.

Something that looks identical externally but actually has considerable differences/limitations in practice is antithetical to what I was hoping to achieve, and I would like to avoid something like that being made available.

larsiusprime / larsBounties
