denoland / deno

A modern runtime for JavaScript and TypeScript.
https://deno.com
MIT License
96.63k stars 5.34k forks source link

Make resources implement Transferable #8341

Open jcc10 opened 3 years ago

jcc10 commented 3 years ago

Related to #5965 (and possibly others)

It would be really nice if resources could be Transferable between web workers. This would allow for proper multi-threading for HTTP and WebSocket servers.

This can provide multiple benefits:

There are probably others but those are the reasons I can think of.

This is probably not high up on the priority board, but it would be very nice if only for the use case of WebSockets.

ghost commented 3 years ago

I actually don't think that anyone has mentioned transferring objects yet, although structured cloning has been mentioned multiple times.

https://github.com/denoland/deno/issues/6887 https://github.com/denoland/deno/issues/6433 (This has the closest connection to this) https://github.com/denoland/deno/issues/3557 (mentions ArrayBuffers, thus implicitly references Transferable) https://github.com/denoland/deno/issues/3317#issuecomment-553281641 (Directly references Transferable)

Fortunately, it's on the current roadmap! https://github.com/denoland/deno/issues/7915

jcc10 commented 3 years ago

Just to be clear, this thread is specifically for resources which probably don't fit within the structured clone feature. (Can't really clone a connection, just transfer it.)

For both HTTP and WS, the main underlying item needed to be made transferable is interface Deno.Conn extends Reader, Writer, Closer everything else seems to be on top of that. (I think)

kitsonk commented 3 years ago

I am not certain the concept of Transferable in the web API would be the best way to do what you are suggesting.

Sharing bindings to resources between workers in Deno would more likely be accomplished by ensuring that the resource table is shared across workers (it may already be, I am not totally familiar with it myself, but @bartlomieju could answer). If the resource table is shared, then it is just a matter of using the same resource ID across the worker boundaries. If it isn't shared then the suggestion would be to have a setting that ensures they are shared.

lucacasonato commented 3 years ago

I personally think transferring resources is a lot more valuable than sharing a resource table. With transferring you get to do very fine grained access control and task separation (e.g. open a file in parent worker, then transfer to unprivileged worker to read and write to only that file). With a shared resource table there could potentially be different workers with the different permissions operating on the same resource. That seems rather confusing.

jcc10 commented 3 years ago

So. I had to make a test just to see how things work as is:

https://gist.github.com/jcc10/a88cf952093df2286c703f172c99edbd

Under the hood it can still be a shared table. For the fail tolerance you could even let references exist in multiple places. But you need to be really careful in the documentation with noting how iterators can get into weird race conditions. (As shown in the gist.) (And by weird I mean most people (IMO) would have to check to see what the actual behaviour is as opposed to simply intuiting it.)

With the right framework, I can actually see this as being a plus, With a handful of threads handling the I and the rest handling the O of I/O. It would certainly allow for more flexible design. But at the same time it will be very easy to mess up.

jcc10 commented 3 years ago

@lucacasonato With a shared resource table there could potentially be different workers with the different permissions operating on the same resource. That seems rather confusing.

If each call for opening a resource loaded a different instance of the resource in the shared resource table would that help?

Psudo-Code:

let x = open("./file1");
let y = open("./file1");
x.resourceID == "uuid:1";
y.resourceID == "uuid:2";

So two resources, two variables, one file. The transferring only really ensures that you can't have two references to the same instance in different threads. (Which admittedly could be bad and possibly cause Undefined Behaviour from race conditions on read/writes.)

How would Rust handle this? It's supposed to be thread safe so this really is somewhere copying their design philosophy may come in handy.

lucacasonato commented 3 years ago

@jcc10 If each call for opening a resource loaded a different instance of the resource in the shared resource table would that help?

That is already the case. Also I don't think we need a shared resource table. Resources should be owned by the thread (worker) that is going to use them. That way no two threads can try to use the same resource at once. That means each worker gets their own resource table (which is what we do currently).

Race conditions are inherently part of multi-threaded code, and the programmer has to manually to prevent them by coordinating threads (i.e. through mutex locks). Same applies to pure web code like SharedArrayBuffer. Opening the same file from two different threads would yield two file descriptors which can be used to provoke race conditions. This is already the case right now. I don't think there is anything Deno can do to solve this.

ghost commented 3 years ago

Opening the same file from two different threads would yield two file descriptors which can be used to provoke race conditions. This is already the case right now. I don't think there is anything Deno can do to solve this.

Might I suggest that Deno checks if a file is already open anywhere in the program, and then throw an error if it is already open? Actually... I'm not sure if that's even a good idea.

bartlomieju commented 3 years ago

Sharing bindings to resources between workers in Deno would more likely be accomplished by ensuring that the resource table is shared across workers (it may already be, I am not totally familiar with it myself, but @bartlomieju could answer).

It's not, each worker gets it own resource table. If we were to support transferring resources we should take that into account in the upcoming resource table refactor, but first it needs more discussion on what might be the implications in implementation.