servo / servo

Servo, the embeddable, independent, memory-safe, modular, parallel web rendering engine
https://servo.org
Mozilla Public License 2.0

Implement SharedWorker #7458

Open jdm opened 9 years ago

jdm commented 9 years ago

https://html.spec.whatwg.org/multipage/workers.html#shared-workers-introduction

https://html.spec.whatwg.org/multipage/workers.html#shared-workers-and-the-sharedworkerglobalscope-interface https://html.spec.whatwg.org/multipage/workers.html#shared-workers-and-the-sharedworker-interface

Depends on #7457.

Ms2ger commented 9 years ago

I believe just Gecko implements this now, so it's not entirely clear this is going to stay in the platform.

jdm commented 9 years ago

http://caniuse.com/#feat=sharedworkers says Chrome does too.

tetsuharuohzeki commented 9 years ago

From my experience as a client-side web content developer, SharedWorker is largely ignored in web development today. I don't know exactly why, but I suspect the reasons are that Firefox only implemented it in v29 and that Safari has dropped it since v6.

I agree with @Ms2ger. We can wait for the results of Firefox's telemetry about SharedWorker: https://bugzilla.mozilla.org/show_bug.cgi?id=1193414 .

jdm commented 8 years ago

That being said, ServiceWorkers would use a lot of the same infrastructure as SharedWorker, so this could still be valuable.

NekR commented 8 years ago

Some time ago I heard that SharedWorkers were deprecated. Right now, the ServiceWorker spec team is considering using SharedWorkers inside ServiceWorker for heavy tasks / shared memory.

So you may still need SharedWorkers again some day.

avinash-vijayaraghavan commented 7 years ago

@jdm this looks interesting. would like to try it out if it is a mentored project

jdm commented 7 years ago

@avinash-vijayaraghavan There's no clear benefit to working on it at this point in time, so I don't think anyone will want to put in the time to mentor it.

gterzian commented 5 years ago

Couple of design ideas, based on current ongoing work at https://github.com/servo/servo/pull/23637:

  1. Constellation owns a HashMap<(Origin, Url), SharedWorker> (or just origin -> worker).
  2. A SharedWorker is similar to an EventLoop, essentially a wrapper around an IPC sender and a pipeline Id (or a SharedWorker is a process that runs one worker-event-loop per Url in threads).
  3. The difference is that a SharedWorker runs one (or several) worker event-loop(s), not a ScriptThread. A "worker event-loop" is essentially the run_worker_scope method of DedicatedWorkerScope, give or take.
  4. So when script does new SharedWorker(scriptURL [, name ]), it:
    • Creates two message ports and entangles them.
    • Stores one onto the local global and a SharedWorker dom object.
    • Transfers the other, and sends it to the constellation in a message along with relevant info.
  5. The constellation, upon receiving such a message:
    • Either creates a new worker or reuses an existing one.
    • In both cases, it ends up forwarding the message containing the port to the worker.
  6. The SharedWorkerScope then runs some glue code to handle the new port, fires the connect event, and then essentially does a self.upcast::<GlobalScope>().track_message_port(port).

So hopefully a SharedWorkerScope can be basically a copy/paste from DedicatedWorkerScope, with some additional glue to handle new incoming ports, which really should just call into the methods of GlobalScope that handle message-ports in any other scope.

At that point there is a direct IPC communication line, in the form of the port, between a script-process, and a shared-worker process, while the constellation keeps ipc-senders to both, and both also have an ipc-sender to the constellation.
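The flow in steps 4 to 6 can be sketched roughly as follows. This is a single-threaded illustration only, not Servo code: `Port`, `entangled_pair`, `SharedWorkerScope`, `Constellation` and `new_shared_worker` are hypothetical names, origins and URLs are plain strings, and `std::sync::mpsc` channels stand in for real IPC and MessagePort machinery.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// One end of an entangled pair; a stand-in for a real MessagePort.
pub struct Port {
    pub tx: Sender<String>,
    pub rx: Receiver<String>,
}

/// Create two ports wired to each other (step 4, first bullet).
pub fn entangled_pair() -> (Port, Port) {
    let (a_tx, b_rx) = channel();
    let (b_tx, a_rx) = channel();
    (Port { tx: a_tx, rx: a_rx }, Port { tx: b_tx, rx: b_rx })
}

/// Hypothetical worker scope: it keeps every port it has been handed.
#[derive(Default)]
pub struct SharedWorkerScope {
    pub ports: Vec<Port>,
    /// Number of "connect" events fired so far.
    pub connect_events: usize,
}

impl SharedWorkerScope {
    /// Glue for a new incoming port: record the connect event, then track the port (step 6).
    pub fn handle_connect(&mut self, port: Port) {
        self.connect_events += 1;
        self.ports.push(port);
    }
}

/// Hypothetical constellation state: one worker per (origin, script URL).
#[derive(Default)]
pub struct Constellation {
    pub workers: HashMap<(String, String), SharedWorkerScope>,
}

impl Constellation {
    /// Create the worker or reuse an existing one, then forward the port (step 5).
    pub fn connect(&mut self, origin: &str, url: &str, port: Port) {
        self.workers
            .entry((origin.to_string(), url.to_string()))
            .or_default()
            .handle_connect(port);
    }
}

/// Roughly what `new SharedWorker(url)` would do on the script side (step 4):
/// keep one port locally and send the other to the constellation.
pub fn new_shared_worker(c: &mut Constellation, origin: &str, url: &str) -> Port {
    let (local, remote) = entangled_pair();
    c.connect(origin, url, remote);
    local
}
```

Two pages of the same origin calling `new_shared_worker` with the same URL end up with ports into a single `SharedWorkerScope`, which is the "create or reuse" behaviour described above. In a real implementation the worker would live in another process and the port would be transferred over IPC.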


Note that this "storing the worker on the constellation and running it in its own process, while making it available to any same-origin script" approach could be a way to re-structure service workers too.


So, reading up on this issue, I'm not sure if it's a very high-value project in itself, however I do think testing the design I sketch above could yield valuable lessons for potentially refactoring service-workers (see https://github.com/servo/servo/issues/19302).

Also, the hard parts are MessagePort and "running a worker"(basically dedicated worker but out-of-process from script and with some coordination from the constellation).

So once we have message-ports, it's basically re-using those as well as dedicated worker, with a bit of additional constellation glue. I don't see this being a very large project in and of itself.

And, it's not like it's up for deprecation or anything(?), so with message-ports and dedicated worker in place, it'd be a pity not to go ahead and finish this one too.

gterzian commented 5 years ago

One more thing: Service and Shared workers also differ in that Service workers are meant as stateless, event-driven, never-block kind of components that potentially run independently of any web page, whereas Shared workers are really meant to share state(and centralize running some logic to produce that state) between different pages running the same origin.

I think it's worth implementing Shared workers: since 2015 there has been quite a lot of progress on the compatibility side: https://developer.mozilla.org/en-US/docs/Web/API/SharedWorker#Browser_compatibility

It looks like only Safari and Android Webview do not implement it.

CYBAI commented 5 years ago

"It looks like only Safari and Android Webview do not implement it."

About WebKit, they implemented it but removed it. https://webkit.org/status/#feature-shared-web-workers

I haven't read them yet, but these WebKit issues may be related: https://bugs.webkit.org/show_bug.cgi?id=116359 https://bugs.webkit.org/show_bug.cgi?id=140344

gterzian commented 5 years ago

Thanks for sharing those issues. https://bugs.webkit.org/show_bug.cgi?id=116359 is particularly interesting. It actually seems that there are people complaining that this feature is not consistently implemented across browsers.

"Shared workers can not communicate across processes currently, so they should better be disabled."

That seems to have been the original reason for the removal.

"the engineering resources spent to make shared worker work is the engineering resource that can't be spent elsewhere like implementing service workers so there is a huge opportunity cost associated with making this feature work across web content processes."

And this is an argument I disagree with, since shared-workers are basically re-using dedicated workers but with additional infra, and that additional infra is the same stuff that service workers would be using (minus the message port).

What makes service workers a lot of extra work is orthogonal to shared-workers, and they could otherwise share a common infrastructure.

gterzian commented 5 years ago

Also, I think this would fit very well in the overall design of Servo, hence it shouldn't be very hard or redirect a lot of resources from other things.

And having a worker run in a separate process from script, using a message-port as the communication channel between that worker and multiple potential script-process clients, is generally a problem that, if solved once, could open up other opportunities, such as discussed in https://github.com/servo/servo/issues/23807 (and on how to do Service workers as well).

Currently our only implementation of workers (dedicated workers) consists of running a thread right inside a script-process. So we haven't tried anything related to running a worker in a separate process, separating the life-cycle of a worker from that of a given script-thread (process).

It would be much easier to try this with SharedWorker first, since you can re-use almost the entire "run a worker" logic from dedicated workers, than trying to do this first with Service worker or worklets.

Hence I see this mainly as a good first pass at introducing a new kind of component in Servo, the out-of-process worker, which could then be generally re-used in other more complicated workers.

gterzian commented 5 years ago

This is worth reading: https://docs.google.com/presentation/d/1GZJ3VnLIO_Pw0jr9nRw6_-trg68ol-AkliMxJ6jo6Bo/edit#slide=id.g36f61837b7_1_69

pshaughn commented 4 years ago

Lots of basically-unrelated WPT test trees show failures just because they ask "can I also do this from inside a SharedWorker?" for whatever question they're asking.

gterzian commented 4 years ago

Ok I'd like to actually do this one, the question is, will there be anyone interested in reviewing it :)

I think it might be relevant for the WPT tests as noted by @pshaughn, especially if some of those tests are using a shared worker to test other features (meaning we can't get coverage without it).

Also, it's in the main HTML standard, so I don't think it's some sort of optional feature of the Web.

Finally, I think implementing this would require a design similar to what I think is necessary for ServiceWorker to fix https://github.com/servo/servo/issues/15217 (having a separate process per origin, and going via the constellation), and since SharedWorker is more of a blank slate at this point, it would be easier to try it out here and then apply the same design (if successful) to ServiceWorker.

gterzian commented 4 years ago

Ok so for example Chromium initially did what I was proposing, running the shared worker in its own process, but then they moved away from that towards running it in the same process as the first page that uses it, and then connecting other pages in other processes (but the same origin) with the worker in that first process, and keeping that first process alive even if the page goes away.

See https://docs.google.com/document/d/10P4lTgIUz8ujB3b0HRhPFhsGBVioyJg3_5uJj3Kotd4/edit#heading=h.538npkvkx38h

So I guess we can do something like that as well.

We could still use the constellation for the overall "connect to a worker or create it" flow, but run the worker in the first EventLoop that needs it, and then keep the EventLoop alive even if all pipelines are gone, as long as there is a pipeline somewhere in another EventLoop still using the shared worker.
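The lifetime rule described here (the hosting EventLoop stays alive while any pipeline elsewhere still uses the worker) could be tracked with bookkeeping along these lines. This is only a sketch under assumptions: `Registry`, `HostedWorker`, the integer ids and the method names are all hypothetical, and nothing here reflects existing constellation code.

```rust
use std::collections::{HashMap, HashSet};

pub type EventLoopId = u32;
pub type PipelineId = u32;

/// Hypothetical bookkeeping for one shared worker hosted in a content event loop.
pub struct HostedWorker {
    /// The event loop of the first page that asked for the worker.
    pub host: EventLoopId,
    /// Every pipeline currently connected to the worker.
    pub clients: HashSet<PipelineId>,
}

#[derive(Default)]
pub struct Registry {
    /// Keyed by (origin, script URL).
    pub workers: HashMap<(String, String), HostedWorker>,
}

impl Registry {
    /// Connect a pipeline. The first caller's event loop becomes the host.
    /// Returns the event loop that hosts the worker.
    pub fn connect(
        &mut self,
        key: (String, String),
        event_loop: EventLoopId,
        pipeline: PipelineId,
    ) -> EventLoopId {
        let worker = self.workers.entry(key).or_insert_with(|| HostedWorker {
            host: event_loop,
            clients: HashSet::new(),
        });
        worker.clients.insert(pipeline);
        worker.host
    }

    /// A pipeline went away: forget it, and drop workers that have no clients left.
    pub fn pipeline_closed(&mut self, pipeline: PipelineId) {
        self.workers.retain(|_, worker| {
            worker.clients.remove(&pipeline);
            !worker.clients.is_empty()
        });
    }

    /// An event loop with no pipelines of its own may only shut down
    /// if it is not hosting a worker that still has clients.
    pub fn can_shut_down(&self, event_loop: EventLoopId) -> bool {
        !self.workers.values().any(|w| w.host == event_loop)
    }
}
```

With this bookkeeping, closing the page that first created the worker does not release its event loop until the last connected pipeline, possibly in another event loop, is also closed.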

This could also apply to the fix for Service worker for #15217

gterzian commented 4 years ago

Ok continuing the discussion here, which applies both to Service and Shared workers:

I've made a start at https://github.com/servo/servo/pull/26073, and the more I think about it, the more I think this can easily turn into a huge source of problems, for the following reasons:

  1. It's complicated. If we run Service or Shared workers in a content process, then there are all sorts of sources of race conditions when it comes to forwarding messages to the worker.

Consider this:

Again, it can certainly be done, but to me it would appear about as complicated as what we had to do for transferring MessagePorts, and that one was actually required by the spec, while this would be essentially an optimization to shave a process off.

  2. The spec puts Shared and Service workers in their own agent-cluster, which means they can't share memory with a page using them. So while "their own agent-cluster" doesn't necessarily mean "their own process", it's certainly the most straightforward way to implement it. So I think trying to optimize away a process by co-locating the workers (which per the spec are in their own agent-cluster) in the process running the script-thread is simply not worth it.

Consider the simple alternative, that comes with the cost of an additional process:

Process A wants to start using a Service/Shared worker for a given origin and sends a message to the constellation. The constellation, upon receiving the message, inspects its local state and either forwards the message to an existing running worker manager for the origin, or spins one up and then forwards the message.

This has the huge benefit of simplicity, with little, perhaps none at all, potential for race conditions. Also, it matches the agent-cluster concept in the spec nicely.
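A rough sketch of that "forward to the existing manager or spin one up" flow, under loud assumptions: a thread stands in for the separate worker-manager process, `std::sync::mpsc` stands in for IPC, and `ConnectRequest`, `spawn_manager` and `ConstellationState` are hypothetical names rather than anything in Servo.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Message a content process sends to ask for a worker (hypothetical shape).
pub struct ConnectRequest {
    pub script_url: String,
    /// Where the manager reports back; stands in for a transferred port.
    pub reply: Sender<String>,
}

/// Spawn a "worker manager" for one origin.
/// A thread stands in for the separate process here.
fn spawn_manager(origin: String) -> Sender<ConnectRequest> {
    let (tx, rx) = channel::<ConnectRequest>();
    thread::spawn(move || {
        // Runs until every sender (held by the constellation) is dropped.
        for req in rx {
            let _ = req
                .reply
                .send(format!("{} connected to {}", origin, req.script_url));
        }
    });
    tx
}

/// Hypothetical constellation-side state: one manager per origin.
#[derive(Default)]
pub struct ConstellationState {
    managers: HashMap<String, Sender<ConnectRequest>>,
    /// How many managers have been spun up so far.
    pub spawned: usize,
}

impl ConstellationState {
    /// Inspect local state, spin up a manager if the origin has none,
    /// then forward the request to it.
    pub fn handle_connect(&mut self, origin: &str, req: ConnectRequest) {
        if !self.managers.contains_key(origin) {
            self.managers
                .insert(origin.to_string(), spawn_manager(origin.to_string()));
            self.spawned += 1;
        }
        let _ = self.managers[origin].send(req);
    }
}
```

All routing decisions happen in one place (the constellation), and each manager processes its requests in order, which is where the claim of little potential for race conditions comes from.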

Also let's not forget about the whole "service workers sometimes have to run even when there is no page active", for example to handle an incoming notification. Having the workers run in their own process would facilitate a potential implementation of such a feature.

And finally you get proper isolation of the EventLoop code that runs the script-thread, potential dedicated workers and worklets, versus the code in other processes running Shared and Service workers. Actually, come to think of it, one optimization that perhaps could make sense would be to run Service and Shared workers for a given origin in the same process, versus two different processes...

jdm commented 4 years ago

I agree that using separate processes is conceptually simpler, and I think that's important from an implementation and maintenance point of view.