mozilla-mobile / fenix

⚠️ Fenix (Firefox for Android) moved to a new repository. It is now developed and maintained as part of: https://github.com/mozilla-mobile/firefox-android
Mozilla Public License 2.0

Decide when to create the Gecko runtime #4367

Closed hawkinsw closed 4 years ago

hawkinsw commented 5 years ago

There is a fundamental decision to make regarding when to instantiate the Gecko runtime.

There are three different options:

  1. On application startup
    In this option, the Gecko runtime is instantiated immediately upon starting the application, that is, in FenixApplication.setupApplication(). The benefit of this approach is that the price of instantiation is paid early, so the user will not have to pay it when they eventually need the runtime. The downside is that application startup takes longer to complete.

  2. On demand
    In this option, the Gecko runtime is only instantiated when the application needs it. For example, if the user opens the application and wants to start a new session, there is no need for the Gecko runtime. The user would only need the Gecko runtime, in this case, when they want to browse a website. The benefit of this approach is that the user does not pay the price of starting up a component that they do not need. The downside is that they are penalized (for the time required to start the runtime) at the very moment when they want to view a website (ostensibly the reason that a user opens a web browser).

  3. Hybrid
    In this option, the Gecko runtime is instantiated at the first moment the application is idle. For example, if the user opens the application to the home screen, as soon as it is rendered and the user is "thinking", the Gecko runtime is instantiated. The benefit of this option is that the user will not pay the price for instantiating the Gecko runtime during their interaction with it. The downside is that it is complex to implement and, ultimately, may not be that useful (because there may not be any idle period between when the user launches the application and requires the runtime).
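
For illustration, the three options can be sketched in plain Kotlin. This is only a sketch under assumptions: `FakeRuntime` stands in for the Gecko runtime and `IdleScheduler` stands in for whatever idle hook the app would use (on Android that might be something like a `MessageQueue.IdleHandler`). None of these names are Fenix or GeckoView APIs.

```kotlin
// Hypothetical stand-in for the Gecko runtime.
class FakeRuntime

// Hypothetical idle hook: queued tasks run when the app reports that it is idle.
class IdleScheduler {
    private val pending = mutableListOf<() -> Unit>()

    fun whenIdle(task: () -> Unit) {
        pending += task
    }

    fun runIdleTasks() {
        val tasks = pending.toList()
        pending.clear()
        tasks.forEach { it() }
    }
}

// Option 1: the cost is paid while the holder (i.e. the application) is being set up.
class EagerHolder(factory: () -> FakeRuntime) {
    val runtime: FakeRuntime = factory()
}

// Option 2: the cost is paid on first access.
class OnDemandHolder(factory: () -> FakeRuntime) {
    val runtime: FakeRuntime by lazy(factory)
}

// Option 3: the cost is paid at the first idle moment,
// or on first access if that happens before the app goes idle.
class HybridHolder(scheduler: IdleScheduler, factory: () -> FakeRuntime) {
    val runtime: FakeRuntime by lazy(factory)

    init {
        scheduler.whenIdle { runtime }
    }
}
```

In the hybrid sketch the idle task simply touches the lazy property, so if there is no idle period before the first request it degrades to the on-demand behavior.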

In order to make the decision among these options, there are at a minimum two important questions to consider:

  1. How often will the user open the application just to see the home activity?
  2. How often will the user open the application to interact with a web page?

@csadilek @vickychin @colintheshots @pocmo


pocmo commented 5 years ago

There's one important thing to consider here: Gecko is not only used for loading and rendering websites anymore. We also use it as our HTTP client - which means it is also indirectly needed for things like for example sending telemetry (glean), fetching the latest switchboard config, loading website icons that are no longer in the cache, ...

Given that constraint I feel like there's almost nothing to be gained by delaying loading Gecko. I also think the dependencies on Gecko being started will increase (web extensions, push messages, ..) rather than decrease going forward.

pocmo commented 5 years ago

CC @snorp @agi90 for input from the GV side.

jesup commented 5 years ago

I'd suggest deferring GV startup (for browser launch, not for applink/etc cases) until either the app is 'idle' (whatever that exactly is defined as for this purpose), or on the first request to use it (HTTP). Related, we should look at non-browsing actions that may invoke HTTP accesses and decide if those are deferrable until idle (or some later point), and if so do that.

csadilek commented 5 years ago

Thank you, @hawkinsw. Can you share some numbers with us? What is the performance impact you see when initializing the Gecko runtime on startup (vs. delaying it, e.g., to first page load as in Fenix 1.0)?

How often will the user open the application just to see the home activity?

I'd expect that would be pretty rare, as likely they'd want to load a page / or clicked on a link?

My main concern is to (re-)introduce non-deterministic or complex bootstrap logic which I'd like to avoid, if possible :).

ncalexan commented 5 years ago

@pocmo: re: https://github.com/mozilla-mobile/fenix/issues/4367#issuecomment-516342689, are we using Necko for fetching, say, experiment/FretBoard A/B testing stuff? If that's true then damn the torpedoes: let's start Gecko immediately.

If that's not true, then I think we should do 3), with less non-determinism: wait for the very first UI paint then start Gecko. (That's close to deterministic.)

ncalexan commented 5 years ago

It's worth noting that I want Fenix to expect Gecko to not always be available, i.e., to be lazy/possibly null. It's very easy to accommodate this up front and almost impossible to retro-fit, and we expect to have a "remote GeckoView"/Gecko in a separate process eventually, which looks very similar to possibly null GeckoRuntime.
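
A minimal sketch of what "possibly null" could look like for a consumer, assuming hypothetical names throughout (`StubRuntime`, `NullableRuntimeProvider`, and `loadIcon` are illustrations, not Fenix or GeckoView code):

```kotlin
// Hypothetical engine handle; not the GeckoView API.
class StubRuntime {
    fun fetch(url: String): String = "response for $url"
}

// The provider exposes the runtime as nullable, so every consumer
// has to decide what to do when Gecko is not (yet) running.
class NullableRuntimeProvider {
    var runtime: StubRuntime? = null
        private set

    fun start() {
        if (runtime == null) runtime = StubRuntime()
    }

    fun stop() {
        runtime = null
    }
}

// A consumer that falls back to a placeholder instead of assuming the runtime exists.
fun loadIcon(provider: NullableRuntimeProvider, url: String): String =
    provider.runtime?.fetch(url) ?: "placeholder icon"
```

Because the type is nullable from the start, the compiler forces each new consumer to handle the "no runtime" case, which is the part that is hard to retrofit later.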

snorp commented 5 years ago

I don't think I really have an opinion here, as you should do whatever suits Fenix best. My intuition is that you want to start a GeckoRuntime ASAP since you'll likely need it anyway and it takes a non-trivial (.5s or so?) amount of time to start. That may not be backed up by perf metrics, though.

agi90 commented 5 years ago

I also don't really have an opinion (first paint sounds nice, but it really depends on usability). I just want to point out that on slow android phones that have been in use for a while app startup like GeckoView can take seconds. So whenever we decide to launch GV it might introduce a really long stall in the app.

pocmo commented 5 years ago

@pocmo: re: #4367 (comment), are we using Necko for fetching, say, experiment/FretBoard A/B testing stuff?

Yes, those components all use concept-fetch and we are using the GeckoView implementation since the recommendation was to have all requests go through our network stack. The only exception right now are third-party dependencies like Sentry or Leanplum that we can't bend to use something else that easily.

I just want to point out that on slow android phones that have been in use for a while app startup like GeckoView can take seconds. So whenever we decide to launch GV it might introduce a really long stall in the app.

Is that worse than Fennec or the same? If I remember correctly Fennec will start Gecko immediately too....?

agi90 commented 5 years ago

Is that worse than Fennec or the same? If I remember correctly Fennec will start Gecko immediately too....?

I don't think it makes a difference, any app on my old S7 takes ages to start up. I just wanted to point out that while 0.5s stall on a modern device once in a while might be acceptable, 3s is really jarring, especially because on clogged devices apps are evicted really frequently. On my S7 I can barely keep one app alive at the same time.

hawkinsw commented 5 years ago

@pocmo: re: #4367 (comment), are we using Necko for fetching, say, experiment/FretBoard A/B testing stuff? If that's true then damn the torpedoes: let's start Gecko immediately.

As far as I understand it, this is going to be the m.o. going forward. There is a special push (for security purposes) to use a single network stack in the application (which will obviously be the one in Gecko). Therefore, we will use Necko for fetching experiments which is done very early in the application startup process. Given this, I believe that your vote and suggestion is reasonable.

If that's not true, then I think we should do 3), with less non-determinism: wait for the very first UI paint then start Gecko. (That's close to deterministic.)

hawkinsw commented 5 years ago

Is that worse than Fennec or the same? If I remember correctly Fennec will start Gecko immediately too....?

I don't think it makes a difference, any app on my old S7 takes ages to start up. I just wanted to point out that while 0.5s stall on a modern device once in a while might be acceptable, 3s is really jarring, especially because on clogged devices apps are evicted really frequently. On my S7 I can barely keep one app alive at the same time.

Thank you for pointing this out. I think that we need to be very aware of performance on not-high-end devices. However, the direction that I have received from above is that we are focused on a particular set of high-end devices used by conscious choosers.

hawkinsw commented 5 years ago

It's worth noting that I want Fenix to expect Gecko to not always be available, i.e., to be lazy/possibly null. It's very easy to accommodate this up front and almost impossible to retro-fit, and we expect to have a "remote GeckoView"/Gecko in a separate process eventually, which looks very similar to possibly null GeckoRuntime.

Should we go with the on-demand option, I 100% agree. It would need to be ingrained in application developers that getting the runtime is a privilege to be cherished, and that they should require the runtime only when it is absolutely necessary. Like you said, this is easy to do when creating new features but very difficult to retrofit.

hawkinsw commented 5 years ago

There's one important thing to consider here: Gecko is not only used for loading and rendering websites anymore. We also use it as our HTTP client - which means it is also indirectly needed for things like for example sending telemetry (glean), fetching the latest switchboard config, loading website icons that are no longer in the cache, ...

Given that constraint I feel like there's almost nothing to be gained by delaying loading Gecko. I also think the dependencies on Gecko being started will increase (web extensions, push messages, ..) rather than decrease going forward.

Thank you for the input!

hawkinsw commented 5 years ago

On @csadilek's urging, I went through and did a refresh of my evaluation of the current state of startup performance. These are findings from nightly Fenix on a Pixel 2.

  1. Roughly 5% of startup is spent initializing the Gecko runtime.
  2. No fewer than three different components necessary to complete startup to the Home Screen require the Gecko runtime. Coupled with the requirement that the Gecko runtime be initialized from the main thread, removing the reliance on the Gecko runtime from those components will require a significant amount of software engineering and, ultimately, may not even be possible.
  3. There are myriad optimization opportunities in addition to delaying the Gecko runtime initialization (e.g., loading the default search engine list takes roughly 12% of startup and the inflation of UI layout/elements is very expensive).

I've tried to write those three quick observations as neutrally as possible. @csadilek and I discussed these facts and came to a conclusion about how we would decide but do not want to bias the group's thinking.

The group's responses to this thread seem to indicate:

  1. A general split between those who would campaign for instantiating the runtime at the first idle moment (but before a page is rendered) and those who would advocate for taking the performance hit and starting the Gecko runtime immediately.
  2. A realization that many Fenix components already rely on the runtime and many more will do so in the future.
  3. A desire for developers to make judicious use of the Gecko runtime in components that are not explicitly rendering web pages for the user.

I hope that this is an accurate summary of the discussion above. Please comment if there are strong opinions that I did not record.

pocmo commented 5 years ago

and we expect to have a "remote GeckoView"/Gecko in a separate process eventually, which looks very similar to possibly null GeckoRuntime.

@ncalexan Can you elaborate on that? The thing I'm wondering is: Once we have Gecko in a separate process and GeckoRuntime in the main process not doing much anymore: Wouldn't it be possible for "GeckoView" to control when to create this separate process (lazily or when explicitly requested)? Which would take away a lot of the responsibility from the app. But I understand that this is a long-term goal and we may be looking for short-term improvements here.

Also: Right now a GeckoRuntime needs to be created on the main thread - and all network requests are never on the main thread. So lazily creating GeckoRuntime with a thread context switch is pretty awkward.

hawkinsw commented 5 years ago

Also: Right now a GeckoRuntime needs to be created on the main thread - and all network requests are never on the main thread. So lazily creating GeckoRuntime with a thread context switch is pretty awkward.

I agree that this is really important -- thank you for reiterating it. It was this awkwardness to which I was referring as among the reasons why instantiating the Gecko runtime lazily would require a significant amount of software engineering resources.

Thanks again for emphasizing it!

ncalexan commented 5 years ago

and we expect to have a "remote GeckoView"/Gecko in a separate process eventually, which looks very similar to possibly null GeckoRuntime.

@ncalexan Can you elaborate on that? The thing I'm wondering is: Once we have Gecko in a separate process and GeckoRuntime in the main process not doing much anymore: Wouldn't it be possible for "GeckoView" to control when to create this separate process (lazily or when explicitly requested)? Which would take away a lot of the responsibility from the app. But I understand that this is a long-term goal and we may be looking for short-term improvements here.

Sure. If we had a separate Gecko parent process, then it's true that the GeckoRuntime API instance in the app process could always be non-null. I expect that consumers of that instance will, eventually, want to know if Gecko is running and accommodate cases when it is not. (I.e., not every problem fits into the pattern we have now where we start Gecko on demand and enqueue the relevant request to be handled after Gecko is loaded). I agree that the consuming app itself would have less work to do, and that this would be good -- but there's still a place for considering what happens when Gecko isn't running. (Not least: what do you do when Gecko is repeatedly crashing?)
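
The "start on demand and enqueue" pattern mentioned above might look roughly like the following. This is a simplified, single-threaded sketch; `DeferredRequests` is a hypothetical name and not an existing Fenix component.

```kotlin
// Hypothetical queue in front of the runtime; not thread-safe, for illustration only.
class DeferredRequests {
    private val pending = mutableListOf<() -> Unit>()

    var started = false
        private set

    // Run immediately if the runtime is up, otherwise hold the request.
    fun submit(request: () -> Unit) {
        if (started) request() else pending.add(request)
    }

    // Called once the runtime has finished starting; drains held requests in FIFO order.
    fun onRuntimeStarted() {
        started = true
        while (pending.isNotEmpty()) {
            pending.removeAt(0)()
        }
    }
}
```

Note that a request submitted here has no way to learn that the runtime never came up (for example, because it keeps crashing), which is the kind of case this pattern does not cover.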

Also: Right now a GeckoRuntime needs to be created on the main thread - and all network requests are never on the main thread. So lazily creating GeckoRuntime with a thread context switch is pretty awkward.

This doesn't feel like a particularly hard blocker. GV should probably just keep the requirement, since doing otherwise is not a zero-cost abstraction; engine-gecko can easily get the thread context switch right for all subsequent consumers.
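
As a rough sketch of how a wrapper could own the thread switch, assuming a single-threaded executor as a stand-in for Android's main thread (all names here are hypothetical, and this is not how engine-gecko is actually implemented):

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors

// Hypothetical runtime that records which thread created it.
class ThreadBoundRuntime {
    val createdOn: String = Thread.currentThread().name
}

// Hypothetical wrapper: creation is always posted to the "main" executor,
// so callers on background threads never create the runtime themselves.
class MainThreadRuntimeProvider(private val mainExecutor: ExecutorService) {
    // Only read and written on the main executor's single thread.
    private var runtime: ThreadBoundRuntime? = null

    // Blocks the calling thread until the main executor has created (or found) the runtime.
    // Must not be called from the main executor's own thread, or it would deadlock.
    fun getOrCreate(): ThreadBoundRuntime =
        mainExecutor.submit(Callable {
            runtime ?: ThreadBoundRuntime().also { runtime = it }
        }).get()
}

// Stand-in for Android's main thread: a single named daemon thread.
fun newFakeMainExecutor(): ExecutorService =
    Executors.newSingleThreadExecutor { r ->
        Thread(r, "fake-main").apply { isDaemon = true }
    }
```

A network component running on a background thread would call `getOrCreate()` and get back a runtime that was created on the "main" thread, without knowing about the switch. The blocking wait is the non-zero cost of the abstraction.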

mcomella commented 4 years ago

afaik, a decision was made a while back not to change the current behavior: Gecko is needed for basically everything, and deferring its startup would add a lot of complexity, so the change isn't worth it. Closing this issue; we can reopen it if we decide to revisit this decision.