dart-lang / pub

The pub command line tool
https://dart.dev/tools/pub/cmd
BSD 3-Clause "New" or "Revised" License
1.04k stars 224 forks source link

Big perf difference between running Dart app via pub global activate, or pub run #1349

Closed sethladd closed 8 years ago

sethladd commented 8 years ago

Hi! We'd like to start using pub run to launch our scripts from within our app, but it seems much slower than scripts launched after pub global activate. The startup perf is enough for us to wonder if we're doing something wrong, or if there's a bug.

tldr: app runs in 1 sec via pub run, and 0.3 seconds after pub global activate.

Here are the scenarios:

A) add flutter: any to an empty pubspec, run pub get, then run pub run flutter list --help B) pub global activate flutter, then run flutter list --help

Results from scenario A:

~/tmp/flutterpkg$ time pub run flutter list --help
List all connected devices.

Usage: flutter list [arguments]
-h, --help       Print this usage information.
-d, --details    Log additional details about attached devices.

Run "flutter help" to see global options.

real    0m1.062s
user    0m0.921s
sys 0m0.182s

Results from scenario B:

~/tmp/flutterpkg$ time flutter list --help
List all connected devices.

Usage: flutter list [arguments]
-h, --help       Print this usage information.
-d, --details    Log additional details about attached devices.

Run "flutter help" to see global options.

real    0m0.291s
user    0m0.263s
sys 0m0.049s

Perf is definitely a feature for us. Any advice on how to get the pub run case to be as fast as the pub global activate case, that would be a big help. We'd really like to get to a single package that contains our libraries and tools, but the startup time of pub run flutter --list is prohibitive.

~/tmp/flutterpkg$ pub --version
Pub 1.13.0-dev.7.1

Thanks for any advice!

sethladd commented 8 years ago

I think this is the command: https://github.com/flutter/engine/blob/55c6cb7efedc1f16cd31d1fb654aabfd0ca24927/sky/packages/sky/bin/flutter.dart

cc @abarth

nex3 commented 8 years ago

You're not doing anything wrong; there are just different constraints in these two cases that give rise to different performance. When you run a globally-activated application using its named executable, most of the time it's very lightweight: it doesn't invoke pub at all, it just runs a shell script that invokes dart on the executable snapshot.

When you run a dependency package, on the other hand, it does go through pub (which happens any time you type pub on the command line). Even in the best case, this means that the VM has to boot pub (about 180ms on my machine), pub has to locate the package (on my machine this takes another 200ms or so), and run a Dart subprocess on the VM's snapshot. And that's the best case; if the executable you're running is mutable—that is, it depends on a non-hosted, non-git package, including the package you're working on—we can't use the snapshot at all, and it gets a lot slower.

There's some work we can do to make this better. #1011 should help, for example. But a lot of it is just very small stuff that's difficult to avoid or optimize. We could theoretically add some logic to the pub wrapper in the SDK to invoke the executable directly, but that would be a lot of logic written in bash and batch and I certainly don't have the shell language expertise to do it transparently.

I think the best solution here long-term will be to add support for ahead-of-time compilation to the Dart VM (something that's been talked about before). This will (presumably) allow both pub and the dependency's executable to start up much faster.

sethladd commented 8 years ago

Thanks for the thorough explanation!

I wrote a small "what's the fastest path" script that doesn't the basics: figures out what package you are talking about, finds the bin directory, creates the right path, and spawns a process, and runs a script that just prints three lines. It's 0.241s on my mac:

~/tmp/testing/pkg1 $ time dart bin/find_and_run.dart pkg2 runme
from pkg2 1
from pkg2 2
from pkg2 3

real    0m0.241s
user    0m0.239s
sys 0m0.039s

Here's the script:

import 'package:package_config/packages_file.dart' as pkgs;
import 'dart:io';

main(List<String> args) async {
  var packageName = args[0];
  var commandName = args[1];

  var packagesFile = new File('.packages');
  if (!packagesFile.existsSync()) {
    print('did not find a .packages file. run this from the root of your project');
    exit(1);
  }

  var packages = pkgs.parse(packagesFile.readAsBytesSync(), Directory.current.uri);

  var packageLocation = packages[packageName];
  if (packageLocation == null) {
    print('sorry, never heard of $packageName');
    exit(1);
  }

  var filePath = packageLocation.toFilePath();
  filePath = filePath.substring(0, filePath.length - '/lib/'.length);
  var binPath = filePath + '/bin/' + commandName + '.dart';
  if (!new File(binPath).existsSync()) {
    print('sorry, did not find $binPath');
    exit(1);
  }

  var process = await Process.start(Platform.executable, [binPath]);
  stdout.addStream(process.stdout);
  stderr.addStream(process.stderr);
}

Of course, this is cutting a ton of corners because this isn't pub, just a very specific script.

Not sure we'll ever get much faster than that, in the most bestest case.

UNLESS... when installing a dependency, couldn't pub put snapshots of my dependencies into a location relative to my app?

Which, I just looked, and it did!

~/tmp/testing/pkg1 $ ls -l .pub/bin/pkg2/runme.dart.snapshot 
-rw-r--r--  1 sethladd  eng  569 Oct 21 07:59 .pub/bin/pkg2/runme.dart.snapshot

Running that takes only:

~/tmp/testing/pkg1 $ time dart .pub/bin/pkg2/runme.dart.snapshot 
from pkg2 1
from pkg2 2
from pkg2 3

real    0m0.092s
user    0m0.091s
sys 0m0.015s

Which, I dare say, is pretty dang good. Plus, it has the bonus that I don't need to manually manage my package-root (presumably, it would use a .packages).

It seems like we're close to giving our users a very fast experience. If we let them run the snapshots from the .pub dir (and/or, exposed those a little more) they would be really happy.

Has anyone thought of doing that?

nex3 commented 8 years ago

We do run snapshots from the .pub directory. That's exactly what pub run does. The issue is that doing so in full generality, handling all the edge cases correctly, takes time—about 200ms of time.

Running that takes only:

~/tmp/testing/pkg1 $ time dart .pub/bin/pkg2/runme.dart.snapshot 
from pkg2 1
from pkg2 2
from pkg2 3

real  0m0.092s
user  0m0.091s
sys   0m0.015s

Which, I dare say, is pretty dang good. Plus, it has the bonus that I don't need to manually manage my package-root (presumably, it would use a .packages).

We shouldn't get complacent; 90ms is actually pretty huge for a script that does practically nothing. For comparison, a script that does the same thing written in Ruby—not a language known for its speed—takes about 25ms on my machine* with no snapshotting at all.

*: I also checked a snapshotted Dart executable on my machine, and it takes about the same amount of time that it does on yours.

sethladd commented 8 years ago

The issue is that doing so in full generality, handling all the edge cases correctly

True, but I wonder if our case (simply run this bin that came from this dependency that's NOT a path dep) is a common case where we can short-circuit to the fast path (run the snapshot). That is, I think we could tell our developers:

  1. Add flutter to your dependencies
  2. Run pub get
  3. For all your flutter tool needs, run dart .pub/bin/flutter/flutter.dart.snapshot

Granted, that's a long command line and pretty gross, but it's fast. Of course, it would be better if pub put a wrapper script in my project, similar to the way it does during pub global activate. That would be great.

We shouldn't get complacent; 90ms is actually pretty huge for a script that does practically nothing.

That might be true, but that might the fastest way to run Dart app now (apparently?). We should definitely make it faster, though.

takes about 25ms on my machine

Yup, me too.

~/tmp/testing/pkg1 $ time ruby runme.rb 
from pkg2 1
from pkg2 2
from pkg2 3

real    0m0.037s
user    0m0.027s
sys 0m0.008s

Also, FWIW, running without a snapshot isn't much faster for this very small script:

~/tmp/testing/pkg1 $ time dart ../pkg2/bin/runme.dart 
from pkg2 1
from pkg2 2
from pkg2 3

real    0m0.112s
user    0m0.098s
sys 0m0.017s

Talking to Ivan, that's expected.. the size of the script is so small, snapshots don't buy you anything here.

nex3 commented 8 years ago

Users definitely shouldn't be running stuff directly from .pub. It's a hidden directory for a reason; its contents are implementation details and subject to change at any time.

Is performance here a significant problem? If so, we should allocate resources to solve it properly rather than asking users to work around it. Do we have anyone on the team with strong Bash and especially Windows Batch skills who can work with me to avoid invoking pub entirely if a snapshot exists and is up-to-date?

If we do start working on this, I know up front that we'll need dart-lang/sdk#20802 in order to make it work.

sethladd commented 8 years ago

Is performance here a significant problem?

Our users believe it is. It's not a P0 blocker, because we're still making progress, but we're not comfortable with this performance for our users (for a 1.0 product... for a beta product, it might be fine).

Do we have anyone on the team with strong Bash and especially Windows Batch skills who can work with me to avoid invoking pub entirely if a snapshot exists and is up-to-date?

This is, I believe, exactly the script that pub global activate creates. I could be wrong, but that script is resilient to changing Dart SDK versions. I wonder if we have everything we need, over in the pub global activate workflow.

https://github.com/dart-lang/sdk/issues/20802

That would be nice, but it doesn't appear to be a requirement. Our scripts generated by pub global activate seem to be working without that.

May I suggest that we re-open and mark as "help wanted". We can clearly do better with our performance for running scripts that come from dependencies. Maybe we can ask others to help here.

abarth commented 8 years ago

Maybe we need a pub local activate foo that makes a shell script specific to a given command line utility in the local package instead of globally? For example, the workflow could be:

$ pub local activate flutter
$ ./bin/flutter --help

Then pub could manage the bin/flutter shell script in a similar manner to the globally activated shell scripts.

sethladd commented 8 years ago

Maybe we need a pub local activate foo

Yup, that's what I'm thinking. I like that option, because I think we care steal---er, repurpose a lot from the pub global activate code. That's my hope, at least.

nex3 commented 8 years ago

This is, I believe, exactly the script that pub global activate creates. I could be wrong, but that script is resilient to changing Dart SDK versions. I wonder if we have everything we need, over in the pub global activate workflow.

The caching behavior with pub global activate is a lot simpler than the caching behavior we need in pub run—trust me, I wrote both of them :wink:.

For pub global activate, we can generate a bash/batch script that the user can then add to their path. This works because the executables are global—they don't depend on the current package's version constellation, and can co-exist in the global executable namespace largely without confusion. But pub run isn't like that; its executables are local to the current version constellation and even within that are explicitly namespaced.

In addition, pub run invokes the pub command. This is important for the consistency of the user experience: every executable gets invoked in the same way, whether it comes from the local package or a dependency, and no matter what barback mode it's being run in.

We could do something like @abarth suggests and write local scripts to disk, but this has a few problems. One of the biggest is just that it's ugly; it's generally bad form for a tool to incidentally write a bunch of extra user-visible files to disk, and we've heard consistently bad feedback about pub in particular doing so with packages/ directories (and even occasionally .pub). But also, pub just doesn't know what local executables are available. An executable in the local package or a path dependency may have been created or removed since pub get was run, causing the executables to get out of sync. If we just generate executables for cached dependencies or those for which pub local activate was explicitly run, then it's inconsistent with other ways of running executables as well.

This is why I say that if we want to solve this, we should commit the resources to solve it properly—finding some intermediate way of re-using the global activate logic isn't going to create a great user experience, nor is it necessary. We can make the existing pub run commands just as fast as a wrapper script by detecting the common case of invoking an up-to-date, cached executable with no complicated flags and just exec-ing it directly. All we need is someone who knows these scripting languages.

sethladd commented 8 years ago

Thank you for the detailed analysis. I would like to reopen this because we would like to see improved performance for the common-case of pub run. It sounds like we have some potential options, thus it's not unreasonable to improve the performance (with appropriate effort to solve it properly, as you mention).

May I please reopen?

nex3 commented 8 years ago

This issue doesn't doesn't describe a solution for the performance, so I'd rather not reopen it. If you'd like to open a new issue for making pub run circumvent pub in simple cases, feel free, but be aware that it's unlikely to go anywhere until we find someone with shell expertise who's willing to devote some time to working with me on it.

sethladd commented 8 years ago

This issue doesn't doesn't describe a solution for the performance

I was under the impression that I shouldn't be filing issues for solutions (that would presume too much), instead I should be filing issues that describe problems that users are having.

If you want me to file a request for a specific technical solution, I can do that, but I would encourage that description of a technical solution to come from our engineers.

nex3 commented 8 years ago

Okay, I'll file an issue.

sethladd commented 8 years ago

Thanks very much!

On Mon, Oct 26, 2015 at 3:28 PM, Natalie Weizenbaum < notifications@github.com> wrote:

Okay, I'll file an issue.

— Reply to this email directly or view it on GitHub https://github.com/dart-lang/pub/issues/1349#issuecomment-151302900.

nex3 commented 8 years ago

I filed this as dart-lang/sdk#24736 (against the SDK since that's where the wrapper scripts live).