Room to optimize fast path for Haskell shell scripts run via stack?

rrnewton commented 8 years ago

I absolutely love being able to make self contained shell scripts like this:

#!/usr/bin/env stack
-- stack --verbosity silent --resolver lts-3.8 --install-ghc runghc --package turtle --package filemanip --package optparse-applicative
module Main where
import Turtle
main :: IO ()
main = putStrLn "hello world"

But the above hello world takes up to 2 seconds to run on my laptop, even after the first run, where the installs occur. Could we get a description of the fundamental work stack has to do in this scenario? Which parts of it must be slow, and which parts may have room for improvement?

For example, I would hope the fast path would basically say:

is snapshot lts-3.8 there already?
are the requested packages installed already? (3 checks)
go!

And it sounds like those should be O(1) operations, checking only on a few packages.

I'm guessing the current algorithm is more conservative, running some kind of O(N) sanity check over, e.g. all the transitive dependencies? If it's something like that, should we work on a fast or trusted mode where only the constant time checks are done?
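To make the proposed fast path concrete, here is a minimal sketch of the decision it describes. It is not how stack is implemented: `LocalState`, `Decision`, `decide`, and `exampleState` are hypothetical names invented for illustration, and the sketch assumes the set of installed snapshots and packages is already available in memory.

```haskell
module Main where

import qualified Data.Set as Set

-- | Hypothetical view of what is already on disk.
-- This is NOT a real stack data structure.
data LocalState = LocalState
  { installedSnapshots :: Set.Set String
  , installedPackages  :: Set.Set (String, String) -- (snapshot, package)
  }

-- | Either run immediately, or fall back and report what is missing.
data Decision = FastPath | SlowPath [String]
  deriving (Eq, Show)

-- | Look only at the snapshot and the directly requested packages,
-- never at their transitive dependencies.
decide :: LocalState -> String -> [String] -> Decision
decide st snapshot pkgs
  | snapshot `Set.notMember` installedSnapshots st =
      SlowPath ["snapshot " ++ snapshot]
  | null missing = FastPath
  | otherwise    = SlowPath missing
  where
    missing =
      [ p | p <- pkgs, (snapshot, p) `Set.notMember` installedPackages st ]

-- | Example state: the snapshot and two of the three packages are present.
exampleState :: LocalState
exampleState = LocalState
  { installedSnapshots = Set.fromList ["lts-3.8"]
  , installedPackages  = Set.fromList
      [ ("lts-3.8", "turtle"), ("lts-3.8", "filemanip") ]
  }

main :: IO ()
main = do
  print (decide exampleState "lts-3.8" ["turtle", "filemanip"])
  print (decide exampleState "lts-3.8" ["turtle", "optparse-applicative"])
  print (decide exampleState "lts-9.0" ["turtle"])
```

The number of lookups here grows with the number of packages named on the command line, not with the size of the dependency graph, which is the property the "trusted mode" idea relies on.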

rubik commented 8 years ago

Probably somewhat related to #1235.

soenkehahn commented 8 years ago

Note that stack runs scripts like this in interpreted mode, which is inherently slower than running a compiled executable. My guess would be that most of the time is spent loading up ghc and type-checking the script.

There is a replacement for runghc that solves this problem by compiling scripts to machine-code and caching the results: http://hackage.haskell.org/package/runghc. I wonder if we want something like this in stack.

This caching is not strictly better than interpreted mode, however, since compiling the program to machine code can be slower than interpreting it whenever the cache is out of date. This can be quite annoying when developing a script, since you will modify the file constantly and invalidate your cache every time.

rrnewton commented 8 years ago

That hypothesis is easy to test. runghc on the script above takes 400ms on my (wimpy) laptop right now, whereas it still takes 2s under stack.

At the same time, sure, it would be great to transparently compile scripts sometimes. This would certainly help with throughput, and since 400ms is ok but not great, maybe it could help there too. But you'd figure that for stack the latency would be similar to stack build.

mgsloan commented 8 years ago

@rrnewton Are you on the latest stack? 1.1 comes with some performance enhancements, namely, avoiding loading the hackage index when it's unneeded. More performance enhancements in the pipeline :) (replacing binary with something faster)

rrnewton commented 8 years ago

Ah, no I hadn't updated to 1.1. I did just now and do see some perf improvement. That hello-world script drops from 2.0s to 1.5s on this machine.

da-x commented 8 years ago

I'd also be happy to see support for caching compiled exes for Stack scripts! It would be the only way to compete with sub-millisecond bash script execution.

sboosali commented 7 years ago

any update? this seems like a good idea.

rrnewton commented 7 years ago

It looks like things may have already improved substantially. These days, on stack 1.4.0, that hello world script is taking only 700ms. But then again, with time stack --resolver=lts-3.8 runghc hello.hs, it takes only 300ms. So it looks like 400ms are still spent in the "checking" part, before "doing" begins.

rrnewton commented 7 years ago

@da-x for better or worse, even though stack scripts are not full-blown projects (stack.yaml, .cabal, etc), they can import other Haskell files. So that means that if we want to cache exes for scripts, we have to reliably find all the source code imported by the script and include it / hash it when we go to do the cache lookup.
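As a rough illustration of that cache lookup, the sketch below derives a single key from the script plus every local file it imports, so a change to any of them produces a different key. Everything here is an assumption made for the example: `cacheKey` and `fnv1a` are hypothetical helpers, the file list is assumed to have been discovered already (which is the hard part), and the FNV-1a-style hash over code points is a stand-in for whatever digest a real implementation would use.

```haskell
module Main where

import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl', sortOn)
import Data.Word (Word64)
import Numeric (showHex)

-- | FNV-1a-style 64-bit hash folded over code points.
-- Not cryptographic; chosen only to keep the example dependency-free.
fnv1a :: String -> Word64
fnv1a = foldl' step 14695981039346656037
  where
    step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- | Hypothetical cache key: hash of every (path, contents) pair,
-- sorted by path so that discovery order does not matter.
cacheKey :: [(FilePath, String)] -> String
cacheKey files = showHex (fnv1a (concatMap render (sortOn fst files))) ""
  where
    -- Including the length keeps adjacent files from running together.
    render (path, body) =
      path ++ "\0" ++ show (length body) ++ "\0" ++ body

-- | A script and the local module it imports.
sources :: [(FilePath, String)]
sources =
  [ ("hello.hs",  "import Helper\nmain = greet\n")
  , ("Helper.hs", "greet = putStrLn \"hello\"\n")
  ]

-- | Same script, but the imported module has been edited.
editedSources :: [(FilePath, String)]
editedSources =
  [ ("hello.hs",  "import Helper\nmain = greet\n")
  , ("Helper.hs", "greet = putStrLn \"hi\"\n")
  ]

main :: IO ()
main = do
  putStrLn ("key for original sources: " ++ cacheKey sources)
  putStrLn ("key after editing Helper: " ++ cacheKey editedSources)
  -- Listing the same files in a different order yields the same key.
  print (cacheKey sources == cacheKey (reverse sources))
```

A real key would presumably also need to cover the resolver, the package list, and the GHC options, since any of those can change the compiled result.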

sboosali commented 7 years ago

huh, still takes 1.1s for me, then 1.7s when adding a few language extensions and (unused) import statements, to print a "hello". either way, it does seem like only half of the delay is from runhaskell (or runghc? not sure about how they differ), but both halves (say 500ms) are pretty slow (for a script).

https://pastebin.com/JjDBg2et

https://pastebin.com/tXmMv0rr

btw, any other alternatives besides tuning and caching? maybe a daemon, but that adds further complexity.

sboosali commented 7 years ago

anyways, i might also try nix for reproducible (haskell) scripts, but idk what it (nix-shell?) caches or its load speed.

crocket commented 7 years ago

-- stack --resolver lts-9.0 script --package turtle --optimize

stack script --optimize seems to add a 400ms delay to compiled executables.

commercialhaskell / stack

Room to optimize fast path for Haskell shell scripts run via stack? #1330