JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Shipped binaries are missing some codesigning on Windows & MacOS - Causing slow first launch #54366

Closed: davidanthoff closed this issue 1 month ago

davidanthoff commented 5 months ago

There is a delay of more than 14 seconds when I press ] to enter package mode on a new install. Here is a video that shows the timing: https://www.youtube.com/watch?v=a7PU7C1rMvw.

One strange thing is that on subsequent attempts (even after a reboot of the computer) the delay is almost entirely gone. It is not entirely straightforward to reproduce this, but uninstalling 1.11 via juliaup rm 1.11 and then rebooting the computer mostly gets my system back into a state where the delay is present. The system where this is happening is a new desktop computer with essentially top specs for everything. We (@KristofferC, @IanButterworth and I) discussed this on Slack, where one hypothesis is that this is somehow related to Windows Defender scanning files on first load and caching the results of that somehow... If that is true, maybe https://github.com/JuliaLang/julia/issues/54365 might help. While I'm not sure, Windows Defender might take less time to check files that have proper digital signatures on them.

KristofferC commented 5 months ago

Someone that can repro this could probably get it in a profiler and see if it is waiting on some syscall or something.

KristofferC commented 5 months ago

Also, is it really entering package mode that is slow or is it using Pkg?

KristofferC commented 5 months ago

I think this is too vague to be on the milestone. Also, in the video it seems to take 25 seconds to even start julia, so I would not say package mode is the problematic thing here.

IanButterworth commented 4 months ago

On MacOS I'm noticing a lot more delay in first Pkg switch and first completion hint in pkg on nightlies via juliaup vs. julia built locally.

First pkg switch was like 4s via juliaup, ~0.3s via local build. In a new session after that they're about the same at ~0.3s.

So I guess MacOS also does some system check on dylibs on first execution too.
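To put numbers on the first-launch versus later-launch gap described above, a small timing harness can help. This is only a sketch: the function name `time_launches` is made up for illustration, and the idea of timing `using Pkg` in a fresh process as a stand-in for the first Pkg switch is an assumption, not something measured in this thread.

```python
import subprocess
import time


def time_launches(cmd, runs=3):
    """Run `cmd` in `runs` fresh processes and return wall-clock seconds for each."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so terminal rendering does not skew the timing.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations
```

Assuming `julia` is on PATH, `time_launches(["julia", "--startup-file=no", "-e", "using Pkg"])` run right after a fresh install should show whether only the first entry is slow.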

@davidanthoff I think it might make sense for juliaup to handle this kind of first execution system check. So launch julia and load all stdlibs after installation?


@KristofferC pointed to this, which could be the MacOS issue: https://news.ycombinator.com/item?id=23273247. Perhaps a solution would be what's discussed here: https://news.ycombinator.com/item?id=23273396

davidanthoff commented 4 months ago

Are we stapling that notarization stuff to the files in the tarball that Juliaup is using for MacOS?

My gut sense right now is that we first should just make sure that the archives that Juliaup is using have all the code signing/notarization that can be done, then check whether we still have performance problems, and if so, we can still think about more stuff that Juliaup could do.

IanButterworth commented 4 months ago

Side note, I noticed that on MacOS if you have a slow but active internet connection then precompilation, including building stdlibs, is really slow because MacOS is trying and failing to do the stuff mentioned here https://news.ycombinator.com/item?id=23273247

IanButterworth commented 3 months ago

I've adjusted the title to cover the issue that I think we should fix before 1.11 is released. First impressions matter, and on the latest 1.11 rc1 via juliaup on MacOS the Pkg switch is really slow, and it sounds like it can be a lot worse on Windows.

IanButterworth commented 2 months ago

On MacOS it seems the issue is that we don't notarize the .tar.gz archive which Juliaup uses, just the .dmg.

If I install via the .dmg I see no delay in startup nor first Pkg switch.

With the .tar.gz I see a slightly slow initial startup, then ~5-10s for first pkg switch.

https://github.com/JuliaLang/julia-buildkite/blob/dea1b33fba992d460ffae69bce54f598330fb0d3/utilities/upload_julia.sh#L45-L53

then

https://github.com/JuliaLang/julia-buildkite/blob/dea1b33fba992d460ffae69bce54f598330fb0d3/utilities/macos/build_dmg.sh#L61-L69

@davidanthoff could it be something similar for Windows?

davidanthoff commented 2 months ago

Maybe? There seem to be a lot of possibilities ;)

I just checked, and 1.11-rc1 as installed via Juliaup has julia.exe signed, but not for example sys.dll. But I also don't know whether Windows checks signatures on stuff like sys.dll. But my gut feeling is that it can't hurt to just sign all dlls that are inside that tarball.

I should also say julialauncher.exe and juliaup.exe are also not signed (at least the shipping versions). So, if this is a problem, then any startup delay could also come from that, while I would assume that delays like switching into package mode are probably unrelated to that? On Juliaup main I now try to sign juliaup.exe and julialauncher.exe, but not sure how successful that is at the moment...

Just for this issue, it probably makes sense to directly launch the Julia binary from the ~/.julia/juliaup folder, rather than use the Juliaup launcher, just to isolate things a bit.
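A minimal sketch of locating the binaries to launch directly, bypassing the launcher. It assumes the layout `<juliaup folder>/<version folder>/bin/julia` (or `julia.exe`), which matches the paths that show up elsewhere in this thread but is not a documented guarantee; `find_julia_binaries` is a hypothetical helper name.

```python
from pathlib import Path


def find_julia_binaries(juliaup_dir):
    """Return julia executables found at <juliaup_dir>/<version>/bin/, sorted by path."""
    root = Path(juliaup_dir)
    found = []
    for name in ("julia", "julia.exe"):
        # One directory level per installed version, then its bin folder.
        found.extend(p for p in root.glob(f"*/bin/{name}") if p.is_file())
    return sorted(found)
```

For example, `find_julia_binaries(Path.home() / ".julia" / "juliaup")` lists candidates that can then be started by full path, so any launcher overhead is taken out of the measurement.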

IanButterworth commented 2 months ago

Regarding MacOS I'm trying https://github.com/JuliaCI/julia-buildkite/pull/369 but I don't have high hopes because it doesn't seem possible based on online discussion to staple a notarization to a .tar.gz

Would it be reasonable for juliaup to use the .dmg on MacOS?

davidanthoff commented 2 months ago

I don't know much about MacOS, but isn't there some way to attach the notarization to the actual binary file? Why would we notarize the tarball? That is only ever seen by juliaup; it essentially downloads it, then immediately extracts it, and only writes the extracted files to disk...

IanButterworth commented 2 months ago

The docs are a little unclear on this, but I had thought somehow MacOS was retaining knowledge about the uncompressed files after untar-ing.

Regarding the executables specifically, from what I gather it's not possible to "staple" the notarization ticket to the executable. So Gatekeeper will do an internet check for the ticket on first load (which is what's happening).

The only way to avoid the slow online check is to staple it to the .dmg, which MacOS handles specially when unpacking it. So Juliaup needs to use the .dmg to avoid the check & delay.
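Whether a given artifact actually carries a stapled ticket can be checked with Apple's `stapler validate` (see the man page referenced below). The wrapper here is a sketch: the helper names are invented, and it assumes that a zero exit code means a ticket was found and that `xcrun` is available on the machine.

```python
import platform
import subprocess


def stapler_validate_command(path):
    """Build the command line for checking a stapled notarization ticket."""
    return ["xcrun", "stapler", "validate", str(path)]


def has_stapled_ticket(path):
    """Return True/False from `stapler validate`, or None if it cannot be run."""
    if platform.system() != "Darwin":
        return None  # stapler only exists on macOS
    try:
        result = subprocess.run(stapler_validate_command(path),
                                capture_output=True, text=True)
    except FileNotFoundError:
        return None  # xcrun is not installed
    # Assumption: exit code 0 means a valid ticket is stapled to `path`.
    return result.returncode == 0
```

Running `has_stapled_ticket` on the .dmg and on an executable extracted from the .tar.gz would show the difference described above, if the assumptions hold.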

@staticfloat does that sound right to you?


Some references https://forums.developer.apple.com/forums/thread/114961 https://keith.github.io/xcode-man-pages/stapler.1.html

DilumAluthge commented 2 months ago

> I think it might make sense for juliaup to handle this kind of first execution system check. So launch julia and load all stdlibs after installation?

This seems like the easiest approach, right?

davidanthoff commented 2 months ago

> This seems like the easiest approach, right?

Yes, after reading around, I agree. All we really need to do is spawn a julia process that loads everything right after we unpacked, right?
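A rough sketch of what such a warm-up step could look like. This is not juliaup's implementation (juliaup is not written in Python); the helper names are hypothetical, and it assumes stdlibs live under `share/julia/stdlib/v*/` inside the unpacked folder and that every directory there can be loaded with `using`.

```python
from pathlib import Path


def stdlib_names(install_dir):
    """List stdlib package names under <install_dir>/share/julia/stdlib/v*/."""
    names = set()
    for version_dir in Path(install_dir).glob("share/julia/stdlib/v*"):
        names.update(p.name for p in version_dir.iterdir() if p.is_dir())
    return sorted(names)


def warmup_command(install_dir, julia_exe="julia"):
    """Build a command that loads every stdlib once in a throwaway process."""
    names = stdlib_names(install_dir)
    binary = Path(install_dir) / "bin" / julia_exe
    # An empty program is still valid if no stdlibs were found.
    code = "using " + ", ".join(names) if names else ""
    return [str(binary), "--startup-file=no", "-e", code]
```

The installer would pass the result to something like `subprocess.run` right after unpacking (with `julia_exe="julia.exe"` on Windows), so the operating system's first-load checks happen before the user opens a REPL.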

IanButterworth commented 2 months ago

It doesn't save the user time but it does hide it in the juliaup process, and makes first julia impression better so it would be an improvement, yeah.

davidanthoff commented 2 months ago

Another benefit is that by definition there should be a working internet connection when a new version is added.

davidanthoff commented 2 months ago

I just checked, and the rc2 Windows binaries that we get via Juliaup still don't have any of the dlls code signed. Neither the ones in the bin folder, nor the sysimage sys.dll. I don't know whether that would really make a difference, but it seems a very simple first step to try and something we should do no matter what, IMO. I don't know who/how to do that, though... @staticfloat would presumably know? Or @DilumAluthge ?
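For auditing a whole install folder rather than checking files one by one, the PE header can be inspected directly: a signed image has a non-empty certificate table entry (data directory index 4). The sketch below only detects that an embedded signature is present; it does not verify it, and it would miss catalog-based signing. The function names are made up, and reading just the first 4096 bytes assumes the PE headers sit within that range.

```python
import struct
from pathlib import Path

SECURITY_DIRECTORY_INDEX = 4  # certificate table slot in the PE data directories


def has_embedded_signature(data):
    """Return True if the PE header bytes declare a non-empty certificate table."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return False
    optional = pe_offset + 4 + 20  # skip PE signature and 20-byte COFF header
    if len(data) < optional + 2:
        return False
    (magic,) = struct.unpack_from("<H", data, optional)
    if magic == 0x10B:    # PE32
        directories = optional + 96
    elif magic == 0x20B:  # PE32+
        directories = optional + 112
    else:
        return False
    entry = directories + 8 * SECURITY_DIRECTORY_INDEX
    if len(data) < entry + 8:
        return False
    (count,) = struct.unpack_from("<I", data, directories - 4)
    _offset, size = struct.unpack_from("<II", data, entry)
    return count > SECURITY_DIRECTORY_INDEX and size > 0


def unsigned_binaries(install_dir):
    """List .exe and .dll files under install_dir with no embedded signature."""
    result = []
    for path in sorted(Path(install_dir).rglob("*")):
        if path.is_file() and path.suffix.lower() in {".exe", ".dll"}:
            with path.open("rb") as handle:
                header = handle.read(4096)  # avoids loading a large sys.dll fully
            if not has_embedded_signature(header):
                result.append(path)
    return result
```

Pointing `unsigned_binaries` at an unpacked Julia folder gives a quick list to compare between release candidates.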

IanButterworth commented 2 months ago

https://github.com/JuliaCI/julia-buildkite/blob/747ec36f50e816871364680cbd442b4e14632a2a/utilities/upload_julia.sh#L73-L85

staticfloat commented 2 months ago

I've been trying to do something in https://github.com/JuliaCI/julia-buildkite/pull/374 but I'm getting errors from signtool. I'm running commands such as:

signtool sign /debug /fd SHA1 /f 'C:\workdir\.buildkite\secrets\windows_codesigning.pfx' /p '[REDACTED]' /t 'http://timestamp.digicert.com/?alg=sha1' julia-0ef8a91e49/bin/julia.exe

but it raises errors like:

SignTool Error: No file digest algorithm specified. Please specify the digest algorithm with the /fd flag. Using /fd SHA256 is recommended and more secure than SHA1. Calling signtool with /fd sha1 is equivalent to the previous behavior. In order to select the hash algorithm used in the signing certificate's signature, use the /fd certHash option.

I tried changing it to /fd certHash, but that didn't help. I'm kind of at a loss as to what's going on here.

davidanthoff commented 2 months ago

Hm, I guess the other question is how much effort to put into signtool if we are going to switch over to the trusted signing option soon...

davidanthoff commented 1 month ago

Is there an easy way for us to test whether the signed Windows binaries make a difference? Cutting a new rc would be one way, but that is a pretty involved process, right?

DilumAluthge commented 1 month ago

We codesign nightlies, so you should just be able to grab the Windows nightly from the Buildkite page for the latest commit on master.

IanButterworth commented 1 month ago

Regarding macOS, I am concluding that we need to ship the .dmg via juliaup because it's properly notarized. See https://github.com/JuliaLang/juliaup/pull/1006#issuecomment-2283691291

davidanthoff commented 1 month ago

Nightly seems a bit broken, I'm getting errors when I'm just trying to enter the REPL mode, so I can't really test whether this helped or not...

KristofferC commented 1 month ago

> I'm getting errors when I'm just trying to enter the REPL mode

Elaborate please....

davidanthoff commented 1 month ago

> julia +nightly
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.12.0-DEV.1040 (2024-08-12)
 _/ |\__'_|_|_|\__'_|  |  Commit cf4c30accd (0 days old master)
|__/                   |

(@v1.12) pkg> Unhandled Task ERROR: ArgumentError: Package REPLExt [e5eb5ef1-03cf-53a7-ae1d-5a66b08e832b] is required but does not seem to be installed:
 - Run `Pkg.instantiate()` to install all recorded dependencies.

Stacktrace:
  [1] _require(pkg::Base.PkgId, env::Nothing)
    @ Base .\loading.jl:2413
  [2] __require_prelocked(uuidkey::Base.PkgId, env::Nothing)
    @ Base .\loading.jl:2291
  [3] #invoke_in_world#3
    @ .\essentials.jl:1082 [inlined]
  [4] invoke_in_world
    @ .\essentials.jl:1079 [inlined]
  [5] _require_prelocked
    @ .\loading.jl:2278 [inlined]
  [6] _require_prelocked
    @ .\loading.jl:2277 [inlined]
  [7] macro expansion
    @ .\loading.jl:2577 [inlined]
  [8] macro expansion
    @ .\lock.jl:273 [inlined]
  [9] require_stdlib(package_uuidkey::Base.PkgId, ext::String)
    @ Base .\loading.jl:2532
 [10] macro expansion
    @ C:\Users\david\.julia\juliaup\julia-nightly\share\julia\stdlib\v1.12\REPL\src\Pkg_beforeload.jl:8 [inlined]
 [11] macro expansion
    @ .\lock.jl:273 [inlined]
 [12] load_pkg()
    @ REPL C:\Users\david\.julia\juliaup\julia-nightly\share\julia\stdlib\v1.12\REPL\src\Pkg_beforeload.jl:7
 [13] (::REPL.var"#123#141"{REPL.LineEdit.MIState, REPL.LineEditREPL, REPL.LineEdit.Prompt})()
    @ REPL C:\Users\david\.julia\juliaup\julia-nightly\share\julia\stdlib\v1.12\REPL\src\REPL.jl:1406
┌ Error: Error in the keymap
│   exception =
│    ArgumentError: Package REPLExt [e5eb5ef1-03cf-53a7-ae1d-5a66b08e832b] is required but does not seem to be installed:
│     - Run `Pkg.instantiate()` to install all recorded dependencies.
│
│    Stacktrace:
│      [1] _require(pkg::Base.PkgId, env::Nothing)
│        @ Base .\loading.jl:2413
│      [2] __require_prelocked(uuidkey::Base.PkgId, env::Nothing)
│        @ Base .\loading.jl:2291
│      [3] #invoke_in_world#3
│        @ .\essentials.jl:1082 [inlined]
│      [4] invoke_in_world
│        @ .\essentials.jl:1079 [inlined]
│      [5] _require_prelocked
│        @ .\loading.jl:2278 [inlined]
│      [6] _require_prelocked
│        @ .\loading.jl:2277 [inlined]
│      [7] macro expansion
│        @ .\loading.jl:2577 [inlined]
│      [8] macro expansion
│        @ .\lock.jl:273 [inlined]
│      [9] require_stdlib(package_uuidkey::Base.PkgId, ext::String)
│        @ Base .\loading.jl:2532
│     [10] macro expansion
│        @ C:\Users\david\.julia\juliaup\julia-nightly\share\julia\stdlib\v1.12\REPL\src\Pkg_beforeload.jl:8 [inlined]
│     [11] macro expansion
│        @ .\lock.jl:273 [inlined]
│     [12] load_pkg()
│        @ REPL C:\Users\david\.julia\juliaup\julia-nightly\share\julia\stdlib\v1.12\REPL\src\Pkg_beforeload.jl:7
│     [13] (::REPL.var"#113#131"{REPL.LineEditREPL})(s::REPL.LineEdit.MIState)
│        @ REPL C:\Users\david\.julia\juliaup\julia-nightly\share\julia\stdlib\v1.12\REPL\src\REPL.jl:1311
│     [14] on_enter(s::REPL.LineEdit.MIState)
│        @ REPL.LineEdit C:\Users\david\.julia\juliaup\julia-nightly\share\julia\stdlib\v1.12\REPL\src\LineEdit.jl:2332
│     [15] (::REPL.LineEdit.var"#120#176")(::REPL.LineEdit.MIState, ::Any, ::Vararg{Any})
│        @ REPL.LineEdit C:\Users\david\.julia\juliaup\julia-nightly\share\julia\stdlib\v1.12\REPL\src\LineEdit.jl:2482
│     [16] #invokelatest#2
│        @ .\essentials.jl:1048 [inlined]
│     [17] invokelatest
│        @ .\essentials.jl:1045 [inlined]
│     [18] (::REPL.LineEdit.var"#30#31"{REPL.LineEdit.var"#120#176", String})(s::Any, p::Any)
│        @ REPL.LineEdit C:\Users\david\.julia\juliaup\julia-nightly\share\julia\stdlib\v1.12\REPL\src\LineEdit.jl:1723
│     [19] macro expansion
│        @ C:\Users\david\.julia\juliaup\julia-nightly\share\julia\stdlib\v1.12\REPL\src\LineEdit.jl:2873 [inlined]
│     [20] macro expansion
│        @ .\lock.jl:273 [inlined]
│     [21] (::REPL.LineEdit.var"#282#284"{REPL.Terminals.TTYTerminal, REPL.LineEdit.ModalInterface, REPL.LineEdit.MIState, ReentrantLock, REPL.LineEdit.Prompt})()
│        @ REPL.LineEdit C:\Users\david\.julia\juliaup\julia-nightly\share\julia\stdlib\v1.12\REPL\src\LineEdit.jl:2863
└ @ REPL.LineEdit C:\Users\david\.julia\juliaup\julia-nightly\share\julia\stdlib\v1.12\REPL\src\LineEdit.jl:2875

KristofferC commented 1 month ago

With the juliaup notarization fix on 1.11 and the signing of the Windows libraries, I will remove this from the milestone.

IanButterworth commented 1 month ago

Sounds good. One minor detail: we would also need a proper release of juliaup. It's currently just a prerelease.

davidanthoff commented 1 month ago

> With the juliaup notarization fix on 1.11 and the signing of the windows libraries I will remove this from the milestone.

Well, we have no idea whether these fixes actually help with the problem or not, certainly not on Windows, where the signing was pure speculation. I think if the idea of putting this on the milestone was that we want to fix the problem, then we should first validate whether the signing actually solved the problem or not.

I would try to validate this, but nightly is broken for me and I don't think I have access to any other fully signed Windows build.

IanButterworth commented 1 month ago

Try deleting your default env manifest?

davidanthoff commented 1 month ago

There is none for 1.12 on my system.

davidanthoff commented 1 month ago

I just tried this with 1.11-rc3, and there was no delay whatsoever on Windows!

I think we can rule out any caching for my experiment: before I ran juliaup up to get rc3, I booted my system, then I just directly ran juliaup up, and then julia +rc. This was the first time the rc3 bits ever touched my system. Everything was fast with no delays.

I guess we'll never really know what fixed things, but we now have both the Juliaup binaries and all the Julia binaries (including dlls and in particular the sys.dll) signed on Windows, so maybe that really did make all the difference.

@IanButterworth can you confirm that with rc3 things are also smooth on Mac? In which case we could close this issue, I'd say.

IanButterworth commented 1 month ago

Yep, after the notarization check Pkg loaded fast first time.

Hopefully people will be updated to the latest juliaup before trying out 1.11.0

davidanthoff commented 1 month ago

> Hopefully people will be updated to the latest juliaup before trying out 1.11.0

Juliaup auto-updates aggressively. I think the vast majority of folks should get any new version within 24 hours after I push it out, and this version of it was published 1-2 weeks ago, so I think that will easily work out.