Open KristofferC opened 8 months ago
Oh this looks very cool! Thanks for all the time/effort that's gone into this :star_struck:
One thing I'm slightly concerned about here is the approach taken to making sure that the executables are on the user's PATH on Linux/BSD systems.
- There are more shells than those covered by `get_shell_config_file`, for instance I know multiple people using elvish, nushell (there was mention of this one on the Julia zulip too) as well as xonsh and oil to name a few. Also, isn't the fact we're relying on a hardcoded list a sign that there's something dodgy about this approach?
- With `zsh` the current implementation would fail to work on my machine, since I have no `~/.zshrc`. If the code changed to create a `~/.zshrc` it would be ignored, since I have set my `ZDOTDIR` to `${XDG_CONFIG_HOME:-$HOME/.config}/zsh`.
- Modifying `PATH` in shell rc files doesn't affect non-interactive shell sessions. If we want Julia apps to extend to say graphical apps, this is particularly relevant as usually the launcher process is a child of the desktop environment, and either does not load the shell rc file or only loads it once at login. For this, you also want to potentially modify the shell env/profile/login files. However, changes to those configurations are only loaded at login, so you'd also need to get the user to log out and log back in again for it to take effect.
- With `juliaup`, people have needed write permissions to their shell config in order to successfully install `juliaup`, and with one particular system configuration needing to broaden write permissions to all users.
I see in the design document there is some mention of putting such files in a more standard location already on the path such as `~/.local/bin`, and that Cargo is mentioned in response to this. I think it's worth noting that there is a well-documented series of efforts (like this issue) to make Cargo more XDG-compliant (https://poignardazur.github.io/2023/05/23/platform-compliance-in-cargo/ does a good job outlining this, and describing a path forwards for Cargo). The Cargo discussion can essentially be summed up as "would have been good, but a bit late now".
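To make the `ZDOTDIR` point concrete, here is a minimal sketch of how the location of the zsh rc file would have to be resolved instead of assuming `~/.zshrc`. The function name `zshrc_path` is made up for illustration and is not something in the PR; the only fact relied on is that zsh reads `.zshrc` from `$ZDOTDIR` when that variable is set and from `$HOME` otherwise.

```shell
#!/bin/sh
# Sketch only: zshrc_path is a hypothetical helper, not part of the PR.

# Print the rc file zsh actually reads for interactive shells:
# $ZDOTDIR/.zshrc when ZDOTDIR is set, otherwise $HOME/.zshrc.
zshrc_path() {
    printf '%s\n' "${ZDOTDIR:-$HOME}/.zshrc"
}
```

Even with this lookup, the other points in the list (non-interactive sessions, shells outside the hardcoded list) still apply.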
Other lang's package managers already install things in the XDG-appropriate locations, such as Python with `pip install --user` (new/alternative Python package managers like `poetry` copy this behaviour).
I'd advocate for a `~/.local/bin` approach on Linux/BSD for these reasons. To programmatically determine which executables in `~/.local/bin` are managed by Julia, the executable files could be put inside a Julia-managed directory, and then symlinked to `~/.local/bin`. I think this approach keeps many of the benefits of the custom-bindir added to `PATH` approach while avoiding the major pitfalls.
(NB: when I say `~/.local/bin` I really mean `${JULIA_BIN_DIR:-${XDG_BIN_DIR:-$HOME/.local/bin}}`, but that's a bit of a mouthful)
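A rough sketch of the symlink idea described above, in plain `sh`. Everything named here is an assumption for illustration: the function names, the `JULIA_MANAGED_BIN` variable, and the choice of `~/.julia/bin` as the Julia-managed directory. Only the `${JULIA_BIN_DIR:-${XDG_BIN_DIR:-$HOME/.local/bin}}` fallback chain is taken from the comment itself.

```shell
#!/bin/sh
# Sketch only: all function names and JULIA_MANAGED_BIN are hypothetical.

# Resolve the user-facing bin dir with the fallback chain from the comment.
julia_bin_dir() {
    printf '%s\n' "${JULIA_BIN_DIR:-${XDG_BIN_DIR:-$HOME/.local/bin}}"
}

# Assumed Julia-managed directory that holds the real executables.
julia_managed_dir() {
    printf '%s\n' "${JULIA_MANAGED_BIN:-$HOME/.julia/bin}"
}

# Symlink one managed executable into the user-facing bin dir.
link_app() {
    app=$1
    bindir=$(julia_bin_dir)
    mkdir -p "$bindir"
    ln -sf "$(julia_managed_dir)/$app" "$bindir/$app"
}

# List entries in the bin dir that are symlinks into the managed directory,
# which is how "managed by Julia" can be determined programmatically.
list_julia_apps() {
    bindir=$(julia_bin_dir)
    managed=$(julia_managed_dir)
    for entry in "$bindir"/*; do
        [ -L "$entry" ] || continue
        case $(readlink "$entry") in
            "$managed"/*) basename "$entry" ;;
        esac
    done
}
```

With this layout no shell startup file needs to be touched as long as the resolved bin dir is already on the user's `PATH`.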
Nice work! I'm wondering how apps are shared across Julia versions? E.g. are they isolated by Julia version, like how the global environments are set up?
intended to be run by the user as appname args to app. [..] It’s assumed that Julia is installed and serves as the “driver” to start up the app.
This seems useful, maybe already despite that limitation. Could it be lifted by autoinstalling Julia (runtime, right version) for you if not available? It need not be in the first version.
This is in some ways similar to Python's zipapps (which I believe are not too popular, because the runtime can't be assumed, even on Linux where it's most often preinstalled). Those need a separate .pyz[w] file ending and Python installed (and are in one archive file, optionally compressed):
https://docs.python.org/3/library/zipapp.html
There is no way to say “python X.Y or later”, so be careful of using an exact version like “/usr/bin/env python3.4” as you will need to change your shebang line for users of Python 3.5, for example.
[We already have AppBundler.jl if you want to bundle the runtime. It's best if you can have one way to make an app, and it can then be compiled with PackageCompiler, or use AppBundler, or a combination of those..., or this system.]
One thing I'm slightly concerned about here is the approach taken to making sure that the executables are on the user's PATH on Linux/BSD systems.
With regards to XDG there is an argument that Pkg should follow what Julia itself does. (As you are aware) there is https://github.com/JuliaLang/julia/issues/4630. `juliaup` also uses this method of installing Julia and since juliaup is more or less the official way to install Julia it feels like if you have managed to install Julia itself, this should be fine. So there is a tension here between doing XDG (which some people would argue is the correct way) and fitting in with how things are done everywhere else in Julia and its ecosystem.
A related question, according to XDG where should the `.julia/environments/apps/Package` folder go?
For Windows the Cargo issue comment says:
For Windows, everything should go in ~/appdata/locallow or ~/appdata/local, since ~/.cargo is just a cache, AFAICT. This is FOLDERID_LocalAppData for SHGetKnownFolderPath, CSIDL_LOCAL_APPDATA for SHGetFolderPath, and %LOCALAPPDATA% in the environment.
How is that translated to all the files used here (shims, `AppManifest.toml`, app environments)?
Other lang's package managers already install things in the XDG-appropriate locations, such as Python with pip install --user
I get
❯ pip install --user httpie
Requirement already satisfied: httpie in /Users/kristoffercarlsson/Library/Python/3.9/lib/python/site-packages (3.2.2)
~/Library/Python/3.9/bin
❯ ls
git-filter-repo http httpie https markdown-it pygmentize
nice work! I'm wondering how apps are shared across Julia versions? e.g. are they isolated by Julia versions like how the global environment are setup?
As it is right now each app entry in AppManifest.toml has an absolute path to a Julia installation. If you want to update that Julia version you would also resolve the environment. This ties into this later comment:
Could it be lifted by autoinstalling Julia (runtime, right version) for you if not available? Needs not be in first version.
one plan forward is to use Juliaup to install the Julia installation that the app is currently configured for if it does not exist. That way you would not store the absolute path to the julia installation like that.
With regards to XDG there is an argument that Pkg should follow what Julia itself does.
Right. I basically see Julia as currently being in a similar situation to Cargo — in that by the end of https://github.com/JuliaLang/julia/issues/4630 I think I can fairly summarise the consensus as "yes this would be nice to have, but it's going to be a hassle to start using it".
Much of the value of the XDG Desktop spec comes via a network effect. Thus when the Desktop spec was new and that issue was created in 2013, the benefit was somewhat speculative. Now though, as more tools use and assume XDG compliance, it creates a growing tension between the "Julia way" and the XDG way.
In this sort of light, I see decisions like this as opportunities to choose between digging down and digging out :stuck_out_tongue: somewhat. I still have loose plans to go back to https://github.com/JuliaLang/julia/issues/4630 to see if I can help move the state of affairs closer to XDG compliance (Stefan asked me if I'd be interested in putting a PR together a few months ago, and I am once I have fewer PRs currently open).
Considering the current "Julia way" and the XDG spec, would it not be possible to put things in `~/.julia/bin` as the "Julia-managed directory" that executables are written into, and make symlinks into `~/.local/bin`? I might well be missing something, but it seems to me that this way the current assumptions around `~/.julia/bin` hold but we also get the benefits of using the XDG-appropriate dir as outlined in my first comment.
A related question, according to XDG where should the .julia/environments/apps/Package folder go?
I made a flowchart for answering this sort of question in the BaseDirs.jl docs which might be helpful (it's not 100% accurate, but I didn't want to make it more complicated, and I think it gets 98% of the way).
If we classify `.julia/environments/apps/Package` as:
then Data Home would be the relevant XDG Desktop component (let me know if any of those assumptions don't hold).
More generally, I find `.julia/environments/` a bit interesting in that it's a mix of automatically-changed and user-modified environments. The `v1.x` environments are changed when the user explicitly asks for a package to be installed/removed, and so line up best as "user configuration". However, you also have environments like `__pluto_boot_v2_1.8.5` which are very much not, and probably best classed as user data.
For Windows the Cargo issue comment says:
For Windows, everything should go in ~/appdata/locallow or ~/appdata/local, since ~/.cargo is just a cache, AFAICT. This is FOLDERID_LocalAppData for SHGetKnownFolderPath, CSIDL_LOCAL_APPDATA for SHGetFolderPath, and %LOCALAPPDATA% in the environment.
How is that translated to all the files used here (shims, AppManifest.toml, app environments)?
A while ago I spent an inordinate amount of time looking at the relevant behaviour/specs/comments around directories on Windows/Mac. I think I'd probably be best off pointing you to the comparison table on https://tecosaur.github.io/BaseDirs.jl/stable/defaults/ (and if you want the reasoning/links to some of the most relevant resources: https://tecosaur.github.io/BaseDirs.jl/stable/others/).
Regarding just this part of the comment:
This is FOLDERID_LocalAppData for SHGetKnownFolderPath, CSIDL_LOCAL_APPDATA for SHGetFolderPath, and %LOCALAPPDATA% in the environment.
Yea, getting the right system dirs on windows is actually a bit of a pain. See https://github.com/tecosaur/BaseDirs.jl/blob/main/src/nt.jl for a glimpse of me not having a fun time.
one plan forward is to use Juliaup to install the Julia installation that the app is currently configured for if it does not exist. That way you would not store the absolute path to the julia installation like that.
My plan generally is that the Julia version in a manifest becomes the version selector for Juliaup. Presumably that would work well for apps here too?
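As a rough illustration of "the Julia version in a manifest becomes the version selector for Juliaup": the sketch below reads the top-level `julia_version` entry that Julia writes into `Manifest.toml` and passes it to Juliaup's `julia +channel` launcher syntax. The function names are invented for this example, it assumes the matching channel is already installed, and it is not how Juliaup or the PR implement anything.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical.

# Print the value of the top-level `julia_version = "..."` entry
# of the Manifest.toml given as $1 (empty output if there is none).
manifest_julia_version() {
    sed -n 's/^julia_version[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Start Julia through Juliaup's `+channel` selector using the manifest's
# version, falling back to the default channel if no version is recorded.
run_with_manifest_julia() {
    manifest=$1
    shift
    version=$(manifest_julia_version "$manifest")
    if [ -n "$version" ]; then
        julia "+$version" "$@"
    else
        julia "$@"
    fi
}
```

An app shim could then call something like `run_with_manifest_julia path/to/Manifest.toml --project=... -e '...'` instead of storing an absolute path to a Julia installation.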
What is still needed before this can be merged?
What is still needed before this can be merged?
In case folks haven't seen it - @KristofferC's talk from JuliaCon has a nice summary of the current status and what the open questions still are (or what they were as of a couple of weeks ago). Start at about 6:49:00 here: https://www.youtube.com/live/OQnHyHgs0Qo?si=IVg01oXigQw1JBDH&t=24545
It's great to see work on this continuing, I'd be interested to hear why the support for creating symlinks in the user's local bin-dir on Linux has been removed (requiring `$PATH` shenanigans) though? :confused:
why the support for creating symlinks in the user's local bin-dir on Linux has been removed (requiring $PATH shenanigans) though
I tried to scale back as much as possible that isn't strictly needed to focus my efforts. It can always be added back at a later stage.
I tried to scale back as much as possible that isn't strictly needed to focus my efforts.
Righteo. I do see making modification of `$PATH` a last-resort as rather important, given all the potential complications (incidentally there's a conversation that's just gone on in `#hpc` on Slack about problems with path modification with the `module` HPC application management system).
It can always be added back at a later stage.
If you would like any help doing so, I'd be happy to lend a hand.
I do see making modification of $PATH a last-resort as rather important, given all the potential complications (incidentally there's a conversation that's just gone on in #hpc on Slack about problems with path modification with the module HPC application management system).
That wast
I just realised switching between Python virtual environments (with venv) messes up with environment modules: when you deactivate a venv it restores the PATH at the time when environment was activated, but if in the meantime you loaded a module, then its PATH is gone
? That doesn't seem to apply here, no? For example, I haven't heard people having had much issues with `juliaup` even though it doesn't install in `.local`.
That [wasn't]
To me the main takeaway from this conversation is fragility associated with path modification, which is the core of the `module` + python headache, and comes up a bunch in subsequent messages (e.g. "yeah, messing with PATH is so problematic" - Mose).
That doesn't seem to apply here, no? For example, I haven't heard people having had much issues with juliaup even though it doesn't install in .local.
Issues with the Juliaup approach do come up, a few from a quick search:
I believe Juliaup has started printing out instructions for users to modify the `PATH` themselves in some cases. That said, even if the automatic shell startup file modification occurs without an error, depending on the particular content of the shell file, blindly appended content may not be run. Then we've also got the increasing popularity of more exotic shells...
All in all, this is a can of worms I'd want to keep closed as much as possible.
_Edit: just to mention for fun, the Juliaup approach doesn't work on my system either, but for different reasons again to those I've listed above :upside_down_face:_
Hi Folks, I wanted to weigh in from the perspective of HPC.
If I understand this PR correctly, then the strategy chosen is to control the user environment in such a way that Julia code, Pkg environment, and default entrypoints emulate a user experience similar to a compiled executable.
This is like the approaches taken by Python zipfiles, anaconda, etc. Our experiences in running HPC systems (serving up to 10k users) so far have shown that this approach is:
`$HOME`, and friends tend to live on shared file systems which essentially serialize metadata I/O on application launch.
Basically: we are developing HPC-native container runtimes precisely because the approach chosen in this PR performs poorly for Python. The irony here is that this considerable engineering effort is only necessary because Python can't generate compiled code.
Therefore, I think that the motivation behind this PR -- while well intentioned -- might run a real risk of being harmful to Julia as a High-Productivity HPC language. Especially because efforts to build executable applications appear to be within reach for Julia: https://github.com/JuliaLang/julia/pull/55047. Furthermore, since JIT compilation adds complexity to the container build process, this should prove to be a much more seamless and scalable solution to building Julia applications than Pkg Apps.
Also, I think an approach to Julia applications that is based on compiled executables (which could be placed in any reasonable location on the filesystem) would result in a better user experience (including for non-HPC users). When developing tools to be used by others, I have opted for compiled executables as they don't rely on the user's runtime environment. This PR implicitly promises to support every edge case which a user could configure into their favorite shell, so from a mere user support perspective I think a combination of https://github.com/JuliaLang/julia/pull/55047 + a distribution mechanism would be much easier to maintain.
Let me know what you think. I am happy to contribute some of my time to this.
Citing @Seelengrab , @giordano, and @tecosaur : we had a conversation on Slack that brought this to my attention (this does not imply that they share or endorse my opinion)
Citing @vchuravy, @giordano, and @timholy: we had a conversation on Slack that brought this to my attention (this does not imply that they share or endorse my opinion)
I should point out that you were probably talking to me, not @vchuravy - different Valentin!
Citing @vchuravy, @giordano, and @timholy: we had a conversation on Slack that brought this to my attention (this does not imply that they share or endorse my opinion)
I should point out that you were probably talking to me, not @vchuravy - different Valentin!
Ha! Let me fix the citation! Sorry for the mixup.
And the Tim(othy) who chimed in later is @tecosaur, not @timholy :sweat_smile:
And the Tim(othy) who chimed in later is @tecosaur, not @timholy 😅
Goddammit! I should stop with the late night posts (I hate having incomplete todos when going to bed)
Let's hope there is no @giordano doppelganger also....
If I understand this PR correctly, then the strategy chosen is to control the user environment in such a way that Julia code, Pkg environment, and default entrypoints emulate a user experience similar to a compiled executable.
This is like the approaches taken by Python zipfiles, anaconda, etc. Our experiences in running HPC systems...
First, let me say that I agree with many of the limitations that you mention, and as someone that also works on HPCs (though not at the same level), I'm glad to have people thinking about this stuff.
At the same time, I don't think we want to let the perfect be the enemy of the good here. Julia already has a lot of advantages over Python when it comes to platform independence (e.g. BinaryBuilder), and this PR as it stands has functionality that will be extremely useful in many contexts, even if it's not perfect for HPCs at the moment, requires a bit of extra work for users to modify their own paths, etc.
I agree that we should not rely on Julia managing path stuff, but IIUC, this PR explicitly doesn't do that - it puts stuff in a julia-managed directory and relies on users to deal with it from there. This is how `cargo` does it too, and I've been able to use lots of those programs on my HPC.
I agree we want to move towards a place where Pkg can build binaries and seamlessly integrate them into the system environment, but I think that can be built on top of this, and I for one do not want to wait for that ideal state to get access to this functionality.
Another counter-point to https://github.com/JuliaLang/julia/pull/55047 as an alternative viable solution: not all Julia programs can be made free from dynamic dispatch, yet https://github.com/JuliaLang/julia/pull/55047 requires that for the programs it generates. In particular, if that were our only application deployment solution, then any program which uses Dagger.jl (which contains a dynamic-dispatch based core) would not be able to be deployed as an application, and thus users would be driven away from using Dagger in their applications if it meant that they lost application support by using Dagger. That would be a pretty harmful force to have exist within our ecosystem, as dynamic dispatch serves a very useful purpose, especially when used with care to ensure it doesn't result in unnecessary slowdowns in fast-paths.
To make things more concrete, could concerned HPC developers try out this PR on their system of interest and see where any issues arise? In particular, can we identify any currently-existing pain points in the current implementation that would make it hard for application/package authors to adopt this feature in their application? That would ground the discussion around things that we can clearly identify as issues to be resolved in some way, rather than trying to throw out this idea in its entirety on the basis of known unknowns.
I think this is a really good discussion to have, so thank you @kescobo and @jpsamaroo for your comments.
The discussion involving creating symlinks (which do a number on shared file system metadata servers at scale) and `$PATH` shenanigans (which often run afoul of HPC env management systems like modules) worries me. Cf. @tecosaur's comments above. Anaconda does both, requiring substantial support efforts from the admins. This isn't alarmist: one of the systems I run on requires your environment to be untarred to `/tmp` on the computes!
My main pain points at the moment are:
Cargo doesn't have these problems because Rust builds statically linked (to a point) executables. So you can go and run `cargo install`, then `sbcast` the executable. You can't do the same for Julia. The generally accepted approach to launching many-file applications is containers. Julia has problems with HPC containers for two reasons:
Going forward I therefore recommend:
`PATH` shenanigans then that's a step in the right direction, but I also recommend that apps and their dependencies be relocatable (i.e. they get their own JULIA_DEPOT, or a way to easily traverse the manifest to "extract" any dependencies).
This would allow users to deploy Pkg Apps and relocate them if necessary (e.g. relocating them into a container) to deploy at scale. Testing this could be combined with https://github.com/JuliaLang/julia/issues/53810
The solution described in https://github.com/JuliaLang/julia/pull/55047 is a stronger form.
@jpsamaroo can you explain more? Surely a program that depends on Dagger.jl doesn't constantly update the precompile cache?
I think this discussion is really in the weeds... When using this to install "Julia package apps", this feature is basically just a convenient way for a user to run `foo my args` in the terminal instead of having to install the `Foo` package in a separate environment and running `julia --project=@Foo -e 'using Foo; Foo.main(ARGS)' my args`.
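To illustrate the equivalence described above, here is a minimal sketch of what a launcher for `foo` could look like. `write_shim` and its arguments are hypothetical and this is not the shim the PR actually generates; it only wraps the `julia --project=@Foo -e 'using Foo; Foo.main(ARGS)'` invocation quoted in the comment.

```shell
#!/bin/sh
# Sketch only: write_shim is a hypothetical helper, not what the PR generates.

# Write an executable launcher named after the app that forwards its
# arguments to `julia --project=@Pkg -e 'using Pkg; Pkg.main(ARGS)'`.
write_shim() {
    pkg=$1      # package name, e.g. Foo
    app=$2      # command name, e.g. foo
    dir=$3      # directory the shim is written to
    mkdir -p "$dir"
    # Unquoted heredoc: $pkg is expanded now, \$@ is kept for the shim itself.
    cat > "$dir/$app" <<EOF
#!/bin/sh
exec julia --project=@$pkg -e 'using $pkg; $pkg.main(ARGS)' "\$@"
EOF
    chmod +x "$dir/$app"
}
```

After `write_shim Foo foo "$HOME/.julia/bin"`, running `foo my args` does the same thing as the long command, provided that directory is on `PATH` and a `julia` executable can be found.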
All of the arguments made above apply the same to normal Julia packages (which is obvious since they are basically the same thing).
If you want to talk about static compilation of Julia and these types of things then this is not the right place. This is just a small installer that makes something available to you on the command line. How that thing is implemented is not determined or implemented here. If there are julia packages that can be statically compilable (juliac style) then those can be integrated into this installer.
Stepping out of the weeds for a bit: @KristofferC if that is all, then that's probably fine. I am assuming Pkg Apps won't need:
- Modifications to environment variables (`PATH` and `LD_LIBRARY_PATH`, etc). Optional modifications are fine, I just need to be able to turn them off.
- Assumptions about where the `JULIA_DEPOT` lives (e.g. you won't start loading something like `~/.juliarc` that just has to live in `$HOME` for some reason).
The thing I'm looking for is some sort of assurance that this will not start taking over the shell environment, or limit site customizations (the way anaconda does).
Really important is that `Preferences` still work the same way. E.g.: If I have a `LocalPreferences.toml` in a project higher on the `JULIA_LOAD_PATH`, do Apps still pick that up.
Modifying PATH was one of the things Terence Tao recently pointed out as bothering him about the way Python does things :grimacing:
This is just a small installer that makes something available to you on the command line
I think people above are aware of that - is this the right place to discuss the design of that installer (e.g. that the installer should avoid requiring a modification of PATH for the installed apps to work)?
Optional modifications are fine, I just need to be able to turn them off.
The shims that start up the application (and do have to be in `PATH`) have to go somewhere. That exact location could be e.g. overridable.
Assumptions about where the JULIA_DEPOT lives (e.g. you won't start loading something like ~/.juliarc that just has to live in $HOME for some reason
It will install packages and modify environments etc to `JULIA_DEPOT[1]` just like when installing julia libraries.
Really important is that Preferences still work the same way. E.g.: If I have a LocalPreferences.toml in a project higher on the JULIA_LOAD_PATH, do Apps still pick that up.
I haven't looked into how stacked preferences work. They surely cannot be dependent on having e.g. the `v#.#` environment in your `LOAD_PATH`? That would be awful.
I think people above are aware of that
Comments like
Also, I think an approach to Julia applications that is based on compiled executables (which could be placed in any reasonable location on the filesystem) would result in a better user experience (including for non-HPC users). When developing tools to be used by others, I have opted for compiled executables as they don't rely on the user's runtime environment.
made me doubt that.
The shims that start up the application (and do have to be in PATH) have to go somewhere. That exact location could be e.g. overridable.
As long as it's overridable / controllable by sysadmins that's fine then. We can worry about the mechanisms later (CUDA.jl and MPI.jl have been great at listening to what the HPC community needs, and tweaking their control surfaces accordingly).
It will install packages and modify environments etc to JULIA_DEPOT[1] just like when installing julia libraries.
We already control these, so that's great. I am hearing that we don't need to start rethinking Julia support in this regard.
They surely cannot be dependent on having e.g. the v#.# environment in your LOAD_PATH? That would be awful.
They don't. For example, this is what we do on Perlmutter:
append_path("JULIA_LOAD_PATH",":/global/common/software/nersc/n9/julia/environments/1.9.4/gnu")
And at that location we only put preferences:
blaschke@perlmutter:login25:/global/common/software/nersc/n9/julia/environments/1.9.4/gnu> ls
LocalPreferences.toml Project.toml
blaschke@perlmutter:login25:/global/common/software/nersc/n9/julia/environments/1.9.4/gnu> cat LocalPreferences.toml
[MPIPreferences]
_format = "1.1"
abi = "MPICH"
binary = "system"
cclibs = ["cupti", "cudart", "cuda", "sci_gnu_123_mpi", "sci_gnu_123", "dl", "dsmml", "xpmem"]
libmpi = "libmpi_gnu_123.so"
mpiexec = "srun"
preloads = ["libmpi_gtl_cuda.so"]
preloads_env_switch = "MPICH_GPU_SUPPORT_ENABLED"
[MPICH_jll]
libmpi_path = "/opt/cray/pe/mpich/8.1.28/ofi/gnu/12.3/lib/libmpi.so"
[CUDA_Runtime_jll]
local = "true"
version = "12.2"
So the test here is to see if a Pkg App picks up on any stacked preferences.
> made me doubt that.
You were right to doubt -- my initial sense was that there is going to be a whole new kind of Julia workflow that we would have to support.
This is quite heavily WIP towards having "app" support in Pkg. An app is a program that starts up when you just write its name in the terminal, without explicitly having to invoke Julia, load the package, and call a function. Every app has an isolated environment.
More details of the design can be found in this hackmd: https://hackmd.io/r0sgJar5SpGNomVB8wRP_Q
This PR requires https://github.com/JuliaLang/julia/pull/52103
Here is some example usage:
cc @MasonProtter, @roger-luo