JuliaLang / Pkg.jl

Pkg - Package manager for the Julia programming language
https://pkgdocs.julialang.org

Applications: what are they? #1962

Open StefanKarpinski opened 4 years ago

StefanKarpinski commented 4 years ago

I want to start an official discussion of what makes a project an application here.

Roger-luo commented 4 years ago

I think this is a bit related to what I was working on. CLIs can be one kind of application, since in principle they can be built in a standalone environment using PackageCompiler as an app and then installed to .julia/bin. But CLIs don't have to contain a bin folder in the project folder, since that can be generated. It'd be nice if Pkg could manage this use case (both install and uninstall).

KristofferC commented 4 years ago

Posting this discussion here: https://discourse.julialang.org/t/julia-bin-as-a-standard-location-for-scripts/45993.

Some quick thoughts/ideas. The exact names of things are not that important.

  1. An application is a Julia project (note, not a package) that comes with bin/foo_1.jl, bin/foo_2.jl files that all define a main(args::Vector{String}) function.
  2. The project file should list all the "entry points" like apps = ["foo_1", "foo_2"]. These are the names of how a user would run the app (./foo_1) and correspond to the file in bin. Each app can therefore define multiple "executables".
  3. An application is required to have a Manifest file. The resolver is in fact unused when it comes to installing applications.
  4. Because of point 3, an application does not have any compat info in the registry.
  5. An application is installed with app add App.
  6. Similar commands like app status and app rm are available.
  7. The currently active project is irrelevant for apps since they are self-contained and run in their own process.
  8. Upon installation, the app is put into apps/$slug (or maybe we should just keep using the packages directory) and for each entry in apps a wrapper file is put into .julia/bin that will run the app. For example .julia/bin/foo_1 will start julia and pass the arguments to apps/$slug/bin/foo_1.jl. People are recommended to add .julia/bin to their PATH (perhaps Pkg should have some convenience tool to add that to .bashrc etc.)
  9. You cannot depend on an app.
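
As a rough illustration of point 8, here is a minimal sketch of how the wrapper files could be generated. The helper names (wrapper_script, install_wrappers) and the example paths are hypothetical and are not part of Pkg; refusing to overwrite an existing wrapper is just one possible answer to the name-collision question.

```julia
# Hypothetical helper: build the text of a POSIX shell wrapper for one entry point.
# `julia_exe`, `app_dir`, and `entry` are illustrative names, not Pkg API.
function wrapper_script(julia_exe::AbstractString, app_dir::AbstractString, entry::AbstractString)
    script = joinpath(app_dir, "bin", entry * ".jl")
    return """
    #!/bin/sh
    # Run the app in its own project so the caller's active project is irrelevant.
    exec "$julia_exe" --startup-file=no --project="$app_dir" "$script" "\$@"
    """
end

# Hypothetical helper: write one executable wrapper per entry point into `bin_dir`.
function install_wrappers(bin_dir, julia_exe, app_dir, entries)
    mkpath(bin_dir)
    installed = String[]
    for entry in entries
        dest = joinpath(bin_dir, entry)
        # Refuse to overwrite: the name-collision policy is still an open question.
        ispath(dest) && error("wrapper `$entry` already exists in $bin_dir")
        write(dest, wrapper_script(julia_exe, app_dir, entry))
        chmod(dest, 0o755)
        push!(installed, dest)
    end
    return installed
end

# Demo in a temporary directory so nothing touches a real ~/.julia/bin.
bin_dir = joinpath(mktempdir(), "bin")
paths = install_wrappers(bin_dir, "julia", "/tmp/apps/abc12", ["foo_1", "foo_2"])
```

The wrapper uses exec so that the shell process is replaced by julia and the app's exit code is passed through unchanged.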

Questions:

  1. What happens with name collisions in .julia/bin?
  2. How do you "dev" an app? Just clone it and then you manually run julia bin/foo_1.jl?
  3. How do we GC?
  4. What julia version is used to run all the apps? Perhaps just julia by default overridable by an env and an explicit argument to the app?
  5. Should there be some way to pass standard julia arguments to the app (like --optimize, and --check-bounds)?
  6. How should Julia compat be handled if there is only one manifest? Maybe that is too strict and we should in fact resolve dependencies for the app?

aviks commented 4 years ago

I've been thinking about this for a while, so herewith some comments. I agree with most of https://github.com/JuliaLang/Pkg.jl/issues/1962#issuecomment-686447390 except:

Some other random comments

KristofferC commented 4 years ago

For example, consider an image editor that runs plugins. A simpler but more concrete example is GameZero, which behaves as an app in that it has a command line game runner. But the game it runs is Julia code, with its own set of dependencies.

Maybe there should be a library version of GameZero then, like GameZeroLib, and eventually a GameZero application. I think that running apps from the terminal while at the same time having them be bound to some external environment will be very confusing. Like, if we do things correctly, you aren't even supposed to know that you are running Julia code.

fredrikekre commented 4 years ago

I was actually thinking about this this morning, funny coincidence. This is what I had in mind, and what I sketched up now:

$ cat MyApp/bin/myapp.jl 
println("""Welcome.
    This is myapp running with $(LOAD_PATH[1])
    as the environment with the following version of Example.jl:
    $(read(pipeline(`pkg st`, `grep Example`), String))""")

$ julia -e 'using Applications; Applications.install("MyApp")'
[ Info: Installed `foo` to /home/fredrik/.julia/bin
[ Info: Installed `myapp` to /home/fredrik/.julia/bin

$ myapp
Welcome.
This is myapp running with /home/fredrik/.julia/scratchspaces/043fa9b4-a959-4035-b28b-58282c1da903/c8bb88aa-bf07-49d0-bccd-10f9f1c5c30c/0.1.0
as the environment with the following version of Example.jl:
  [7876af07] Example v0.5.1

whereas in the regular env

$ pkg st | grep Example
  [7876af07] Example v0.5.3

Edit: version 2:

$ myapp
Welcome.
This is myapp running with /home/fredrik/.julia/packages/MyApp/qaAkC
as the environment with the following version of Example.jl:
  [7876af07] Example v0.5.1

ffevotte commented 4 years ago

I entirely agree with Kristoffer's comment above.

Some thoughts about the remaining questions:


  • How do you "dev" an app? Just clone it and then you manually run julia bin/foo_1.jl?

I'd say yes, just clone it. Then with the minimal interface defined above:

julia -L bin/foo_1.jl -e 'main(ARGS)' -- arg1 arg2 arg3

Or possibly

julia bin/foo_1.jl -- arg1 arg2 arg3

if the scripts in bin additionally run main(ARGS) automatically when they are run as the main script.
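
A minimal sketch of such a script, assuming the main(args::Vector{String}) interface proposed above. The file name and the printed message are illustrative only; comparing PROGRAM_FILE with @__FILE__ is the usual Julia idiom for "am I the main script?".

```julia
# bin/foo_1.jl (hypothetical entry-point script)

# Entry point with the signature proposed above; returns an exit status.
function main(args::Vector{String})
    println("foo_1 called with ", length(args), " argument(s): ", join(args, " "))
    return 0
end

# Run `main` only when this file is the script passed to `julia`,
# not when it is loaded with `include` or `julia -L`.
if abspath(PROGRAM_FILE) == @__FILE__
    main(ARGS)
end
```

With this guard, loading the file via -L only defines main, so the first invocation style keeps working too.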


  • Should there be some way to pass standard julia arguments to the app (like --optimize, and --check-bounds).

I think this would be useful. Perhaps via environment variables? Such a feature would also be a way to support sysimages without too much effort from Pkg's side.

A related question is whether there would be a way for the app itself to control which command-line arguments are used for the Julia process that runs it. Apps could perhaps declare a preferred (possibly overridable) set of arguments in Project.toml.


  • What julia version is used to run all the apps? Perhaps just julia by default overridable by an env and an explicit argument to the app?

I don't have any sensible answer to propose here, but just wanted to mention that this question is very important if we want to allow "executables" to support custom sysimages.

I guess what makes this particularly difficult is that some users delete older julia versions when they install a newer one. Others don't, but update a set of symlinks in their PATH so that older versions remain accessible under another name...

Maybe a minimal way to handle julia upgrades would be to provide a way for users to re-generate all wrappers in ~/.julia/bin (or at least check whether apps and their dependencies are still compatible with the newly installed julia version).

Roger-luo commented 4 years ago

I recently have been building some CLIs which I think are also "Applications". I agree with most of the comments above, but the current solution I'm using now is a bit different from what @KristofferC described, which might be interesting to consider.

An application is a Julia project (note, not a package) that comes with bin/foo_1.jl, bin/foo_2.jl files that all define a main(args::Vector{String}) function.

In my case, the CLI projects are generated by the DSL Comonicon. It generates an entry function command_main in the package's module, and then generates two scripts in .julia/bin. This allows Julia to precompile the command_main(::Vector{String}) function, so for simple CLIs we can have almost zero extra start-up time.

  1. command_name.jl: this file simply runs using XXX; XXX.command_main()
  2. command_name: this is a shell script that configures how the Julia compiler is called, with some optimization options. A glance at what it may look like:

#!/bin/sh
JULIA_PROJECT=/Users/roger/julia_code/IonCLI /Applications/Julia-1.5.app/Contents/Resources/julia/bin/julia \
    -J/Users/roger/julia_code/IonCLI/deps/lib/libion.dylib \
    --compile=min \
    -O2 \
    --startup-file=no \
    -- /Users/roger/.julia/bin/ion.jl "$@"

I guess potentially, I can also generate an entry file in bin/command_name.jl, but would be nice if I don't need to.

perhaps Pkg should have some convenience tool to add that to .bashrc etc

I implemented this in Comonicon: https://github.com/Roger-luo/Comonicon.jl/blob/master/src/tools/build.jl#L559 in case someone wants it. But I find it's hard to cover all kinds of shells.

Regarding environments, I'm currently using the package environment directly with a committed Manifest.toml, I find this works pretty well in general (except https://github.com/JuliaLang/PackageCompiler.jl/issues/438, but it's not related)

Should there be some way to pass standard julia arguments to the app (like --optimize, and --check-bounds).

I think allowing developers to configure this in Project.toml would be nice (though environment variables can always be a lower-priority option), since it makes building the system image easier in CI and more consistent on users' devices (users don't need to set up environment variables themselves). E.g., I'm currently using a custom config file, Comonicon.toml: https://github.com/Roger-luo/IonCLI.jl/blob/master/Comonicon.toml It'd be nice if this could be included in Project.toml, which would make the folder more compact.

How do you "dev" an app? Just clone it and then you manually run julia bin/foo_1.jl?

I think this currently works for me. But note that in my case, if I want to generate the entries in bin, I need to run build before installation etc.

What julia version is used to run all the apps? Perhaps just julia by default overridable by an env and an explicit argument to the app?

I'm using the julia executable that was used when the user installed the application/package, so at least we can make sure this julia executable is compatible (guaranteed by Pkg) and can run/build the application.

What happens with name collisions in .julia/bin?

I think in Rust, application names are registered, so at least within the same registry you can't have two applications with the same name.

KristofferC commented 4 years ago

How do you "dev" an app? Just clone it and then you manually run julia bin/foo_1.jl? I'd say yes, just clone it.

I thought about this some more and I don't think this is "correct". Firstly, we should have a "manifest" file that lists all the applications that are installed. This should be a TOML file and should contain all data that is necessary to reproduce the same set of applications on another machine. It could look something like

[[MyApp]]
executables = ["app_1", "app_2"]
git-tree-sha1 = "abcd-1232..."
version = "0.1.0"

[[OtherApp]]
executables = ["ls", "cat"]
path = "/home/kc/juliapps/OtherApps"
version = "1.1.0"

[[PkgEval]]
executables = ["pkgeval"]
git-tree-sha1 = "abcd-1232..."
repo-url = "..."
repo-rev = "master"
version = "1.1.0"

So in this case, running app_1 would load it from something like .julia/apps/$SLUG/bin/app_1.jl, but in the OtherApp case it loads it from the given path. Note how this pretty much exactly replicates how code loading works for the Manifest.

We could then have something like "instantiate" that takes this TOML file as an input and produces the correct wrappers in .julia/bin so that you can actually run things by just app_1 (assuming PATH is set). You could also then just send this file to another machine and get the same set of "apps" installed.
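
To make the "instantiate" idea concrete, here is a minimal sketch that reads a file of this shape and works out which script each wrapper in .julia/bin should run. The function name wrapper_targets is hypothetical, the hash is a placeholder, and using the tree hash as the directory name is only a stand-in for the real slug.

```julia
using TOML  # standard library since Julia 1.6

# Example input in the shape sketched above; hashes and paths are placeholders.
apps_toml = """
[[MyApp]]
executables = ["app_1", "app_2"]
git-tree-sha1 = "abcd1232"
version = "0.1.0"

[[OtherApp]]
executables = ["ls", "cat"]
path = "/home/kc/juliapps/OtherApps"
version = "1.1.0"
"""

# Map each executable name to the script its wrapper should run.
# Entries with a `path` are loaded from that path; others from the depot.
function wrapper_targets(manifest::AbstractDict, depot::AbstractString)
    targets = Dict{String,String}()
    for entries in values(manifest), entry in entries
        root = haskey(entry, "path") ? entry["path"] :
               joinpath(depot, "apps", entry["git-tree-sha1"])
        for exe in entry["executables"]
            haskey(targets, exe) && error("executable name collision: $exe")
            targets[exe] = joinpath(root, "bin", exe * ".jl")
        end
    end
    return targets
end

targets = wrapper_targets(TOML.parse(apps_toml), "/home/kc/.julia")
for exe in sort!(collect(keys(targets)))
    println(exe, " -> ", targets[exe])
end
```

Because the mapping is computed purely from the TOML file, the same file on another machine yields the same set of wrappers, which is the reproducibility property described above.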

There would then be a set of Pkg commands that manages this file. For example:

pkg> app add RipGrepJL
Adding RipGrepJL....

pkg> app status
RipGrepJL @ 1.0.0 -> ripgrep
OtherApp @ 1.1.0 ["/home/kc/juliapps/OtherApps"] -> ls, cat
MyApp @ 0.1.0 -> app_1, app_2

pkg> app rm MyApp
...

pkg> app dev home/kc/juliapps/CoolApp2

Pkg always makes sure that the wrappers in .julia/bin are up to date.

KristofferC commented 4 years ago

This allows Julia to precompile the command_main(::Vector{String}) function so for simple CLIs we can have almost zero extra start-up time.

Yes, I agree that the application needs to support precompilation so it should probably be something like import App; App.app1() to run app1 in the application App instead of just executing bin/app1.jl. This is similar to what PackageCompiler is doing.

I think allowing developers to config in Project.toml would be nice

Do you mean that apps want to set custom julia argument options themselves? Maybe that should be in the project file then. Users could override them with something like:

./myapp --julia-args --check-bounds=no -O 3

where everything after --julia-args is passed as julia arguments.
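
A minimal sketch of how a wrapper could split its arguments at such a marker. The function name split_julia_args and the script path bin/myapp.jl are hypothetical, and --julia-args itself is only the flag proposed in this thread, not an existing option.

```julia
# Split a wrapper's arguments at a hypothetical `--julia-args` marker.
# Returns (app_args, julia_args); the marker itself is dropped.
function split_julia_args(args::Vector{String}; marker::String = "--julia-args")
    i = findfirst(==(marker), args)
    i === nothing && return (args, String[])
    return (args[1:i-1], args[i+1:end])
end

app_args, julia_args = split_julia_args(["input.txt", "--julia-args", "--check-bounds=no", "-O3"])

# Build (but do not run) the command the wrapper would execute.
# Interpolating a vector into a command splices each element as its own argument.
cmd = `julia $julia_args bin/myapp.jl $app_args`
println(cmd)
```

One consequence of this design is that an app can never receive a literal --julia-args argument of its own, which is part of the concern about complicating CLI apps.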

I think in Rust, application names are registered, so at least within the same registry you can't have two applications with the same name.

But we also want to be able to just add an application by URL, à la pkg> app add https://....

Roger-luo commented 4 years ago

I thought about this some more and I don't think this is "correct". Firstly, we should have a "manifest" file that lists all the applications that are installed. This should be a TOML file and should contain all data that is necessary to reproduce the same set of applications on another machine. It could look something like

@KristofferC does this mean per environment bin folder?

I feel a bin folder might be nice to have per environment too; this could resolve part of the name conflict problem.

Do you mean that apps want to set custom julia argument options themselves? Maybe that should be in the project file then. Users could override them with something like

One potential issue is that if the application is itself a CLI application, these extra options might not be necessary and can make the already complicated CLI more complicated.

But we also want to be able to just add an application by URL, à la pkg> app add https://....

If it's not a registered package, maybe users should do so at their own risk? If we have a per-environment bin, then maybe what they could do is just use a new environment. This is also consistent with the behavior of packages, I think? But then on the terminal side, we need to set up some shell environment that activates a certain Julia environment via JULIA_PROJECT.

tkf commented 4 years ago

Here's my take on how to define an application:

The format of Application.toml is something like this

name = "MyApp"
uuid = "..."
authors = ["..."]
version = "X.Y.Z"

[executable.app_1]
package_name = "MyPackage"
package_uuid = "u-u-i-d"
entry_point = "app_1"  # i.e., `MyPackage.app_1(::Vector{String})` exists

[executable.app_2]
package_name = "MyPackage"
package_uuid = "u-u-i-d"
entry_point = "app_2"

I think it's nice to decouple how an application is specified and how it is implemented. For example, with this approach we can also derive in-REPL Pkg-like "CLI" from the above interface (i.e., just install MyPackage and call MyPackage.app_1(["arg1", "arg2"]) via cli> app_1 arg1 arg2). It also makes it easy to use a different mechanism to invoke an application (e.g., https://github.com/tkf/JuliaCLI.jl). Composing other applications (e.g., provide them through subcommands with pre-/post-processing) is easy this way. I think it also addresses some of the points @KristofferC raised:

2. How do you "dev" an app?

If we enforce that an application has no code, we can just dev a package that implements the application.

5. Should there be some way to pass standard julia arguments to the app (like --optimize, and --check-bounds).

If the entry point is easy to import as a function, you can just call it with julia (as already mentioned https://github.com/JuliaLang/Pkg.jl/issues/1962#issuecomment-686707220).
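
As a sketch of how an installer or an in-REPL front end might use such a file, the snippet below looks up the function named by entry_point and calls it. MyPackage here is a stand-in module defined inline, resolve_entry_point is a hypothetical helper, and a real implementation would first load the package by name and UUID rather than receive a module directly.

```julia
using TOML  # standard library since Julia 1.6

# Stand-in for the package that implements the application.
module MyPackage
app_1(args::Vector{String}) = (println("app_1 got: ", join(args, " ")); 0)
end

# Trimmed-down Application.toml in the format proposed above.
application_toml = """
[executable.app_1]
package_name = "MyPackage"
entry_point = "app_1"
"""

# Look up the function named by `entry_point` in an already-loaded module.
function resolve_entry_point(spec::AbstractDict, mod::Module)
    name = Symbol(spec["entry_point"])
    isdefined(mod, name) || error("$(nameof(mod)) does not define $name")
    return getfield(mod, name)
end

spec = TOML.parse(application_toml)["executable"]["app_1"]
entry = resolve_entry_point(spec, MyPackage)
status = entry(["arg1", "arg2"])
```

Since the entry point is an ordinary function, the same lookup serves a shell wrapper, a REPL mode, or a worker pool.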

KristofferC commented 4 years ago

It does not contain any Julia code by itself. Each application entry point is implemented as a function in a package listed in the Manifest.toml file of the application.

I don't really understand this. It seems odd to me that a set of entry points can go into different packages and that they are in packages listed in the manifest and not the Project.toml/Application.toml. What if I don't want to register a package, the code my app executes might be only for the app and is not reusable as a library?

I think each application should have an associated module that can be individually precompiled and where app code can be put. Depending on the entry point you might want to load a different set of dependencies. So an app with multiple entry points would be structured something like

Application.toml
Manifest.toml
src/App1.jl
src/App2.jl

# Application.toml
name = "MyApp"
uuid = "..."
authors = ["..."]
version = "X.Y.Z"

[executable.App1]
# metadata

[executable.App2]
# metadata

[deps]
DepA = "..."
DepB = "..."
DepC = "..."
...

# src/App1.jl
module App1
using DepA
using DepB

main(args::Vector{String}) = ...

end # module

# src/App2.jl
module App2
using DepA
using DepC

main(args::Vector{String}) = ...

end  # module

The user facing entry points would be ./app1 and ./app2. Calling these will precompile the corresponding module and run the main function in that module.

tkf commented 4 years ago

the code my app executes might be only for the app and is not reusable as a library?

My point is that code is usually reusable. If the entry-point function does not call exit, I think the majority of "CLI" code is actually totally usable as a library. Even if not, I think providing different ways to invoke the entry points (via a Pkg.jl-like REPL, worker pools like JuliaCLI.jl, as sub-commands, etc.) is an important feature. I think distributing a concrete Manifest.toml for an app is a good idea. But, if the app code is not loadable outside a fixed Manifest.toml distributed with the app, being able to put any code in an app encourages Julia programmers to write non-reusable code (e.g., if the code cannot be loaded in the REPL and it's hard to auto-create a Pkg.jl-like UI).

That said, what I'm proposing is doable by suggesting

# src/App1.jl
module App1
using MyPackage
main(args) = exit(MyPackage.app_1(args))
end # module

as the best practice. I just thought that it's nice to encode the best practice as the default behavior.

KristofferC commented 4 years ago

My point is that code is usually reusable. If the entry-point function does not call exit, I think the majority of "CLI" is actually totally usable as a library.

Maybe, maybe not. If the user does not want to deal with the overhead of writing library-like code, that should be totally fine. Forcing all application code into a library might very well mean that nice tools don't get made because the author doesn't think the code makes sense in a library (and maybe it doesn't). One should be allowed to write cool and interesting applications that are not meant to be used as libraries without having to register some dummy library package.

tkf commented 4 years ago

register some dummy library package

Since Manifest.toml file can contain URL and relative path, I don't think that's necessary.

Forcing all application code into a library might very well mean that nice tools don't get made

Do you mind elaborating on this? Frankly, I don't understand why categorizing a certain set of files as "library" instead of "application" suddenly makes it hard to write exactly the same chunk of Julia code. This is especially so when you don't have to register the package.

KristofferC commented 4 years ago

Since Manifest.toml file can contain URL and relative path, I don't think that's necessary.

That's a good point and then the difference between library and application code becomes moot.

But then I don't see what you object to in, e.g., the layout in https://github.com/JuliaLang/Pkg.jl/issues/1962#issuecomment-686932729.

Instead of

src/App1.jl
src/App2.jl

you want it to be

App1/src/App1.jl
App2/src/App2.jl

and each "app" runs App1/src/App1.jl:main(args) so that someone can do dev/Apps/App1 and run the main function manually? That's a good idea then, but it won't work immediately since dev/Apps/App1 doesn't have a project file. The reason I want a separate module for each entry point is that each can then be precompiled separately and use only the dependencies it needs.

Roger-luo commented 4 years ago

Just one related thought on this: if there is a registry for applications, then we can consider providing an option to compile the entire application as a standalone executable on the registry side, and even Julia itself could be one of the applications in this registry. This should make downloading much of the Julia toolchain easier, and users wouldn't need to wait for precompilation/compilation locally.

tkf commented 4 years ago

But then I don't see what you object to in, e.g., the layout in #1962 (comment).

The main point I wanted to emphasize was that having a package-compatible directory structure reduces both the things Julia programmers have to learn and the infrastructure that tools such as Pkg.jl have to implement. If the code lives in a separate directory or at a URL, Julia programmers don't have to learn anything new to know how to dev the code. Likewise, nothing has to be implemented in Pkg.jl to support adding or deving the application implementations (i.e., the packages; installing executables is another story, of course). This is also why I think we can maximize the "accidental reusability" of the application code, as it's just a package.

it won't work immediately since dev/Apps/App1 doesn't have a project file

Actually, I want

Application.toml or Project.toml?
Manifest.toml
App1/Project.toml
App1/src/App1.jl
App2/Project.toml
App2/src/App2.jl

so that App1 and App2 are just normal packages that can be added and deved. Of course, creating App1/src/App1.jl, App2/src/App2.jl, corresponding Project.toml, etc. by hand is very tedious. I think this is a disadvantage of a package-based approach. But I think that's already more or less solved since we have PkgTemplates.jl.

I think it resembles the move from [extras]/[targets] to test/Project.toml. Just like test-only dependencies are recorded in test/Project.toml, it's straightforward to record a separate set of dependencies in App1/Project.toml and App2/Project.toml. Since the project is a minimal component in Julia's code organization, I think that reusing the concept of projects everywhere is a nice simple approach.

KristofferC commented 4 years ago

The main point I wanted to emphasize was that having a package-compatible directory structure reduces the things Julia programmers have to learn and also infrastructures such as Pkg.jl have to implement.

Ok, that is a valid point that I can agree with. In fact, PackageCompiler.jl apps are structured in exactly that way.

I'm not sure what it means to have multiple project files but one manifest file. How would the resolution process for that work? What happens to compat in the project files etc? What is in those project files?

KristofferC commented 4 years ago

we can consider providing an option to compile the entire application as a standalone executable on the registry side, and even Julia itself could be one of the applications in this registry.

It would be cool but would also require a significant amount of infrastructure and work. You would need buildbots for the different architectures, automatic hosting, a download page for apps etc.

tkf commented 4 years ago

I'm not sure what it means to have multiple project files but one manifest file. How would the resolution process for that work? What happens to compat in the project files etc? What is in those project files?

Hmm..., I thought it'd already work out-of-the-box. I'm thinking that the top-level /Manifest.toml has entries like

[[App1]]
deps = ["Compat"]
path = "App1"  # relative path
uuid = "945464c1-6d4c-46d5-ac5e-fb453f79302f"

[[App2]]
deps = ["JSON"]
path = "App2"  # relative path
uuid = "32a02d8f-a56a-4e64-b0fa-04f84568876c"

and /App1/Project.toml is something like

name = "App1"
uuid = "945464c1-6d4c-46d5-ac5e-fb453f79302f"

[deps]
Compat = "34da2185-b29b-5c13-b0c7-acf172513d20"

[compat]
Compat = "2.0, 3.0"

I thought the compat entries of /App1/Project.toml and /App2/Project.toml are considered when I run Pkg commands like up and resolve on /Manifest.toml. Am I misunderstanding how Pkg works? Or are there other concerns I am missing?

For more advanced usages like using /Manifest.toml when /App1/Project.toml is activated, I imagine that the sub-projects feature #1233 can be repurposed.

Roger-luo commented 4 years ago

I guess this is something related: so I'm providing standalone binaries via PackageCompiler and GitHub Actions for CLIs made by Comonicon. This is done automatically, but it can still be cumbersome to install these applications. Since they are standalone, I might expect that users may not have a valid julia compiler locally, thus I provide a shell script for a one-click installation experience: https://github.com/Roger-luo/IonCLI.jl/blob/master/setup , so users can install an application like this directly via

bash -ci "$(curl -fsSL https://raw.githubusercontent.com/Roger-luo/IonCLI.jl/master/setup)"

I notice there is https://sh.rustup.rs in the Rust community. I'm wondering if we could have something like julialang.sh (this domain is not claimed yet) as a short domain for applications registered in the registry; users could then install via installers at something like https://julialang.sh/<app name>

mauro3 commented 3 years ago

It seems to me that what is being discussed here does not square with "Application" as defined in the glossary: "Application: a project which provides standalone functionality not intended to be reused by other Julia projects. For example a web application or a command-line utility, or simulation/analytics code accompanying a scientific paper." (emphasis mine) As for simulation/analytics code, I don't need any command-line tool installed. So, maybe there is a need to add another term to the glossary, something like "a-bunch-of-scripts". So a project could be a Package, an Application, or a-bunch-of-scripts.

As far as I can tell, most of what's discussed here does not apply to a-bunch-of-scripts projects.

Roger-luo commented 3 years ago

What do a-bunch-of-scripts projects typically look like here? Is it still a Julia project, or just a folder of scripts?

kescobo commented 3 years ago

What do a-bunch-of-scripts projects typically look like here? Is it still a Julia project, or just a folder of scripts?

I don't know if it's typical, but I recently made this, which has a julia package structure, but also a notebooks/ directory that has analysis code where I do using ResonanceMicrobiome. But I wouldn't call this an application (which I think is @mauro3 's point?), and I don't think I'd need the functionality being discussed here for that project.

mauro3 commented 3 years ago

Yes, something like this. Although mine look a lot less fancy than @kescobo's ;-) But the Project.toml & Manifest.toml are essential to keep track of the deps, so it's a project.

Edit: this https://github.com/luraess/julia-parallel-course-EGU21 would be a good example as well. It's not for a paper but a course with a bunch of scripts which the students can then run.

simonbyrne commented 2 years ago

I kicked off a related discussion on Discourse: https://discourse.julialang.org/t/tooling-for-julia-command-line-scripts/73915?u=simonbyrne But I had a few more thoughts as to what exactly an "application" might be, and how this could work:

  1. An application is a project with one or more "endpoints": these could include:

    • a script (.jl) file which handles command line arguments via ARGS (i.e., the simplest possible application could be a Project.toml and a single script)
    • a main(ARGS)-style function contained in a module
    • a dynamic library (e.g. as created by PackageCompiler.create_library)
    • other files that are either:
      • stored in the package
      • stored in an associated artifact
      • generated in an associated scratch space

    Note that this definition could include things like libraries or executable products in _jll packages, so the application mechanism could encompass functionality like that of ygg.

  2. Installing a package would consist of a process like:

    a. instantiating a package environment in some directory (e.g. ~/.julia/apps/tree-hash/)

    b. resolving the endpoints: there could be various user-configurable options for how to do this, e.g.

    • whether or not to build a system image
    • whether to compile main(ARGS)-style endpoints as executables, or use a simple shell shim script
    • what command line args to pass to julia

    c. (optionally) symlink these into ~/.julia/bin (or some other directory). Again, there could be various options here such as:

    • link only a subset of endpoints
    • rename endpoints (e.g. to support having multiple versions of a package installed at a time)

I'm sure there are lots of things I'm missing here (e.g. would this approach work on Windows? how to handle things like header files for libraries?), but I would be keen to hear other people's suggestions.
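
As a rough sketch of step c, the snippet below symlinks a chosen subset of endpoints into a bin directory, optionally under a different name. The function link_endpoints, the rename keyword, and the endpoint names are all hypothetical, and the demo runs in temporary directories rather than a real ~/.julia.

```julia
# Link selected endpoints into a bin directory, optionally under another name.
# `rename` maps endpoint name => link name; all names here are illustrative.
function link_endpoints(app_dir, bin_dir, endpoints; rename = Dict{String,String}())
    mkpath(bin_dir)
    links = String[]
    for ep in endpoints
        target = joinpath(app_dir, "bin", ep)
        isfile(target) || error("missing endpoint: $target")
        link = joinpath(bin_dir, get(rename, ep, ep))
        islink(link) && rm(link)   # refresh a stale link from an earlier install
        symlink(target, link)
        push!(links, link)
    end
    return links
end

# Demo: an "installed" app with two endpoints, only one of which gets linked.
app_dir = mktempdir()
bin_dir = joinpath(mktempdir(), "bin")
mkpath(joinpath(app_dir, "bin"))
for ep in ("mytool", "helper")
    write(joinpath(app_dir, "bin", ep), "#!/bin/sh\necho $ep\n")
end

links = link_endpoints(app_dir, bin_dir, ["mytool"]; rename = Dict("mytool" => "mytool-1.0"))
```

Renaming at link time is what would allow several versions of the same package to coexist in one bin directory. On Windows, creating symlinks may require extra privileges, so a copied shim script might be needed there instead.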

simonbyrne commented 2 years ago

I've put together a minimal prototype of my idea here: https://github.com/simonbyrne/PkgApp.jl