enso-org / enso

Hybrid visual and textual functional programming.
https://ensoanalytics.com
Apache License 2.0

CLI Distribution Design Discussion #987

Closed kustosz closed 4 years ago

kustosz commented 4 years ago

Summary

This task is a place for @wdanilo to gather his feedback regarding the Enso CLI distribution schemes.

Value

It's important for @wdanilo to be on board with the proposed design, therefore we need a single place where feedback can be gathered.

Specification

Acceptance Criteria & Test Cases

A design consensus is reached and has been incorporated into the documentation.

wdanilo commented 4 years ago

Few comments:

Enso Home Layout

  1. For macOS and Linux distributions that’s ~/.enso, by default - this is bad design. The correct one is:
  2. I don't really like that the folder dist contains enso-VERSION while the folder libraries/Dataframe contains VERSION. I think we should use just VERSION in both cases.
  3. I think we should use lib instead of libraries, or expand all other names, including distribution and java_virtual_machine. I think that using lib instead of libraries is much better.
  4. I think that the downloaded libraries should be in cache/lib/src - after having a more advanced type checker we may want to store pre-compiled information there. The pre-compiled information would be stored in cache/lib/something and it would be a complex structure (a cache per every combination of libraries used).
  5. The folder jvm should be renamed to runtime.
  6. The folder cache should be removed, and the folders lib and resolver should be inside the root folder, because the whole folder should be considered a cache - you can safely remove the dist and runtime folders and the launcher script should re-initialize them. Thus, the launcher script shouldn't be part of the whole structure (all things that can be safely removed by the user should be placed in a cache-like folder, as described in point 1 above).

Universal Launcher Script

  1. The documentation of the universal launcher script does not mention that it should also be able to launch the GUI, as enso gui (this could be a plugin which redirects it to enso-gui).

Layout of an Enso Version Package

  1. The folder name component sounds very strange to me. Why is it not just bin? Moreover, do we have to keep the .jar extension there?
  2. I believe the sentence "Contains all the pre-installed libraries compiler version" is not correct English.

Resolvers

  1. The highlighting of the example is so bad, that we cannot read it. CC @joenash

The package.yaml File

  1. "It also includes the list of dependencies of the package." - this is not true. It should include versions of packages in extra deps, but for deps everything is taken from imports + lts.

  2. Licence should either default to MIT or default to None, but it should be required when uploading a package. Otherwise a lot of people will forget to set it and a lot of licenses will default to None, which will be very bad.

  3. Version should be optional - this should follow our 0-config approach. It should default to "dev", which means that this is a development version that cannot be published until the version is set explicitly, but you should be able to work with a local lib with only the name set.

  4. Resolver should be optional. Again, a package should not be allowed to be published without it, but locally it should work, taking the system-set global resolver.

Launcher Distribution

  1. except that the component directories are empty [...] - if they are missing, they should be created on-demand.
  2. Where in the project config / dist config is the required Graal version described?
  3. "Global User Configuration" - this doc mentions the "config" folder inside of home enso layout, but it is not described there. Moreover, I believe this folder should be named global-config instead.
  4. We should allow installing multiple launcher distributions side by side, in case a downgrade is needed.

radeusgd commented 4 years ago

Below I summarise the conclusions of our meeting with Wojciech and Marcin. The documentation will be updated soon to reflect the conclusions.

Enso Home Layout

  1. We may want to switch from distributing a portable launcher directory to distributing just a portable launcher executable and use the system-defined directories for storing downloaded components (engine, JVM, libraries). Before making a final decision @kustosz wanted to discuss this with @iamrecursion.
  2. [x] We will use just VERSION instead of enso-VERSION when naming the distribution that is put in the dist folder.
  3. [x] We will use lib.
  4. [x] We will put library sources in lib/src, because lib may contain some additional caches in the future. Some of these caches may be stored in the lib directory while some, that are local to a given project, in the .build/cache directory within the project root. This is, however, a separate topic that will have to be designed when working on cache storage.
  5. [x] jvm will be renamed to runtime.
  6. [x] lib and resolver will be in the root folder. All files there can be safely removed and the launcher will re-initialise them.

Universal Launcher Script

Layout of an Enso Version Package

  1. [x] The component directory name stays, but the distribution structure should be updated to include the runtime.jar. Moreover, enso.jar should be renamed to runner.jar - and this rename should be done in the whole project, remembering to update the references in the launch script (still keeping the script launching runner called enso).
  2. [x] The typo in the comment to std-lib in the distribution should be fixed.

The package.yaml File

  1. [x] Update docs - package.yaml does not list the dependencies as they are inferred, just the extra-dependencies.
  2. [x] None will stay as a default, but the documentation has to emphasise that a package cannot be published without setting the license explicitly.
  3. [x] Version will be optional (and a missing value will default to dev). A package must have a version specified to be published.
  4. [x] Resolver will be optional, package cannot be published without it, but locally it can just use the system-default.
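
The publishing rules agreed above can be sketched as a small check. This is only an illustration under stated assumptions: the `PackageConfig` class, its field names and the helper functions are hypothetical and do not describe the actual package.yaml parser.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PackageConfig:
    """Hypothetical in-memory view of package.yaml; field names are assumptions."""
    name: str
    version: str = "dev"            # a missing version defaults to "dev"
    license: Optional[str] = None   # None stays the default
    resolver: Optional[str] = None  # optional for local work


def publish_blockers(cfg: PackageConfig) -> List[str]:
    """Return the reasons why the package cannot be published (empty if none)."""
    problems = []
    if cfg.version == "dev":
        problems.append("version must be set explicitly")
    if cfg.license is None:
        problems.append("license must be set explicitly")
    if cfg.resolver is None:
        problems.append("resolver must be set explicitly")
    return problems


def effective_resolver(cfg: PackageConfig, system_default: str) -> str:
    """Locally, a missing resolver falls back to the system default."""
    return cfg.resolver if cfg.resolver is not None else system_default
```

With only the name set, the package is usable locally (version "dev", system-default resolver), while `publish_blockers` lists everything that still has to be filled in before publishing.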

Launcher Distribution

This section heavily depends on whether the launcher will be distributed (as described in the first section).

  1. This should be updated once the distribution scheme is resolved (the directories will always be created on-demand, but if the launcher is distributed as a directory structure they can be pre-defined to more clearly indicate the structure).
  2. [x] The manifest contains this information. The manifest file has to be included in the root of the Enso Version Package, in addition to being released as a separate artifact.
  3. This should be updated once the distribution scheme is resolved. Somewhere (in the distribution root or in the system config directory) a YAML file containing global configuration should be stored.

    We should allow to install multiple launcher distributions side-by-side just in case a downgrade would be needed.

  4. [x] This should be noted in the documentation as a plan for the future. It won't be supported right away, but should be added at some point.

joenash commented 4 years ago

@wdanilo

The highlighting of the example is so bad, that we cannot read it. CC @joenash

There's a new style for dark mode incoming (https://github.com/enso-org/enso-org.github.io/pull/29) that also fixes the syntax highlighting style (monokai on dark background throughout for light and dark), but it needs a bit more work. For now, I'd recommend changing to one of the other styles via the selector at the top of the page.

radeusgd commented 4 years ago

After a discussion with @iamrecursion and @kustosz, we suggest that we reconsider the home layout design. Putting Enso files in the system defined directories instead of a self-contained portable directory brings a few issues:

  1. There is increased maintenance, as instead of having a single way to do it, we need to maintain 3 different code paths for each platform, as these directories are handled differently on each platform.
  2. If we use the system-defined directories, we still need some universal fallback solution, as, especially on Linux, it is possible for an OS distribution not to comply with this standard (for example, by not allowing writes to the directory that XDG defines as default; it seems reasonably possible for a user to not allow applications to write directly to their $HOME directory root, which may be necessary to create, for example, the .local/data directory if it does not exist on their system).
  3. If we put files into multiple directories that the user may not immediately be aware of, we would need to provide an uninstaller capability in the launcher to remove these files. The uninstaller is another additional maintenance burden and may be especially problematic on the Windows platform (the uninstaller may have trouble deleting locked files, for example files that are open somewhere).

If we still decide to use the system-defined paths instead of a self-contained executable, we suggest that the runtime, dist and lib directories should not be stored in $XDG_CACHE_HOME. We suggest it should be put into $XDG_DATA_HOME. The XDG standard defines the cache as non-essential files. The corresponding cache directory on Mac presumably may be cleared automatically by the system if it is running out of space. While, in theory, the launcher can work just fine by redownloading the dependencies, in practice, the files should be treated as essential for the application. If the cache is removed, the user may find that they need to re-download 500MB of runtime and even more libraries - sometimes this may not be viable if they are on a mobile network or completely offline, for example on a plane, making the application completely unusable in such circumstances.

wdanilo commented 4 years ago
  1. I was sure we are talking about 2 paths here. Could you describe a mapping between XDG env vars and components that we will place there, please?

  2. A short note at the beginning - the XDG env vars may not be defined, which can happen fairly often, because according to the spec apps should use the default values then; for reference see this Ubuntu issue. Regarding not being able to write in a default location like $HOME/.local, I have never seen such behavior. Could you tell me what mechanisms you have in mind when you say "not allow applications to write directly to their $HOME directory"? Applications are run using the user's group, and thus inherit the user's permissions. If you run an app and the app can't write to that dir, it means you cannot write there either. Do you mean that the app could be run using a different user / user group than the host user's group? This seems very rare, and in the last 10 years of using Linux I have not seen such a configuration. Anyway, there is a valid question - what if the XDG vars point to a folder where the user has no write rights (like some root folder)? Then our installer should fail and report it.

  3. That's right, an uninstaller should be done as part of this task. "The uninstaller is another additional maintenance burden" - sure, but does it mean that we should instead mess with users' directories in a way the system was not designed for? We need to provide the correct behavior, the one expected by the host OS, and provide high-quality software. I'm deeply concerned hearing an argument about "maintenance burden" for something which is just meant to remove 3 folders and report an error if it's not possible, when the alternative solution would just be messing with the system folder layout. For me it's a little bit like saying - let's not care about the theme of our visual editor, let's just use random values, otherwise there is a maintenance burden. Of course, this argument might be valid if this were a very big component, but the uninstaller will be small, so this argument is non-existent to me now. Regarding a situation where it is impossible to remove a folder (a different situation than the folder having already been removed by the user), the installer should report an error and just list all folders that it tried to remove to uninstall the software.

radeusgd commented 4 years ago
  1. What I meant by 'code paths' here was different solutions that may be required by each platform, but maybe these differences aren't that big.

As for the mapping, I think the idea is to put components (Enso versions, runtime and libraries) in XDG_DATA_HOME and configuration in XDG_CONFIG_HOME (or to be precise in an enso subdirectory in each).

  2. Yes, the scenario assumed the application is run as some restricted group, for example if the user wants to ensure sandboxing and only allow it to write to its own directory, but not the home root. But that indeed sounds very rare and probably for each possible scenario we can come up with a pathological configuration that will not work, so it probably should not be a concern.

Maybe to support alternative use cases we can support environment variables like ENSO_DATA and ENSO_CONFIG, which, when not set, default to the paths defined by XDG_. Then if the installer finds out that a default XDG_ path is not writable, it can report the issue to the user and tell them how to go around it using these variables?

  3. I understand that this may be problematic on Windows, but certainly doable, so I will add it as part of the task. I think from our point of view the alternative was not 'messing with system folder layout' but having all application data in one directory that the user can place anywhere they want. We saw this as a reasonable alternative, so just wanted to add that also uninstalling would be much simpler (because the user could just remove the root Enso directory).

I would like to clarify:

  1. We wanted to distribute downloadable bundles that include the latest Enso version and the GraalVM version corresponding to it. They were meant to just reflect the Enso directory structure. Should the launcher binary provide an install-itself option that moves these files around from a downloaded archive or should another executable be created that is an installer for the launcher?

  2. If we go the uninstaller route, do we want to register it in Windows registry to show up in the 'Programs' list to uninstall? My only concern here is that by default the launcher is a CLI application (and I assume an installer for it would also be one - it seems only logical for the Unix platforms, I don't know many applications (especially CLI ones) that provide a GUI installer for Unix). For Windows however, it seems like installers/uninstallers are almost always GUI (however maybe we could go for a command line based one with a simple Yes / No UI?), so should a separate solution be created for this platform? Or are we fine with the launcher having an uninstall-itself command?

iamrecursion commented 4 years ago

Maybe to support alternative use cases we can support environment variables like ENSO_DATA and ENSO_CONFIG, which, when not set, default to the paths defined by XDG_. Then if the installer finds out that a default XDG_ path is not writable, it can report the issue to the user and tell them how to go around it using these variables?

I like this approach.

We wanted to distribute downloadable bundles that include the latest Enso version and the GraalVM version corresponding to it. They were meant to just reflect the Enso directory structure. Should the launcher binary provide an install-itself option that moves these files around from a downloaded archive or should another executable be created that is an installer for the launcher?

My instinct would be to have it check for the bundle files on launch, and if they are found to move them to the correct directories, rather than having a separate installer.

If we go the uninstaller route, do we want to register it in Windows registry to show up in the 'Programs' list to uninstall? My only concern here is that by default the launcher is a CLI application (and I assume an installer for it would also be one - it seems only logical for the Unix platforms, I don't know many applications (especially CLI ones) that provide a GUI installer for Unix). For Windows however, it seems like installers/uninstallers are almost always GUI (however maybe we could go for a command line based one with a simple Yes / No UI?), so should a separate solution be created for this platform? Or are we fine with the launcher having an uninstall-itself command?

I don't think we should bother with this for now. In time we may want to, but for now we just need to have something that works, even if it's a bit different for Windows.

wdanilo commented 4 years ago
  1. Radek, you mentioned 3 paths in your first message, I asked for clarification, and now you mention 2 paths instead - XDG_DATA_HOME and XDG_CONFIG_HOME. So are there 2 or 3 paths? Moreover, you told me that you want to put "configuration" in XDG_CONFIG_HOME, but what is "configuration"? Could you just show all folders with detailed info on what goes where, please? I really need to see it to be sure it's a good fit everywhere.

    I've got yet another idea for improvement here - you want to keep this "self-contained" thing, right? So, when starting things you can check whether in the current directory there are sub-directories that you need (if so, you use them), if not, you use ENSO_ env vars. If not defined, you use XDG_ env vars. This approach would give us both self-hosted packages that you can just move around, and also possibility to install it properly on the system. What do you think about it? CC @kustosz @iamrecursion

  2. Regarding ENSO_DATA etc, I love it. Let's do it this way.

  3. @iamrecursion do you want it to behave in such a way that on the first run the files may be distributed to different locations? I feel it would be better to have a separate install step instead, to catch errors during installation rather than mixing them into run behavior.

  4. As Ara says.

radeusgd commented 4 years ago
  1. I meant 3 'code paths' (as in 3 different solutions for each platform, but I made a typo), not file-system paths.

Configuration for now would be a single global-config.yaml file that would contain the default values for creating new projects (author.name, author.email if set) and the default Enso version (the one that is used when creating a new project and running a REPL outside any project).

I propose we put all data files, i.e. almost all directories (save for bin and config), in XDG_DATA_HOME.

So, we would create a directory $XDG_DATA_HOME/enso/ where we would put the directories dist, runtime, lib and resolvers.

Moreover, we'd have $XDG_CONFIG_HOME/enso/global-config.yaml as explained above.

As for the executable - I guess as the user downloads the executable, they can put it anywhere on their PATH, is that right?

1b. Personally I really like this idea, as I prefer self-contained distributions when installing software on my system. I'm worried it may be a bit unintuitive for users to have two different ways of configuring the installation, but I guess if it's well documented it may work. I have seen other software distributed in a 'normal' and 'portable' version, so maybe it would make sense to allow the users to choose which one they prefer.

iamrecursion commented 4 years ago

I don't think it's a good idea to have two modes at the moment. Maybe in the future that could be useful, but for now there's enough to do already without adding additional stuff.

wdanilo commented 4 years ago

As for the executable - I guess as the user downloads the executable, they can put it anywhere on their PATH, is that right?

Absolutely not. Executables should be installed in the right place, according to XDG rules. It's not an "arbitrary place" - it has to be in their PATH etc and XDG tells where such things should go. Of course the user can just move / copy it anywhere, but it should be installed in a good place. Anyway, I really need to see this described clearly, I mean something like this:

$XDG_DATA_HOME/enso/ will contain: ... here the file structure like in the docs

etc, including all files, like binaries - of course it can be done in docs, but please ping me as soon as it will be shown as clear as in the docs now - for all files.

I don't think it's a good idea to have two modes at the moment. Maybe in the future that could be useful, but for now there's enough to do already without adding additional stuff.

Surprisingly I don't agree with you this time, Ara. I think that if we are introducing checks for ENSO_... env vars, then checking there if the local folders exist is just the same logic. So if we think it is a good idea to allow for both portable and non-portable distributions, then from a time perspective let's either implement the ENSO_... vars and these local checks or implement neither of them, as the underlying logic seems just the same to me.

radeusgd commented 4 years ago

Absolutely not. Executables should be installed in the right place, according to XDG rules. It's not an "arbitrary place" - it has to be in their PATH etc and XDG tells where such things should go. Of course the user can just move / copy it anywhere, but it should be installed in a good place.

Then does the user download the launcher binary (if so, I have no control over where the user places it) or some kind of installer? It comes down to whether we want to have a separate installer.

XDG itself seems not to define a default directory for binaries that is guaranteed to be on system PATH. The XDG_RUNTIME_DIR may be misleading, but it is for non-essential runtime files and is defined to be cleaned when the user logs out. Ubuntu, by default (in the default .profile created for new users) adds $HOME/bin and $HOME/.local/bin to PATH if they exist, but I could not find any standard that defines these - so I don't know if this is portable for different distributions.

My proposal for the installation structure:

$ENSO_DATA_DIRECTORY  (defaults to $XDG_DATA_HOME/enso)
├── dist                    # Per-compiler-version distribution directories.
│   ├── 1.0.0               # A full distribution of given Enso version, described below.
│   │   └── <truncated>
│   └── 1.2.0               # A full distribution of given Enso version, described below.
│       └── <truncated>
├── runtime                 # A directory storing (optional) distributions of the JVM used by the Enso distributions.
│   └── graalvm-ce-27.1.1
├── lib
│   └── src                 # Contains sources of downloaded libraries.
│       └── Dataframe       # Each library may be stored in multiple versions.
│           └── 1.7.0       # Each version contains a standard Enso package.
│               ├── package.yaml
│               └── src
│                   ├── List.enso
│                   ├── Number.enso
│                   └── Text.enso
└── resolvers               # Contains resolver specifications, described below.
    ├── lts-1.56.7.yaml
    └── lts-2.0.8.yaml

$ENSO_CONFIG_DIRECTORY  (defaults to $XDG_CONFIG_HOME/enso)
└── global-config.yaml  # Global user configuration.
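
The defaults in the layout above can be sketched as a small lookup. This is a minimal sketch, not the launcher's implementation: the function names are hypothetical, and the fallbacks for unset XDG variables ($HOME/.local/share and $HOME/.config) are the defaults from the XDG Base Directory specification.

```python
import os
from pathlib import Path
from typing import Dict, Mapping


def enso_directories(env: Mapping[str, str], home: Path) -> Dict[str, Path]:
    """Resolve the data and config roots from an environment mapping."""
    # XDG defaults apply when the variables are unset or empty.
    xdg_data = Path(env.get("XDG_DATA_HOME") or home / ".local" / "share")
    xdg_config = Path(env.get("XDG_CONFIG_HOME") or home / ".config")
    return {
        "data": Path(env.get("ENSO_DATA_DIRECTORY") or xdg_data / "enso"),
        "config": Path(env.get("ENSO_CONFIG_DIRECTORY") or xdg_config / "enso"),
    }


def is_writable(path: Path) -> bool:
    """Check whether path, or its nearest existing ancestor, is writable."""
    probe = path
    while not probe.exists() and probe != probe.parent:
        probe = probe.parent
    return os.access(probe, os.W_OK)
```

An installer could call `is_writable` on each resolved directory and, if the check fails, report the problem and point the user at the ENSO_ variables as a workaround.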

wdanilo commented 4 years ago

XDG_BIN_HOME (which defaults to $HOME/.local/bin) https://lists.freedesktop.org/archives/xdg/2017-August/013943.html (as we've been chatting on Discord)

[PATCH v2 2/2] basedir: Add XDG_BIN_HOME

radeusgd commented 4 years ago

After a discussion with Wojciech we settled on the following:

The launcher when run will look for files next to it, if it detects the directories (runtime, dist, etc.) it works as a portable distribution. Otherwise it uses the ENSO_DATA_DIRECTORY and ENSO_CONFIG_DIRECTORY, as described above, with the following defaults:

On Windows, probably both ENSO_DATA_DIRECTORY and ENSO_CONFIG_DIRECTORY could default to %AppData%/enso.
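
The portable-versus-installed decision described above could look roughly like this. It is a sketch under assumptions: the function names are hypothetical, and which directories count as markers of a portable distribution (here dist and runtime) is a guess based on the "(runtime, dist, etc.)" wording.

```python
from pathlib import Path

# Directory names taken from the layout above; treating exactly these two as
# markers of a portable distribution is an assumption of this sketch.
PORTABLE_MARKERS = ("dist", "runtime")


def is_portable(launcher_path: Path) -> bool:
    """True if any marker directory sits next to the launcher executable."""
    root = launcher_path.resolve().parent
    return any((root / name).is_dir() for name in PORTABLE_MARKERS)


def data_root(launcher_path: Path, installed_data_dir: Path) -> Path:
    """Pick the directory that holds dist, runtime, lib and resolvers."""
    if is_portable(launcher_path):
        return launcher_path.resolve().parent
    return installed_data_dir
```

Here `installed_data_dir` stands for whatever ENSO_DATA_DIRECTORY resolves to; the sketch only covers the choice between the two modes, not the resolution of the defaults.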

enso install distribution copies the launcher file to the default directory for locally installed binaries for each platform and, if present, moves runtime and component files to the locations as defined above. It effectively converts a portable distribution into an installed one.

enso uninstall distribution removes the ENSO_DATA_DIRECTORY and ENSO_CACHE_DIRECTORY and removes the enso binary from the 'default directory for locally installed binaries'.
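
A minimal sketch of the removal step, following the earlier request that the uninstaller report an error and list the folders it could not remove. The function name and signature are hypothetical; the caller is assumed to pass in the directories and the launcher path that the command is meant to remove.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def uninstall_distribution(directories: Iterable[Path], launcher: Path) -> List[Path]:
    """Remove the directories and the launcher; return paths that could not be removed."""
    failed = []
    for directory in directories:
        if not directory.exists():
            continue  # already removed by the user, which is not an error
        try:
            shutil.rmtree(directory)
        except OSError:
            failed.append(directory)  # e.g. a locked file on Windows
    try:
        launcher.unlink()
    except FileNotFoundError:
        pass  # the binary is already gone
    except OSError:
        failed.append(launcher)
    return failed
```

A non-empty result would be turned into an error message listing the leftover paths, so that the user can remove them manually.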

The 'default directory for locally installed binaries' is defined as follows:


As a side note, as we modify the behaviour of install/uninstall commands, old behaviour will be available under enso install engine VERSION and enso uninstall engine VERSION respectively.

wdanilo commented 4 years ago

Hmm, I believe it should be ~/.local/bin on macOS too, but this needs proper investigation. Amazing summary Radek, amazing work <3

radeusgd commented 4 years ago

After asking @mwu-tow, he confirmed that we should most likely put everything (data, config and the binary) into %AppData%/enso or %LocalAppData%/enso - it seems like both are used by some applications (for example, IntelliJ, VSCode and Discord use %LocalAppData%, while Spotify and Zoom use %AppData%). On some configurations %AppData% may be synchronised between accounts, so it may be safer to go for %LocalAppData% for storing lots of files, but both should be fine.

We cannot make any assumptions about the user's PATH. But the installer can modify the user's part of the PATH (which does not require administrator privileges), adding %AppData%/enso/bin to it, using the PowerShell command [Environment]::SetEnvironmentVariable.


As for Mac, I think you are right about ~/.local/bin. At least I found a discussion on a similar issue in a different project, and they seem to have gone for ~/.local/bin for macOS. Maybe we could default to this location and, when installing, check if it is present on the system PATH. If it is not, the installer can issue a warning to the user that the PATH should be modified for the thing to work.
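
The PATH check suggested above might be sketched like this. The helper names and the warning text are hypothetical; the comparison is a simple normalised string match and does not resolve symlinks.

```python
import os
from pathlib import Path
from typing import Optional


def is_on_path(directory: Path, path_value: Optional[str] = None) -> bool:
    """Check whether directory is listed in a PATH-style string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    target = os.path.normpath(str(directory.expanduser()))
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    return any(os.path.normpath(os.path.expanduser(entry)) == target for entry in entries)


def path_warning(directory: Path, path_value: Optional[str] = None) -> Optional[str]:
    """Return a warning for the user, or None if the directory is on PATH."""
    if is_on_path(directory, path_value):
        return None
    return f"{directory} is not on PATH; add it so that the enso launcher can be found."
```

The installer would call `path_warning(Path("~/.local/bin"))` after copying the binary and print the message if one is returned.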

iamrecursion commented 4 years ago

I'm happy with the above actionable steps!