rust-lang / libs-team

The home of the library team
Apache License 2.0

ACP: env::home_dir replacement #372

Open notpeter opened 2 months ago

notpeter commented 2 months ago

Proposal

Add new API: std::env::user_home_dir returning Option<PathBuf>.

This is a replacement for the deprecated std::env::home_dir, providing a platform-agnostic method for identifying the user's home directory by checking platform-specific environment variables ('HOME' on Unix, 'USERPROFILE' on Windows) as required.

The proposed implementation is simpler than std::env::home_dir, as it only searches the environment and does not attempt to call one or more platform APIs (e.g. getpwuid_r, GetUserProfileDirectoryW, etc.) as a fallback when the appropriate environment variables are unset.

The existing env::home_dir remains unchanged.

Problem statement

  1. The std::env::home_dir API is deprecated. Fixing it would be a breaking API change. (See: rust-lang/rust#51656)

    Deprecated since 1.29.0: This function’s behavior may be unexpected on Windows. Consider using a crate from crates.io instead.

  2. In most home_dir implementations (including std::env::home_dir), if the 'HOME' / 'USERPROFILE' environment variables are unset, there is a fallback implementation that calls one or more platform APIs in order to guess the user's home directory. This is potentially surprising behavior, as it is not obvious that calling an env::* function may magically get you a value which is nowhere in the process environment.

  3. The std::env::home_dir API may return a value which is not usable as a Path (""). This occurs when HOME is set to an empty string, which is not uncommon in restricted shell environments (sudo, cron jobs, etc.). Callers thus must check for None and for an empty string before using the returned value. Earlier documentation for std::env::home_dir suggests this may have been the original intended behavior:

    "Returns the value of the 'HOME' environment variable if it is set and not equal to the empty string."

  4. Users looking for a replacement for std::env::home_dir are forced to evaluate and choose amongst a number of 3rd party crates:

    • The 'home' crate, maintained by the Cargo team, is very popular (used by nearly 500 crates), but the Cargo team has stated they do not wish to maintain it as a general purpose 'home_dir' replacement and they consider it only an internal Cargo and Rustup dependency.
    • The 'home' crate generates a compile time error on wasm and other non-Unix/Windows platforms. The Cargo team has signaled they do not intend to fix this. See: rust-lang/cargo#12297
    • Many crates do not just provide a std::env::home_dir replacement but also provide additional abstractions over other platform-specific APIs (XDG, Windows Known Folders API, etc)
    • Many crates have platform-specific dependencies. For example, the dirs crate depends on libc and windows-sys, which are entirely unused in the common case where the environment has HOME or USERPROFILE set.

Motivating examples or use cases

A developer wishes to determine the current user's home directory by inspecting environment variables in a platform-agnostic way.

The same use case as std::env::home_dir but without the bugs or fallback behavior when environment variables are unset.

Solution sketch

Create the API std::env::user_home_dir, which covers the most common use: getting a PathBuf with the path of the user's home directory, if available, by inspecting environment variables.

See: env_home lib.rs for a potential implementation.
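
For illustration only, here is a minimal sketch of what an environment-only lookup could look like. It is not the env_home implementation or a proposed std implementation. The names `home_var_name` and `home_dir_from` are hypothetical helpers, `user_home_dir` here is a stand-in for the proposed function, and treating an empty value the same as an unset variable is an assumption based on problem 3 above.

```rust
use std::env;
use std::ffi::OsString;
use std::path::PathBuf;

/// Name of the environment variable consulted on the current platform.
/// `None` means there is no conventional variable (e.g. wasm), so the
/// lookup simply yields nothing instead of failing to compile.
fn home_var_name() -> Option<&'static str> {
    if cfg!(windows) {
        Some("USERPROFILE")
    } else if cfg!(unix) {
        Some("HOME")
    } else {
        None
    }
}

/// Core logic (hypothetical helper). The environment lookup is injected so
/// the behavior can be checked without mutating the process environment.
fn home_dir_from(
    var: Option<&str>,
    lookup: impl Fn(&str) -> Option<OsString>,
) -> Option<PathBuf> {
    let value = lookup(var?)?;
    // Assumption: an empty value is treated like an unset variable,
    // so callers never receive an unusable "" path.
    if value.is_empty() {
        None
    } else {
        Some(PathBuf::from(value))
    }
}

/// Stand-in for the proposed `std::env::user_home_dir`.
/// No platform API fallback: only the process environment is inspected.
pub fn user_home_dir() -> Option<PathBuf> {
    home_dir_from(home_var_name(), |name| env::var_os(name))
}
```

A caller would then only need to handle `None`, for example `if let Some(home) = user_home_dir() { ... }`.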

Naming is hard, especially when a deprecated API has a good name. I prefer std::env::user_home_dir as it does not collide with std::env::home_dir or home_dir provided by other crates.

Alternatives

  1. Do nothing. Require users to evaluate and choose a 3rd party crate which provides a home_dir API with better behavior on Windows (usually including extra APIs):

  2. Extract the current cargo implementation of home_dir from the home crate and apply trivial fixes so it compiles as a no-op on unsupported platforms. Publish it as a stand-alone crate and recommend it as a replacement for std::env::home_dir.

  3. Create an implementation of home_dir that only relies on the platform specific APIs and does not inspect environment variables. This would make it trivial to implement the platform API fallback behavior of std::env::home_dir when environment variables are unset.

Links and related work

Discussions:

Languages whose standard library provides a platform independent home_dir abstraction:

Languages which do not have a home_dir abstraction in the standard library:

Notes

I am particularly interested to hear of real-world use cases or systems where $HOME is unset and the directory provided via platform-specific APIs is available/appropriate for use. In my experience $HOME being unset is a signal that there is nothing like a unix home directory available to my process and my app should behave accordingly.

This is my first attempt at drafting an API Change Proposal, so please be kind. Thanks!

m-ou-se commented 2 months ago

How often would you want user_home_dir instead of something more specific like config_dir or cache_dir or user_downloads_dir? I'm afraid that providing only user_home_dir and not any of the others will just lead to programs that use the wrong directories. (E.g. using ~/.myapp/ for config, rather than ~/.config/myapp/ (using e.g. $XDG_CONFIG_HOME).)

notpeter commented 2 months ago

I appreciate the work that the Cross-Desktop Group (XDG) did on XDG Base Directory Specification to define some standardized directories. In practice none of the three environments I use daily (Windows, MacOS, Linux CLI) have XDG_*_HOME variables defined. But they have successfully gotten apps to adopt XDG defaults as their hard coded directory fallbacks:

XDG_CONFIG_HOME $HOME/.config
XDG_CACHE_HOME  $HOME/.cache
XDG_DATA_HOME   $HOME/.local/share
XDG_STATE_HOME  $HOME/.local/state

But those still require determining $HOME/$USERPROFILE in a platform dependent way.
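
As a rough sketch of that pattern (honor the XDG variable if present, otherwise fall back to the hard-coded default under the home directory): `config_dir` below is a hypothetical helper, not part of any crate or proposal. It takes the variable's value and the home directory as arguments, so it makes no claim about how the home directory was found. Ignoring relative values follows the XDG Base Directory Specification, which requires these paths to be absolute.

```rust
use std::ffi::OsString;
use std::path::PathBuf;

/// Resolve a config directory (hypothetical helper):
/// `$XDG_CONFIG_HOME` if it holds an absolute path, otherwise `<home>/.config`.
/// Returns `None` when neither source is usable.
fn config_dir(xdg_config_home: Option<OsString>, home: Option<PathBuf>) -> Option<PathBuf> {
    let from_xdg = xdg_config_home
        .map(PathBuf::from)
        // Relative (including empty) values are considered invalid and ignored.
        .filter(|p| p.is_absolute());
    // Fall back to the well-known default under the home directory.
    from_xdg.or_else(|| home.map(|h| h.join(".config")))
}
```

On Unix this might be called as `config_dir(std::env::var_os("XDG_CONFIG_HOME"), std::env::var_os("HOME").map(PathBuf::from))`. Either way, the fallback branch is only as good as the home directory passed in.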

Nudging developers away from blindly creating homedir dotfiles is a worthy cause, but I actually think the docs for std::env::user_home_dir() are an ideal place to do so. Just include ~/.config/appname/something.cfg as the usage example. Alternatively, one could also add std::env::user_config_dir. Personally I wouldn't want that to only check $XDG_CONFIG_HOME on Linux and otherwise just use $HOME/.config (and not use $APPDIR or $HOME/Library/Preferences on Win/Mac) but I imagine others would have differing opinions. If there is consensus, I'd be happy to add user_config_dir to the proposal as well.

As for my non-config use cases, it's mostly reading/writing config/data at a known location used by another app which I do not control. For example, checking the default install directory of a proprietary SDK (~/Developer/PlaydateSDK), pulling from another app's configs to populate my defaults (.gitconfig), or just opening the file picker by default to a friendly location.

In the end it's just an ergonomics thing. What prompted this rabbit hole was finding out an app was broken because it checked multiple locations searching for a file, one of which was under $HOME and this panic'd under Windows (HOME unset). I went to the standard library, found std::env::home_dir, but it was deprecated and broken under Windows. I added the home crate but was bummed that this pulled in windows-sys as a transitive dependency which felt super overkill. So I rolled my own. Hilariously even that was initially broken because I'd typo'd USER_PROFILE instead of USERPROFILE (doh!).

It'd be nice if others, especially developers new to rust, didn't have to go through the same effort and pain. We can have nice things! :D

ChrisDenton commented 1 month ago

But those still require determining $HOME/$USERPROFILE in a platform dependent way.

Note that XDG says nothing about Windows. When porting the spec to Windows it's typical to implement it in terms of LOCALAPPDATA and APPDATA rather than USERPROFILE.

djc commented 1 month ago

How often would you want user_home_dir instead of something more specific like config_dir or cache_dir or user_downloads_dir? I'm afraid that providing only user_home_dir and not any of the others will just lead to programs that use the wrong directories. (E.g. using ~/.myapp/ for config, rather than ~/.config/myapp/ (using e.g. $XDG_CONFIG_HOME).)

@m-ou-se so do you think this proposal would be more likely to be accepted if it also proposed including config_dir() and cache_dir()? That seems like a substantial increase in complexity, but I agree it might be worth it.

(I think user_downloads_dir() is probably much less relevant, but may still be worth it.)

@notpeter would you be interested in helping to drive that forward?

(See also some discussion in https://internals.rust-lang.org/t/pre-rfc-split-cargo-home/19747.)

notpeter commented 1 month ago

@notpeter would you be interested in helping to drive that forward?

@djc I've implemented this on a branch. See: notpeter/env-home@more_dirs

Notes:

Supporting other well-known directories (Downloads, Documents, etc) on Windows requires the platform APIs because these folders may be relocated. This increases complexity, but cache_dir and config_dir (like home_dir) can be done correctly with only environment variables everywhere.

One side benefit: the name user_home_dir felt a little awkward on its own, but if we add user_config_dir and user_cache_dir the naming feels consistent.

djc commented 1 month ago

Basically it's a choice between 'MacOS is unix' or 'MacOS is special like windows' and I chose the former. I don't have a strong opinion here and am open to persuasion.

As a macOS user, I definitely think we should go with the platform-specific behavior here.

notpeter commented 1 month ago

As a macOS user, I definitely think we should go with the platform-specific behavior here.

I've implemented this behavior on an alternate branch. See: notpeter/env-home@more_dirs_mac

Adds MacOS specific behaviors:

joshtriplett commented 1 month ago

The iterations on this are already demonstrating that this gets substantially more complex and bikesheddable the more we extend it.

Could we please just add a function to return the home directory, first, and then consider whether we want the full complexity of defining a cross-platform suite of named directories to store things?

notpeter commented 1 month ago

@joshtriplett I agree. I tried to write as narrow a proposal for user_home_dir as I could for exactly this reason.

Having explored user_{cache,config,etc}_dir a bit, there are design decisions that need to be hashed out. I will try and draft something which explores that space, so for now let's stick to user_home_dir here.

Timmmm commented 1 month ago

I wonder if the concept of a home directory is not even cross platform enough to warrant a generic API. E.g. on iOS it can't return anything sensible. Maybe on Android too, and I'm not sure about Fuchsia.

Maybe it would make more sense to jump straight to platform specific APIs? std::unix::home(), std::windows::appdata() and so on.

I don't know the answer; just wondering.

joshtriplett commented 1 month ago

I think there's a reasonable concept on many platforms, and on those where it doesn't exist, it can return an error.

On Android platforms it would probably be reasonable to return something in the application's directory.

scottmcm commented 1 month ago

The point in https://github.com/rust-lang/libs-team/issues/372#issuecomment-2085766879 really resonates with me -- what can I actually do correctly, in a cross-platform way, by knowing just the home directory? If returning something in the application's directory is reasonable for android, would it maybe make sense to return something under an app-specific AppData in Windows?

Is there a survey of what people want to do with the result of this? The motivation in the OP here is just the circular "Developer wishes determine the current user home directory", but that's only part of the means to a more specific end. (Are they just trying to translate ~ in a path? Or...)

programmerjake commented 1 month ago

maybe look at https://wiki.libsdl.org/SDL3/SDL_Folder and https://wiki.libsdl.org/SDL3/SDL_GetUserFolder for inspiration

Timmmm commented 1 month ago

Is there a survey of what people want to do with the result of this?

Excellent question. We should look on https://grep.app to see how people have used the existing APIs/crates.

notpeter commented 4 weeks ago

Is there a survey of what people want to do with the result of this? -@scottmcm

I spent a few hours spelunking through open source code with Sourcegraph Search (example query) for usage of env::home_dir and the popular crates (home, dirs, directories). This was largely manual and certainly not scientific, but I was able to observe a number of patterns of use:

  1. configuration / app data:
    • file or dir: ~/.appname or ~/.appname/
    • xdg-style ~/.config/appname
    • xdg-proper $XDG_CONFIG_HOME/appname or else ~/.config/appname
  2. external data: other applications' config or data under ~
  3. Converting ~: Make ~ an absolute path or displaying a path starting with ~.
  4. Other less-common usages:
    • Change directory to ~, like cd with no argument.
    • sockets: ~/.appname/app.sock
    • Logs: ~/appname.log
    • XDG-style MANPATH: ~/.local/share/man
    • ~/Downloads, ~/Documents, ~/Pictures etc
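
To make pattern 3 concrete, here is a rough sketch of tilde conversion in both directions. `expand_tilde` and `contract_tilde` are hypothetical names and are not taken from any of the surveyed crates. The home directory is passed in as a parameter, and `~user` forms are deliberately left untouched.

```rust
use std::path::{Path, PathBuf};

/// Expand a leading `~` component using `home` (hypothetical helper).
/// Paths without a leading `~`, and `~user` forms, are returned unchanged.
fn expand_tilde(path: &Path, home: &Path) -> PathBuf {
    // `strip_prefix` matches whole path components, so "~bob" does not match "~".
    match path.strip_prefix("~") {
        Ok(rest) if rest.as_os_str().is_empty() => home.to_path_buf(),
        Ok(rest) => home.join(rest),
        Err(_) => path.to_path_buf(),
    }
}

/// Inverse operation for display purposes: show paths under `home` with a leading `~`.
fn contract_tilde(path: &Path, home: &Path) -> PathBuf {
    match path.strip_prefix(home) {
        Ok(rest) if rest.as_os_str().is_empty() => PathBuf::from("~"),
        Ok(rest) => Path::new("~").join(rest),
        Err(_) => path.to_path_buf(),
    }
}
```

Because the comparison is component-wise, a sibling directory such as /home/alicia is not mistaken for something under /home/alice.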

After seeing hundreds of call sites, I realized there is no "technically correct" platform-specific method for resolving the path of various directories -- because the locations themselves are API via 'well-known-paths'. Historically, this is just ~/.appname: $HOME/.appname on unix and $USERPROFILE/.appname on Windows.

How often would you want user_home_dir instead of something more specific like config_dir or cache_dir or user_downloads_dir? I'm afraid that providing only user_home_dir and not any of the others will just lead to programs that use the wrong directories. (E.g. using ~/.myapp/ for config, rather than ~/.config/myapp/ (using e.g. $XDG_CONFIG_HOME).) - @m-ou-se

I've come around on this. There's nothing wrong with using ~/.myapp, it's just an API design decision. And there's tradeoffs. A huge proportion of software (even cargo!) doesn't use ~/.config/ and may never use os-specific directories.

It's not sexy, but most POSIX software looks for configuration at a statically defined location inside a user's home directory. Accessing the existing clutter of dotfiles and dirs is actually the compelling use-case for user_home_dir. 🙃

fmarier commented 4 weeks ago

Accessing the existing clutter of dotfiles and dirs is actually the compelling use-case for user_home_dir.

To reinforce that point with a concrete example: in safe-rm, I have migrated the config file to the "right" location (i.e. ~/.config/safe-rm). However, this project started well before that convention was a thing and so for backward-compatibility, I also look for the (now-deprecated) legacy config file (~/.safe-rm).

At the moment, I'm looking at $HOME directly in order to avoid the deprecated function, which is obviously not ideal from a portability point of view. I'd love to switch to std::env::user_home_dir().
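
A rough sketch of that lookup order (new location first, legacy dotfile second), given a home directory obtained by whatever means. This is not safe-rm's actual code: `config_candidates` and `find_config` are hypothetical names, and the existence check is injected as a closure so the ordering can be exercised without touching the filesystem.

```rust
use std::path::{Path, PathBuf};

/// Candidate config locations under `home`, in priority order (hypothetical helper):
/// the XDG-style path first, then the legacy dotfile.
fn config_candidates(home: &Path, app: &str) -> Vec<PathBuf> {
    vec![
        home.join(".config").join(app),
        home.join(format!(".{}", app)),
    ]
}

/// Return the first candidate for which `exists` reports true, if any.
fn find_config(home: &Path, app: &str, exists: impl Fn(&Path) -> bool) -> Option<PathBuf> {
    config_candidates(home, app)
        .into_iter()
        .find(|p| exists(p.as_path()))
}
```

In a real program the closure would typically be `|p| p.exists()`, e.g. `find_config(&home, "safe-rm", |p| p.exists())`.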