PowerShell / PowerShell

PowerShell for every system!
https://microsoft.com/PowerShell
MIT License

Provide a platform-agnostic way to determine the temp. directory, e.g., via an automatic variable. #4216

Closed mklement0 closed 4 years ago

mklement0 commented 6 years ago

Related: #3442 and #4215

The various supported platforms have different native ways to determine the directory in which to store temporary files and directories:

Currently, there is no platform-agnostic way to refer to this directory, which is cumbersome.

An automatic variable, say $TEMP, could provide this abstraction.

Environment data

PowerShell Core v6.0.0-beta.3

lukeb1961 commented 6 years ago

what about something like: (New-TemporaryFile).DirectoryName

mklement0 commented 6 years ago

@lukeb1961: That indeed yields the platform-specific temporary directory, but it has an unwanted side effect: New-TemporaryFile invariably creates the file whose [System.IO.FileInfo] representation it outputs.

In other words: every time you execute (New-TemporaryFile).DirectoryName, you leave an unused temporary file behind.

Additionally, it's a computationally expensive approach.
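
A minimal sketch of the side effect described above, and of the manual cleanup it forces on the caller. The variable names are illustrative only; the one assumption is that New-TemporaryFile is available in the session (it ships with PowerShell 5.0 and later).

```powershell
# Capture the file object so the file can be removed afterwards.
$tempFile = New-TemporaryFile
$tempDir  = $tempFile.DirectoryName   # the platform-specific temp directory

# The side effect: an empty file now exists on disk.
$existedAfterCall = Test-Path -LiteralPath $tempFile.FullName

# Remove it explicitly to avoid leaving debris behind.
Remove-Item -LiteralPath $tempFile.FullName
$existsAfterCleanup = Test-Path -LiteralPath $tempFile.FullName
```

Compared with the one-liner (New-TemporaryFile).DirectoryName, this keeps a reference to the file, which is the only way to delete it reliably afterwards.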

iSazonov commented 6 years ago

We had discussed this briefly. Thank you for opening the issue.

Internally we already have "TemporaryDirectory", just "GetTemporaryDirectory()" and "CreateTemporaryDirectory()". We could make it public.

Liturgist commented 6 years ago

How about going with $Env:TMP on all platforms? That is not quite like Windows, Mac, or Linux. No turf war.

mklement0 commented 6 years ago

@Liturgist: $Env:TMP is not an option, because it is ill-advised to either (a) have PowerShell automatically define environment variables or (b) pretend that non-existent environment variables exist.

An automatic PowerShell variable such as $TMP is an option, however, although the thing to keep in mind is that retroactively introducing automatic variables is technically a breaking change.

SteveL-MSFT commented 6 years ago

If we went the variable route, it would have to be something like $interop:TEMP to avoid name collision. A variable is probably easier to use in scripts than an API although we can certainly support both.

In https://github.com/PowerShell/PowerShell/issues/3571 I suggested other variables in addition to TEMP

mklement0 commented 6 years ago

@SteveL-MSFT:

Good point; an $interop: namespace is indeed a way to avoid collisions.

If we take a step back: If we want PowerShell to become the lingua franca of the shell world, do we really want to frame things as "interop"?

How about trusting PowerShell's automatic variables to Do the Right Thing(TM) generically, which would call for a namespace such as $ps:?

SteveL-MSFT commented 6 years ago

@mklement0 the $ps: prefix/namespace has been proposed before. It would be nice if we had used that from the beginning :)

Personally, I prefer $ps: for exactly your reasoning. Perhaps this can be a discussion point at the next Community Call.

cc @PowerShell/powershell-committee

mklement0 commented 6 years ago

I'm glad to hear it, @SteveL-MSFT.

On a related note, I think a separate namespace for all preference variables would be beneficial as well, such as $pspref:

The caveat is that even though collisions are far less likely than with unqualified variable names, someone could have done something like the following:

# Define custom drive 'ps:'
New-PSDrive 'ps' FileSystem '/tmp'; Get-ChildItem ps:

iSazonov commented 6 years ago

Why can't we use [System.IO.Path]::GetTempPath()?

https://github.com/dotnet/coreclr/blob/6ba74dc2a7194f8d6c86c3aeab572a074ef645c8/src/mscorlib/shared/System/IO/Path.Unix.cs#L153 https://github.com/dotnet/coreclr/blob/6ba74dc2a7194f8d6c86c3aeab572a074ef645c8/src/mscorlib/shared/System/IO/Path.Windows.cs#L94

mklement0 commented 6 years ago

@iSazonov:

That works, but it is both hard to remember and cumbersome to type. Good to know as a workaround, though.
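
For reference, a short sketch of the workaround in use. The file name 'example-scratch.txt' is a made-up placeholder; only the .NET method and Join-Path are existing APIs.

```powershell
# Ask .NET for the platform's temp directory (no file is created).
$tempDir = [System.IO.Path]::GetTempPath()

# Build a path inside it; Join-Path copes with the trailing
# separator that GetTempPath() typically returns.
$scratchFile = Join-Path $tempDir 'example-scratch.txt'   # hypothetical file name
```

Nothing is written to disk here; the snippet only computes paths.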

iSazonov commented 6 years ago

If we have New-TemporaryFile, users will expect Get-TemporaryPath or Get-TemporaryDirectory.

mklement0 commented 6 years ago

@iSazonov:

Since the term path is ambiguous (even though it is used in the .NET API to refer to a directory path in this case), my preference would be Get-TemporaryDirectory, and I certainly wouldn't mind having such a cmdlet.

That said, a cmdlet is a bit heavy-handed for something that returns a single piece of information with no variations in functionality.

Consider the existing Get-Host / $Host duo: Do you ever find yourself using Get-Host instead of $Host?
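
To make the cmdlet option concrete, here is a rough sketch of what such a command could look like if written as a script function. Get-TemporaryDirectory is not an existing PowerShell command; the name and the choice to return a DirectoryInfo object are assumptions made for illustration.

```powershell
# Hypothetical function; not part of PowerShell.
function Get-TemporaryDirectory {
    [CmdletBinding()]
    [OutputType([System.IO.DirectoryInfo])]
    param()

    # Wrap the .NET call and return a rich object rather than a bare string.
    Get-Item -LiteralPath ([System.IO.Path]::GetTempPath())
}

$tempDirInfo = Get-TemporaryDirectory
```

With this in place, (Get-TemporaryDirectory).FullName would play the role that a variable such as $TEMP would play in the variable-based proposal.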

SteveL-MSFT commented 6 years ago

$ps:Temp would presumably return a static location to put temp stuff. Get-TemporaryDirectory would presumably create a new folder under $ps:temp. Personally, I think having a temp: PS drive may be more useful. Perhaps cleaned up automatically on process exit.

iSazonov commented 6 years ago

temp: - good idea. If we accept it we should have cmdlets for it. And I believe we can enhance *-PSDrive cmdlets.

temp: - initialized to [io.path]::GetTempPath().

We could optionally make temp: per-runspace.
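
The drive idea can be approximated today with the existing New-PSDrive cmdlet. This is only a sketch: the drive name 'ScratchTemp' is a made-up stand-in for temp:, chosen so that it is unlikely to collide with a drive the session already defines, and the drive is visible only in the scope that creates it.

```powershell
$driveName = 'ScratchTemp'   # hypothetical name standing in for 'temp'

# Create the drive only if no drive of that name exists yet.
if (-not (Get-PSDrive -Name $driveName -ErrorAction SilentlyContinue)) {
    $null = New-PSDrive -Name $driveName -PSProvider FileSystem -Root ([System.IO.Path]::GetTempPath())
}

# Paths can now be written drive-relative, e.g. ScratchTemp:/some-file.txt
$driveIsUsable = Test-Path -Path "${driveName}:/"
```

Automatic cleanup on process exit, as suggested above, is not part of this sketch.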

dragonwolf83 commented 6 years ago

I would prefer a consistent way of mapping environment variables that are basically the same just expressed differently. Are we saying we will map all path variables to drives, like Home:\?

Liturgist commented 6 years ago

@mklement0 - I am not quite sure why Env:TMP is a bad idea. With regard to a) PowerShell creates Env:PSModulePath. With regard to b) I do not know which variables PowerShell would be pretending about. I would like to understand.

dragonwolf83 commented 6 years ago

@Liturgist I think the notion is that anything in the $Env: namespace should only be actual environment variables on that platform. It also would be very confusing to have $Env:TEMP and $Env:TMP be defined at the same time. How do I know which one is the one to use to run on Linux, OSX, and Windows just from looking at it?

The proposals for a separate namespace or drive eliminate that issue, as they allow PowerShell to have its own commonly defined variables that can be used cross-platform without blocking access to the real environment variables.

Another idea for a namespace, though, could be $PSEnv:. It has an implied meaning that anything in that namespace comes from PowerShell and not the system. Then you can cleanly map all the common environment variables, which makes it easier to see what is available to use from IntelliSense.

Liturgist commented 6 years ago

@dragonwolf83 - Would Env:PSModulePath be moved to PSEnv:PSModulePath? I fully agree that there should not be multiple variables such as Env:TEMP and Env:TMP. There should be one good one and let's go with that. Having PSEnv:TEMP or PSEnv:TMP would be fine.

To go with New-TemporaryFile it would be helpful to have New-TemporaryDirectory. To go with this, having a -Path parameter on New-TemporaryFile to specify the directory in which to create the temporary file would be helpful.

dragonwolf83 commented 6 years ago

No, I don't think the PowerShell team would move $Env:PSModulePath for three reasons.

  1. It would be a breaking change that doesn't add any value
  2. It is a real Environment Variable so it would make no sense to move it.
  3. There is no need to force a mapping of this environment variable to work cross-platform. It is something PowerShell can add to Environment Variables when installing PowerShell because they own this variable.

In other words, it exists exactly where it belongs.

Take a look at the screenshot below. It shows the Environment Variables defined on my Windows system. It has TEMP and TMP as environment variables for my user Temp path. It also has PSModulePath. So all of those other variables, I would expect to see in $Env: namespace. You can go into cmd and see that all of those variables, including PSModulePath exist there too.

[Screenshot: Windows environment variables, including TEMP, TMP, and PSModulePath]

mklement0 commented 6 years ago

@Liturgist:

To add to @dragonwolf83's helpful comments and to respond to your comment addressed to me:

I do not know which variables PowerShell would be pretending about. I would like to understand.

My perhaps glaringly-obvious-didn't-need-to-be-said point was that in the absence of PowerShell actually defining such an environment variable (which, as stated, would be ill-advised - also see below), PowerShell shouldn't pretend that it exists.


PowerShell respects / defines only two environment variables, according to Get-Help about_Environment_Variables:

These variables are environment variables for a good reason: they allow you to control PowerShell's startup behavior externally.

By contrast, there is NO good reason to define the directory for temporary files and directories as an environment variable:

If a reader is familiar primarily with cmd.exe, it is worth pointing out that cmd.exe - unlike PowerShell and POSIX-like shells on Unix - knows only environment variables: any variable you define in cmd.exe is implicitly also an environment variable, whereas PowerShell and POSIX-like shells distinguish between session-specific-only shell variables (e.g., $foo in PowerShell) and explicitly-designated-as-such environment variables (e.g., $env:foo in PowerShell).


I definitely like the idea of multiple namespaces for PowerShell's automatic variables (such as the suggested $PSPref: for preference variables), so putting the directory for temporary files in $PSEnv: is a good idea, though we'd have to make clear that - despite the presence of the word 'Env' - these aren't actual environment variables in the established, system sense (accessible to all child processes, irrespective of what executable is being run).

SteveL-MSFT commented 6 years ago

Seems like there's been enough good discussion that perhaps this should be a RFC?

iSazonov commented 6 years ago

I definitely like the idea of multiple namespaces for PowerShell's automatic variables

In [io.path]::GetTempPath(), the [io.path] part is a namespace - should we introduce PowerShell-"native" namespaces like $PSEnv:?

mklement0 commented 6 years ago

@SteveL-MSFT:

Before we tackle an RFC, let me try to summarize what I perceive to be the use cases and perhaps get a better understanding of the scope of such an RFC:

As for the introduction of PowerShell-controlled namespaces:

Note that the current namespace notation - e.g., $env:PATH is actually based on having an underlying PS drive: all currently supported $<prefix>: "prefixes" refer to drives of the same name; e.g., $env:HOME is the same as (Get-Item Env:/Home).Value.
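
The equivalence between namespace notation and drive access can be checked directly. The sketch below uses PATH and PID because they should exist on every platform; the variable names on the left are illustrative.

```powershell
# Namespace notation versus explicit access through the Env: drive.
$viaNamespace = $env:PATH
$viaDrive     = (Get-Item -LiteralPath Env:PATH).Value
$sameValue    = $viaNamespace -eq $viaDrive

# The same pattern holds for the Variable: drive.
$pidViaNamespace = $variable:PID
$pidViaDrive     = (Get-Item -LiteralPath Variable:PID).Value
```

Both pairs should hold the same values, which is what makes a drive-less $PSEnv: namespace a departure from the existing model.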

If we now introduce namespaces such as $PSEnv: without an underlying drive - and an underlying drive would arguably be overkill - we'll be departing from that model.
I personally don't think it's a problem, but it should be a conscious, documented decision.


So what should an RFC cover?

I could tackle the former, but I'm not sure I can come up with a sensible approach to the latter (and it may not be worth doing).

SteveL-MSFT commented 6 years ago

@mklement0 thanks for summarizing all the different related discussions. I think just scoping the Motivation section of the RFC to the first item is sufficient. You can explicitly state that temp: is out of scope for the RFC.

Liturgist commented 6 years ago

@mklement0 - Very nice and helpful summary.

I would much rather have New-TemporaryDirectory over adding a -Directory switch to New-TemporaryFile. New-TemporaryDirectory is more obvious and discoverable. If both are not possible, I vote for New-TemporaryDirectory.

mklement0 commented 6 years ago

Thank you, @Liturgist.

I hear you re New-TemporaryDirectory (though there's the proud and confusing Unix tradition of calling any filesystem item a "file"....).

On the opposite end of the spectrum, you could argue (as someone already has, but I forget where) that there should be just one generic New-TemporaryItem cmdlet with -Type File and -Type Directory parameters (conceivably, each drive provider could implement their own temporary locations, though among the ones that ship with PowerShell, it really only makes sense for the FileSystem provider).
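
As an illustration of the single-cmdlet idea, here is a sketch limited to the FileSystem provider. New-TemporaryItem does not exist in PowerShell; the function name, the -Type parameter, and the use of a random name under GetTempPath() are all assumptions.

```powershell
# Hypothetical function; not part of PowerShell.
function New-TemporaryItem {
    [CmdletBinding()]
    param(
        # Kind of temporary item to create.
        [ValidateSet('File', 'Directory')]
        [string] $Type = 'File'
    )

    # Random, collision-resistant name inside the platform's temp directory.
    $name = [System.IO.Path]::GetRandomFileName()
    $path = Join-Path ([System.IO.Path]::GetTempPath()) $name

    # New-Item outputs the FileInfo or DirectoryInfo it created.
    New-Item -Path $path -ItemType $Type
}
```

Calling New-TemporaryItem -Type Directory would then cover the New-TemporaryDirectory use case, and the caller remains responsible for removing whatever was created.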

Either way, I suggest you make your voice heard at #3442.


@SteveL-MSFT: Thank you for the guidance, but despite my initial willingness to tackle this RFC, I'd like to bow out of this [self-]assignment.
I do hope someone else will take it on.

joeyaiello commented 6 years ago

After some discussion, @PowerShell/powershell-committee is leaning towards a cmdlet rather than a new variable namespace in order to provide better discoverability, tab-completion, etc.

Right now, the .NET static method works for doing this, but we want to eventually establish a better pattern with cmdlets. I foresee that these would live in a "compatibility" module, shipping on the Gallery, that works on 3/4/5, and that might eventually ship as part of 6.x.

We may have an internal implementation of these sorts of things floating around. Let me meet with some other MSFTies and circle back. For now, though, I think this should be moved out to 6.1.0

joeyaiello commented 6 years ago

Also, this is exactly the class of problems intended to be addressed by https://github.com/PowerShell/PowerShell-RFC/blob/master/1-Draft/RFC0019-PowerShell-Core-Interop-Module.md (written by @darwinJS). It would be awesome if more folks could provide feedback at PowerShell/PowerShell-RFC#68

Liturgist commented 6 years ago

@mklement0 - Thank you for your kind reply. If the cmdlet route is desired, why not add a -Temporary switch to New-Item? I will look for #3442.

mklement0 commented 6 years ago

@joeyaiello: Thanks for the pointer to the RFC and the attendant discussion.

leaning towards a cmdlet rather than a new variable namespace

I don't think the two are mutually exclusive.

To give an example in the context of this specific discussion:

It's both handy to have cmdlets that shield you from having to know a platform's location for temp. files (such as New-TemporaryFile) and, on occasion, to have the ability to determine that location explicitly (e.g., $ps:TEMP).
(Also, at least with respect to tab completion $ps: and $pspref: should work fine.)

@Liturgist: I like the idea. New-TemporaryFile has always felt like a kludge to me.

mklement0 commented 6 years ago

P.S., as an aside: While the ability to call .NET methods directly is always a great option to have (one that sets PowerShell apart from other shells), it should only be necessary for unusual scenarios, given the different syntax, the advanced knowledge it requires and that there be dragons.

mklement0 commented 6 years ago

P.P.S (last one - I pinkie-swear, @joeyaiello):

I earlier said that introducing namespaces such as $ps: and $pspref: without an underlying drive would be a departure from current namespace notation, suggesting that introducing such drives may be too heavy-handed.

On the flip side, if these new namespaces were backed by a drive, it would address the discovery concerns, because you could then run commands such as Get-ChildItem ps: and Get-ChildItem pspref: to discover all such variables.

DarwinJS commented 6 years ago

@mklement0 - another approach is a common prepend. I've done this to identify all variables that are part of my template code in a dump. Then you can do:

gci variable:PREPEND* 

A prepend would also avoid the necessary step of discovering a third namespace for variable data (after variable: and env: ) - hopefully before you go to the work of creating your own variables.

DarwinJS commented 6 years ago

We should probably move this conversation to the actual RFC so as not to lose these thoughts.

iSazonov commented 6 years ago

I'd prefer $psvar over $pspref. It would also help with discovering descriptions, e.g., Get-Help $psvar:pstableversion, and with listing all system variables, e.g., Get-Variable -ListPowerShellEngineVariables.

iSazonov commented 6 years ago

With regard to temp: Why have we not discussed common scenarios? What is a temp file for? What is a temp directory for? Why must a script writer think about it? Can we eliminate that need? If we use temporary directories explicitly, we have to join/parse paths. Can we avoid this and use temporary directories implicitly?

Scenario 1:

  1. Create a temp file.
  2. Use the file.
  3. Forget the file - we expect the system to remove the file. If we look in %temp% or %WINDIR%\temp, we see tons of undeleted files.

Conclusion 1: Explicit temp directory not needed. We should add an enhancement to remove our temp files.

Scenario 2:

  1. Create a temp directory for a scope (block/script/module/class instance).
  2. Create a temp file in the temp directory.
  3. Use the file.
  4. Forget the file and the directory - we expect the system to remove the file and the directory.

Conclusion 2: Explicit temp directory not needed. We should add an enhancement to remove our temp directories.

Scenario 3:

  1. Create a temp directory
  2. Extract an archive in the temp directory. Here we have to use the temp prefix.
  3. Process the files.
  4. Forget the files and the directory - we expect the system to remove the files and the directory.

Conclusion 3: Explicit temp directory possibly needed. We should add an enhancement to remove our temp directories.

Common conclusion.

  1. We should clean up our temp files/directories. One solution: the PowerShell application creates a subdirectory in %temp% named ps-{New-GUID}, creates temp files/directories in it, and removes the directory on termination.
  2. Only Scenario 3 requires an explicit temp directory and joining paths with a temp prefix. Should we address that scenario? If yes, a Pester-like solution looks great: we could build on the *-Drive cmdlets to create scoped temp directories under %temp% and use them as a temp prefix. It would also open the way to scoped temps.

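
The first point of the common conclusion can be approximated in script form. This is a sketch under assumptions, not how PowerShell itself behaves: the ps-<GUID> naming follows the suggestion above, the file name 'work.tmp' is made up, and cleanup relies on the PowerShell.Exiting engine event, which does not fire if the process is killed abruptly.

```powershell
# Per-session scratch directory named ps-<GUID> under the platform temp directory.
$sessionTemp = Join-Path ([System.IO.Path]::GetTempPath()) ('ps-{0}' -f (New-Guid))
$null = New-Item -Path $sessionTemp -ItemType Directory

# Best-effort cleanup when the engine exits. The path is handed to the
# action block via -MessageData, since the block runs in its own scope.
$null = Register-EngineEvent -SourceIdentifier PowerShell.Exiting -MessageData $sessionTemp -Action {
    Remove-Item -LiteralPath $Event.MessageData -Recurse -Force -ErrorAction SilentlyContinue
}

# Temp files created inside the session directory are removed along with it.
$scratch = New-Item -Path (Join-Path $sessionTemp 'work.tmp') -ItemType File   # hypothetical file name
```

A script using this pattern never has to delete individual temp files; it only has to create them under $sessionTemp.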
DarwinJS commented 6 years ago

Just a couple comments.

This thread has gotten far afield of the original request which is simply to have a platform agnostic method for determining the TEMP folder. Perhaps discussions about creating ephemeral temporary files and folders should be another RFC?

Second, here is the code I use for creating the TEMP variable and COMPUTERNAME variables on PowerShell Core. So far all my examples are for making the windows abstractions present on linux because I have a lot of code written for Windows.

Also, to know which variables to port across I basically try to judge what is VERY common in my scripts and exists as an entity on the linux side (e.g. hostname, temp), but is simply not exposed in the way my script expects.

In my mind the idea behind compatibility would be to facilitate adoption through [a] reuse of existing code, [b] ability to reuse familiar abstractions and [c] ability to write cross platform scripts.

I hope that [a] would not get overlooked with an overly greenfields perspective as it is this one that I've used the most.

If ((Test-Path variable:IsWindows) -AND !$IsWindows)
{
  If (Test-Path '/tmp')
  {
    write-output 'running on PowerShell Core, setting up TEMP environment variable'
    $env:temp = '/tmp'
    $env:computername = hostname
    $env:computername = ($env:computername).split('.')[0]
  }
  Else
  {
    Throw "Cannot find standard temp folder '/tmp' on non-windows platform."
  }
}

mklement0 commented 6 years ago

@DarwinJS: Yes, the thrust of your RFC is quite different, so I agree that what the discussion here drifted toward (despite its initial focus being just one piece of environment information) should be its own RFC.

In short:

The two approaches are not mutually exclusive.

My only concern with your specific approach is that you're creating environment variables on Unix platforms, which are seen by all child processes, even though the variables are only used PowerShell-internally (to restate a concern expressed in an earlier comment).

DarwinJS commented 6 years ago

I'm not sure I understand why it matters if child processes don't recognize environment variables they don't use - it's pretty standard fare for software and scripts to define custom environment variables that child processes - including operating system binaries - will ignore.

Overall I think it will help adoption if there is NOT a new data type within PowerShell to do basic things like identify the disk temp location. "Environment Variables" and "Language Variables" seem normal and familiar to both sides of the house.

If a namespace is decided on, I would not use "psenv" If you have to turn around and unexplain the universally obvious implication of the string "env" - it seems like a learning anti-pattern. Anytime disambiguation can be incorporated at the initial naming, it saves billions of verbal and written repetitions of the following qualifier:

"Despite the implication of the name 'psenv', it has nothing to do with your operating system environment variables."

mklement0 commented 6 years ago

I realize that your polyfill hinges on defining environment variables, and given that neither TEMP nor COMPUTERNAME are POSIX-defined environment variables, and given that we're talking about an opt-in module, that's probably fine.

More generally, the concern is not about child processes ignoring environment variables, but about naming conflicts: different utilities using an environment variable of the same name for different purposes, unbeknownst to each other.

The namespace of environment variables is shared and not constrained in any way (except syntactically), and the only way to avoid collisions is by adopting naming conventions - a prefix such as PS being one way - but not a foolproof way.

Therefore, the more robust approach is to avoid use of environment variables altogether, if they're not strictly needed - which is the case here (each session can derive the values from the existing environment).

As for introducing what you call a new data type: Familiarity is great and helps those familiar with one old way of doing things, but it's not a great foundation for a shell aspiring to be inherently multi-platform - and especially a shell whose success is owed to doing things in a new, superior way.

To write multi-platform scripts, users need abstractions they can rely on, and, conversely, not using those abstractions will inform them that they're venturing into platform-specific territory (which may be perfectly fine, depending on the use case).

As for the namespace psenv: To me, the prefix ps suggests that it refers to a PowerShell-scoped environment, so it builds on the established meaning of env - which will obviously continue to exist as the namespace for bona fide environment variables.
That said, names are negotiable.

DarwinJS commented 6 years ago

Good points - I'm sure TEMP is ripe for conflict - so by being an optional "Win to Linux" compat module, script testing would hopefully reveal that.

Familiarity is not the only value in so-called "old" ways - they also have the most ubiquitous implementation. And "new" ways can be over-engineered - just last week I helped a colleague replace 13 lines of code they received from a large company with one line. The new code used the exceptionally over-complex and incompletely implemented "*ScheduledJob" cmdlets. If new implementations get overly focused on "design purity" they can be a step backwards. I'm not saying that is the case with proposing a new namespace - just that if "adoptability" is not at the forefront for a new implementation, it can end up being shelfware.

Another consideration with new name space is that if it is not back ported to PSH 5, then it will not be ubiquitous for quite some time. I'll still need to "figure out" if I can rely on the namespace for the code I write that needs to be PSH 5 / PSH 6 + multiplatform. Then I will need to custom code something for PSH 5. Actually, Windows PSH 5 is probably the dominant use case for the foreseeable future.

But maybe this would just guide the implementation toward a module so that those that wish can run it on versions older than 6?

iSazonov commented 6 years ago

Our discussion has split into three main directions:

  1. Implementing backward compatibility and portability of existing Windows scripts using TEMP.
  2. Creating new abstractions for TEMP to write multi-platform scripts.
  3. Improving the discoverability of PowerShell variables. #4394

@mklement0 Could you please open new issue(s) (for 1. and 2.) to summarize the discussion, and close this issue?

mklement0 commented 6 years ago

@iSazonov:

Re 1.: This is covered by @DarwinJS's existing RFC. Re 2: Before we resurrect that, I suggest we get clarity on the overall approach proposed in #4394.

@DarwinJS:

Another consideration with new name space is that if it is not back ported to PSH 5, then it will not be ubiquitous for quite some time.

I hear you, but keeping track of which feature was introduced when is an unavoidable challenge if you want to innovate. A back-ported module may indeed help, however.

I suggest we continue the conversation at #4394.

essentialexch commented 5 years ago

$env:tmp or $env:temp is the predominant use case today.

Promoting this to other platforms is trivial. Whether you consider it interop or legacy is irrelevant. It supports the predominance of all existing scripts.

Jaykul commented 5 years ago

Short answer: [IO.Path]::GetTempPath() is the ONLY answer.

The idea that we need to protect people from learning about .NET is, frankly, silly. That static method is backwards compatible all the way to Windows PowerShell 1.0 and no matter how many improvements we add in the future, it's still the one and only correct answer to the original question today (over a year after it was asked and answered).

However, I like the idea that we should make this easier in the future, even though anything we come up with won't be reliably usable for years, because PowerShell Core 6 and 6.1 have already shipped without it.

Of all the ideas expressed above, the only idea I like (and I love it) is the idea of creating a Temp: drive and pointing it at (a new folder named after a GUID and/or date in) the location returned by [IO.Path]::GetTempPath() (and maybe cleaning it up when PowerShell exits). Creating a Temp drive is something we can easily do in both PowerShell Core and in a compatibility module for already released versions. It's very unlikely to cause a problem (it's just a question of whether anyone is likely to have created a Temp: drive) and if it does, fixing it in the script where that problem occurs would be trivial.

Obviously we could also add New-TempDirectory to the New-TempFile command -- but that should be done in an external module to start with so that it can be inherently cross-platform and backwards compatible.

LawrenceHwang commented 5 years ago

While I like the simplicity and consistency (a lot) with [IO.Path]::GetTempPath(), it is imperative to enable a PowerShell-Style option for PowerShell users. For this case, I dig creating a Temp: drive with potentially related command.

As a PowerShell user, I have always appreciated PowerShell being closer to meaningful, plain English. NOT providing that option and implying that people should learn .NET is going to erode people's trust.

essentialexch commented 5 years ago

Requiring an admin-scripter to use [IO.Path]::GetTempPath() is obscene. It's not discoverable, it's not obvious, and it's not powershell.

I personally believe that promoting $env:tmp is the right answer, but someone could argue it's also not powershell. After some personal consideration, I think a new cmdlet Get-TemporaryDirectory (or a similar name) is the proper answer.

I don't understand "New-TempDirectory". The normal use-case is not creating a directory (which New- indicates) - it's retrieving the name of an existing directory (which Get- indicates).

I'm "meh" about a temp: drive because, for it to be discoverable, it also requires one or more cmdlets. Instead of offering more "freedom" to the scripter, a temp: drive feels constraining.

iSazonov commented 5 years ago

Another angle, to be completely PowerShell-native: can we avoid the explicit use of temporary directories?

# Create the file in temporary directory
$temp= New-Item -Path xyz.txt -Temporary

# Get the file and copy it to temporary directory
$var2 = Get-Item -Path sample.txt; $tempVar2 = Copy-Item $var2 -Temporary

# The temporary directory is automatically created and dropped for the scope

DarwinJS commented 5 years ago

I feel it should simply abstract away the platform differences, preferring the references of the longest-standing PowerShell platform so that long-standing code can be reused more easily.

Like this:

If (!(Test-Path env:temp))
{
  # Prefer TMPDIR if the platform defines it; otherwise fall back to /var/tmp.
  If (Test-Path env:tmpdir) { $env:TEMP = $env:tmpdir }
  ElseIf (Test-Path '/var/tmp') { $env:TEMP = '/var/tmp' }
}