Closed colejohnson66 closed 4 days ago
The same result:
-join ([byte[]][char]0x200f)
PowerShell allows passing [char]
to [byte]
-typed parameters, and therefore also to [byte[]]
-typed parameters.
In the case of [byte[]]
, the unexpected array-allocation behavior occurs, namely if the code point of the character is > 0xFF (255), i.e. doesn't fit into a single [byte]
.
The bug also surfaces with PowerShell-native commands - even with casts, as @237dmitry as discovered - so the simplest repro with a parameter is:
# OK -> 1, 255
# That is: One byte with value 255
& { param([byte[]] $bytes) $bytes.Length; Write-Host $bytes } ([char] 255)
# !! BROKEN, due to the character's code point being > 255 (0xFF)
# !! -> 256(!) bytes with value 0
# !! That is, values >= 256 are mistakenly interpreted as the size of an array of 0 bytes to allocated.
& { param([byte[]] $bytes) $bytes.Length; Write-Host $bytes } ([char] 256)
By contrast, note that the behavior is correct in the following cases, resulting in a statement-terminating error with message "Value was either too large or too small for an unsigned byte" if [char]
code point or number > 255 is passed:
with [char]
for [byte]
-typed parameters (non-arrays), and if you pass an array of [char]
values to a [byte[]]
-typed parameter
with any instance of a numeric type, both in the [byte]
and [byte[]]
cases
The same result:
-join ([byte[]][char]0x200f)
Well that's plain odd. Granted most assumptions about chars converting to 7 or 8 bits have been broken since unicode arrived, but even so it's madly inconsistent.
> ([char]8364)
€
>$x = ([char]8364)
> [int[]]$x
8364
> ([byte[]]$x).count # it's that number of zeros
8364
> ([byte[]]'€').count # String literal fails
InvalidArgument: Cannot convert value "€" to type "System.Byte[]".
([byte[]][char]'€').count # String cast to a char (same as $x) gives that number of zeros
8364
> ( [byte[]]::new(8364) ).count # which is a bit like new() for a an array of byte
8364
( [byte[]]::new([char]'€') ).Count # and works with a char
8364
> ( [byte[]]::new('€') ).Count # but not with a string literal
MethodException: Cannot convert argument "0", with value: "€", for ".ctor" to type "System.Int32":
([byte[]][char[]]'€') # Single char gives that number of zeros, but array fails
InvalidArgument: Cannot convert value "€" to type "System.Byte".
> ([byte[]]8364) # and single int fails
InvalidArgument: Cannot convert value "8364" to type "System.Byte[]".
One of the conversion paths PowerShell utilizes is trying to pass the conversion subject as the first argument to a constructor of the target type.
So while it's definitely surprising, and one of the more wacky examples, it does fit within PowerShell's conversion logic. It ends up generating something sorta close to this:
[Convert]::HexString([byte[]]::new([int][char]0x200F))
Interesting, so[char]
is allowed to bind to any numeric parameter, not just [byte]
.
I suspect the ship has sailed, but:
In general:
While supporting implicit [char]
-to-[byte]
conversion is understandable, I wonder what the benefit is of allowing something like the following, especially given that use of [char]
requires a deliberate act in PowerShell:
# -> 8364
& { param([double] $foo) $foo } ([char] 0x20ac)
As for the specific case at hand:
You could argue that an array-typed parameter should never attempt a constructor call like that. In fact, it is sensibly not attempted when you pass a number directly:
# -> "Cannot convert value "8364" to type "System.Byte".
# Error: "Value was either too large or too small for an unsigned byte.
& { param([byte[]] $foo) } 0x20ac
In other words: an array-typed parameter that isn't given an argument of the exact type should only ever:
An inability to convert the argument / at least one argument element should result in a (statement-terminating) error.
Interesting, so
[char]
is allowed to bind to any numeric parameter, not just[byte]
.
It's more just that char
> int
is a valid conversion path. Just like how string
can also convert to int, so this does the same:
[byte[]]'10000'
But direct answer, the conversion path for char
> int
is a very common way of getting the numeric value for a character. Might actually be my most used conversion interactively. Double is less clear of a case though for sure.
char
>int
is a very common way of getting the numeric value for a character.
Good point.
However, I think the point about the inappropriate array-constructor call in the case at hand is still worth addressing. There's no benefit to the current behavior, and it is only likely to cause confusion.
However, I think the point about the inappropriate array-constructor call in the case at hand is still worth addressing. There's no benefit to the current behavior, and it is only likely to cause confusion.
So to be clear it's a generic conversion path, not just for array types. For instance [IO.FileInfo]'some path'
only works because of the "try passing it as an arg to a constructor" conversion path.
I'll definitely grant you that in a vacuum there's no benefit to the specifics of how that conversion plays out, but that's an inevitability with how permissive the system is.
Understood about the generic conversion path, @SeeminglyScience, but I'm still baffled by the following discrepancy:
# Sensibly FAILS - 0x20ac is only interpreted as [byte] - no fallback to constructor.
[byte[]] 0x20ac
# !! Unexpectedly equivalent to [byte[]]::new(0x20ac)
[byte[]] [char] 0x20ac
Note that the latter seemingly works only with [char]
, whereas any numeric cast - e.g. [byte[]] [long] 0x20ac
- sensibly fails too.
Ahh okay so there's an existing exclusion in that rule for numeric types that gets circumvented when a non-numeric type could be cast as one. That's sort of the issue with hard coding exclusions (or adding to existing exclusions) the rules get really hard to reason about.
I'll open it up to engine to discuss, but either way I think the implementation would make it challenge without just hard coding a list of every type that is convertible to int.
The Engine WG discussed this and agree that the behavior is confusing, but decided that adding additional hard coded exclusions here is not worth the extra potential confusion and/or breaking changes.
This issue has been marked as won't fix and has not had any activity for 1 day. It has been closed for housekeeping purposes.
📣 Hey @colejohnson66, how did we do? We would love to hear your feedback with the link below! 🗣️
🔗 https://aka.ms/PSRepoFeedback
Prerequisites
Steps to reproduce
If one invokes a method that takes an array, passing a numeric type should fail. Instead, an array of that length is implicitly allocated and passed in.
For this issue,
Convert.ToHexString
has three overloads:byte[]
,byte[]
plus offset+length, andReadOnlySpan<byte>
. When called with a numeric type (such as a character), PowerShell implicitly casts the numeric value to an array instead of failing.See: dotnet/runtime#101757
Expected behavior
Actual behavior
Specifically, 16414 zeros (or
0x200F*2
) are printed.Environment data