dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License

Allow 'uint', 'nuint' (and possibly 'nint') for 'stackalloc' statements #44535

Open john-h-k opened 4 years ago

john-h-k commented 4 years ago

Currently the only supported size operand of a stackalloc T[size] is an int. This is frustrating, particularly in interop scenarios where uint and nuint are your primary types. stackalloc takes an unsigned integer in IL anyway, so functionally there will be no difference
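For illustration, a minimal sketch of the friction being described (the method name and parameter are hypothetical, chosen to mimic an interop scenario where lengths arrive as unsigned values):

```csharp
using System;

class Example
{
    // Hypothetical interop-style method: the length arrives as a uint.
    static void ReadPacket(uint length)
    {
        // Today this cast is mandatory, even though localloc in IL
        // treats the size operand as unsigned anyway:
        Span<byte> buffer = stackalloc byte[(int)length];

        // What this issue asks to allow instead:
        // Span<byte> buffer = stackalloc byte[length];
    }
}
```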

333fred commented 4 years ago

@cston moving this to roslyn because it seems like a bug, not a feature request. Was it intentional?

YairHalberstadt commented 4 years ago

Given that using variable-length stackalloc is usually advised against (see https://vcsjones.dev/2020/02/24/stackalloc/; Linus Torvalds banned VLAs from Linux for similar reasons), making them easier to use is probably a non-goal?

john-h-k commented 4 years ago

Given that using variable length stackalloc is usually advised against (see https://vcsjones.dev/2020/02/24/stackalloc/

I mean, simply validate your input; that is what I do.

YairHalberstadt commented 4 years ago

A) If you validate your input, then converting to an int should be trivial. B) Validating your input effectively means limiting the maximum size of your array to a small constant, in which case it's not necessarily any faster to make it variable-length, since VLAs have runtime overhead.

tannergooding commented 4 years ago

For reference: https://sharplab.io/#v2:EYLgtghgzgLgpgJwDQxASwDYB8ACAmARgFgAoHAZgAIBXAOyggDM5L9KBhSgb1Mr9ao4CANkppaMAFSUAsgApxMSgA8AlF0q9+2xdIiUAvJVgQAxgGsIGDAHtTYiQG1lAXQDcW7XxwB2ShA8SbQBfT00g/gpWEQcpWTlqRRV1cK9+XX9DYxgzS2s7WOd3MO1ff0CQsLCooVEM+VoktQ0S9Ik9LJMLK1t7RSKKtNY/ALDQiL5qwRj6uVpEiWSWia8M/SMuvN7C10G0stGV8e0p6LrFgDkFdv8kWOTDAD5/AaqVmpnL67iIO4WlNRPF67N6labnJRXNZ3RqLQEGZ4QV4rE7vcH3KE3X6UeZNVRApEglanWqFFyUADy30c5OxeKBtDgAHcdsViWizmTKdTaX96QicczWXtUpF0f1yVSJbccfznoyWf0iUNRd5xU5JTyZbi4fiBQrhaCxZykgAFLV03UE5GosEmxbm6XY/4PAWEtm242ks0WmFy4EeoYkz5KR0a7Uu+GI5HBIA==

int, uint, nint, and nuint all work for pointer indexing, array creation, and array indexing. uint, nint, and nuint don't work for stackalloc. uint has always been blocked.
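A sketch of the asymmetry described above (the exact diagnostic text may vary by compiler version):

```csharp
using System;

class IndexTypes
{
    static unsafe void Demo(uint n)
    {
        var array = new byte[n];       // uint works for array creation
        byte b = array[n - 1];         // ...and for array indexing

        byte* p = stackalloc byte[16];
        byte c = p[n - 1];             // ...and for pointer indexing

        // But not for the stackalloc size operand itself:
        // Span<byte> s = stackalloc byte[n];   // error: cannot implicitly
        //                                      // convert 'uint' to 'int'
    }
}
```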

localloc in the JIT actually expects a nuint or uint. If you were to create a stack space over 2GB in size and passed in, say, -2147483648, it would actually round to 2147483648 and succeed. Creating a stack space greater than 2GB isn't currently possible in managed code, as new Thread is limited to int.MaxValue, but it is possible in native land, in which case you can use a callback on that thread to execute a managed method which does this 😄

An example program which shows this: https://gist.github.com/tannergooding/13bd296687321390638792ca4099a21f

Starting native thread with 17179869180 bytes stack space
In native run thread, allocating -2147483648 bytes...
Stackalloc address for bytes: 715308661664

ghost commented 4 years ago

stackalloc is a very handy tool to have in scenarios where performance matters, and anyone who has a reason to use it will most likely consider having to cast uint, nint, and nuint to int as nothing more than an annoying inconvenience.

As for variable-length stack allocation, I don't think the .NET Core JIT optimises constant-size stackalloc expressions, so they shouldn't be any faster than variable-size stackalloc. If anything, they could improve performance by reducing the amount of stack memory you're zeroing (unless something like SkipLocalsInitAttribute is used, in which case the memory isn't zeroed).
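As a sketch of the SkipLocalsInitAttribute point above (this assumes .NET 5+ and a project with AllowUnsafeBlocks enabled, which the attribute requires):

```csharp
using System;
using System.Runtime.CompilerServices;

class ZeroingDemo
{
    // With this attribute the compiler omits the ".locals init" flag,
    // so stackalloc'd memory is NOT zeroed before use.
    [SkipLocalsInit]
    static int Sum()
    {
        Span<byte> buffer = stackalloc byte[256]; // contents are undefined here
        buffer.Clear();                           // zero manually only if needed
        return buffer[0];
    }
}
```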

john-h-k commented 4 years ago

A) If you validate your input, then converting to an int should be trivial.

I never claimed doing it was difficult, it is just repetitive, frustrating, and useless.

B) validating your input effectively means limiting the maximum size of your array to a small constant. In which case it's not necessarily any faster to make it variable length, since VLAs have runtime overhead.

But then I must slice my span, which itself has a cost

YairHalberstadt commented 4 years ago

As for variable length stack allocation I don't think that the .NET Core JIT optimises constant size stackalloc expressions so they shouldn't be any faster than variable size stackalloc.

Oh yes it does. Compare asm for:

https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgEYBYAKGIGYACMhgYQYG8aHunHiUGAWQAUASg5ceUgMoAHbADsAPAEsFGAHwNc8hQwC82jNjABrbABsLEMAzUYA2qQCsSALoBuSVO4AzaDAmABYMwrC+DABu2FAMCHZ6Ooqi3j5S8YbkXtRSAL40uUA===

vs

https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABAJgEYBYAKGIGYACMhgYQYG8aHunHiUGAWQAUASwB2GBrlEAvGAEoOXHqoDKAB2ziAPBIwA+aVvEMAvNIzYwAa2wAbexDAN9AbRnyAugG4Vq7gAzaBhrAAsGYVhAhgA3bCgGBFdTXBMFfwDVZItyP2pVAF8aQqA==

YairHalberstadt commented 4 years ago

I never claimed doing it was difficult, it is just repetitive, frustrating, and useless.

Consider:

if (size < 0 || size > 256)
    throw new ArgumentOutOfRangeException(nameof(size));
Span<T> span = stackalloc T[(int)size];

And you think that removing the cast to int is the bit that's annoying enough to make this worth a language change?

john-h-k commented 4 years ago

I do

AssertCanStackalloc<T>(size);
var p = stackalloc T[size];

in my use case

And yes, I find the unnecessary cast annoying

YairHalberstadt commented 4 years ago

Maybe change it to

var p = stackalloc T[AssertCanStackalloc<T>(size)];

?

And it will do the annoying cast for you.
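A sketch of what such a helper might look like (AssertCanStackalloc is the commenters' hypothetical name; the 256-byte limit is illustrative): it validates the unsigned size and performs the int cast in one place, so call sites stay cast-free.

```csharp
using System;

static class StackallocHelpers
{
    const ulong MaxStackallocBytes = 256; // illustrative limit

    // Validates an unsigned element count against a byte budget and
    // returns it as the int that stackalloc currently requires.
    public static unsafe int AssertCanStackalloc<T>(uint count) where T : unmanaged
    {
        if ((ulong)count * (ulong)sizeof(T) > MaxStackallocBytes)
            throw new ArgumentOutOfRangeException(nameof(count));
        return (int)count;
    }
}
```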

john-h-k commented 4 years ago

Personally, I don't like that. I find it unclear and cryptic. I like the

CheckICanDoIt()
DoIt()

pattern, and it appears several others do too. And this pattern can be made cleaner with minimal effort and no breaking changes by allowing uint 😄

YairHalberstadt commented 4 years ago

If this is a bug, this is a bug. However if this is to spec, given that

a) VLAs are risky, a vector for security bugs, and of arguable benefit — clearly of low enough total benefit that they were made optional in the C11 standard and removed from the Linux kernel.
b) There are trivial ways around the issue. You may not like them, but they are trivial.
c) stackalloc usually appears in performance-critical code, and VLAs only appear in performance-critical code. The language team has often indicated that if you go low level you will lose some niceties.
d) Language changes are expensive.

I would suggest the benefits of making this change don't seem compelling.

john-h-k commented 4 years ago

d) language changes are expensive.

In what way is this specific change expensive?

john-h-k commented 4 years ago

if (size < 0 || size > 256)

With uint this check is not needed; you can just write size > 256.

john-h-k commented 4 years ago

I, and others I have talked to, strongly dislike the behaviour of having a signed type that is actually unsigned, and so allocating -1 bytes is actually allocating uint.MaxValue bytes

tannergooding commented 4 years ago

I, and others I have talked to, strongly dislike the behaviour of having a signed type that is actually unsigned, and so allocating -1 bytes is actually allocating uint.MaxValue bytes

Well, realistically speaking even though it is possible (and I provided a working proof of concept: https://github.com/dotnet/roslyn/issues/44535#issuecomment-633262853), it is highly unlikely that anyone will ever actually try to allocate int.MinValue bytes, let alone be in a scenario where it could succeed.

In any case, I think uint, since it has been blocked forever, is likely not a bug. nint/nuint might be worth considering as part of the nint language feature, but that would be up to LDM and the feature author.

There are lots of other nits where the language deals with signed types but the runtime is dealing with unsigned. sizeof is another example, and they all add up and force casts to be introduced in code that is already dangerous or hard to read/understand. I would definitely like to see those improved (possibly via target typing in some cases, preferring int to preserve back-compat wrt overload resolution), but it's also not a blocking scenario and it's something that will never be "perfect".

popcron commented 1 month ago

is this issue still open for real? is it something that is still looked into/thought about? or has it been thrown under the rug?

@tannergooding

it is highly unlikely that anyone will ever actually try to allocate int.MinValue bytes

imo, no disrespect, but the arguments you present aren't very healthy. the tool to enforce this by design exists and is instead excused with a "well, who would want to do it anyways?". it's wild, just require uint. i understand we might really like backwards compatibility here as well, but these shouldn't/wouldn't arguments have their limits... benefiting the short term at the cost of the long term.

i would still love to see uint being at least an option for span indexing (Range too). cheers!

tannergooding commented 1 month ago

It's not just excused with "who would want to do it anyways". It's dismissed for many reasons, including that systems don't provide stack spaces that are multiple gigabytes in size; rather they default to around 2-16 MB. While you can increase this manually when you create a new thread, doing so can also hurt performance and work against hardware-level optimizations that assume a certain amount of stack space, a certain stack depth, and a maximum distance between key stack metadata.

Computers do not have increased RAM so that singular allocations can be bigger. They have increased RAM to improve throughput across many concurrent applications and cores/threads, essentially being used as a 4th layer of cache for the data you most immediately need to work with across the entire system. Fast applications, especially those that scale, work by treating it like a cache, and ensuring that data is streamed in from disk in amounts and at a frequency that can be worked with. They distribute workloads and use many medium sized allocations to improve throughput, and then page out the memory or overwrite it with the next batch of data when they're done.

We will not see uint being used in Range for similar reasons: it would hurt performance, both because of how Range is internally implemented for efficiency and because there are other problematic scenarios that come into consideration. Just because a value only accepts unsigned inputs does not mean using an unsigned type is the correct choice. There is a lot that goes into determining what is correct, and modern languages are more and more choosing signed types because of the benefits they have: reduced chances of bugs, the availability of a clear overflow/underflow sentinel value, correct results when doing basic arithmetic, etc.

There isn't a real long term benefit here, at all. There is a small convenience factor for some cases of interop or more complex/unsafe lowlevel code, and many downsides to real world usage otherwise, including many that might hurt performance.

popcron commented 1 month ago

My reasoning, and that of others, is to use a data type that better matches the constraints. If the argument were about how much larger a stackalloc could be compared to int, I don't think we would be bothering with this; a size of int.MaxValue is just as extreme.

What you describe for Range, however, sounds fair (as it also performs x - y arithmetic internally to fetch lengths).

Without turning this into a ping-pong match, it sounds like this should be closed, though I still disagree. Thank you for the response.

CyrusNajmabadi commented 1 month ago

is this issue still open for real?

Yes.

It's just very low priority as there is an incredibly trivial workaround available that is suitable for all practical purposes.