dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.

https://docs.microsoft.com/dotnet/core/

MIT License

14.9k stars 4.63k forks source link

Add 64 bits support to Array underlying storage #12221

Open GPSnoopy opened 5 years ago

GPSnoopy commented 5 years ago

While System.Array API supports LongLength and operator this[long i], the CLR does not allow arrays to be allocated with more than 2^31-1 elements (int.MaxValue).

This limitation has become a daily annoyance when working with HPC or big data. We frequently hit this limit.

Why this matters

.NET arrays is the de facto storage unit of many APIs & collections.
Currently one has to allocate native memory instead, which does not work with managed code (Span could have helped, but it's been limited to int32 as well).
HPC frameworks expects data to be contiguous in memory (i.e. the data cannot be split into separate arrays). E.g. BLAS libraries.
Requires applications to be coded and designed differently if they handle <2G elements or >2G elements. I.e. It does not scale.
It's an arbitrary limitation with little value to the user and application:
- With a desktop with 64GB, why can one only allocate 3.1% of its memory in one go?
- With a data centre machine with 1,536GB, why can one only allocate 0.1% of its memory in one go?

In C++ this is solved with std::size_t (whose typedef changes depending on the target platform). Ideally, .NET would have taken the same route when designing System.Array. Why they haven't is a mystery, given that AMD64 and .NET Framework appeared around the same time.

Proposal I suggest that when the CLR/JIT runs the .NET application in x64, it allows the array long constructor to allocate more than int.MaxValue items:

Indexing the array with operator this[long i] should work as expected and give access to the entire array.
Indexing the array with operator this[int i] should work as expected but implicitly limit the access to only the first int.MaxValue elements.
The LongLength property should return the total number of elements in the array.
The Length property should return the total number of elements in the array, or throw OverflowException of there are more than int.MaxValue elements (this matches the current behaviour on multi-dimensional arrays with more than int.MaxValue elements).

I naively believe that the above should not break any existing application.

Bonus points for extending 64-bit support to Span and ReadOnlySpan.

huoyaoyuan commented 3 years ago

If Array.Length became nint, most of the code out there would still compile and work out of the box (maybe some warnings though if the loop is using int instead of nint?).

Array.Length is always nint in IL, but C# compiler see it as int and always inserts a cast. This makes detecting int based loop very annoying.
A plain foreach loop is lowered into int based loop. The "problematic" int based loop is unavoidable now.

GPSnoopy commented 3 years ago

Thank you @GrabYourPitchforks for exposing your experience with user feedback from Microsoft point of view.

I'd like to share some of our experience. We have a C# codebase of ~5 millions lines of code. After at least two years of effort, we are close to having everything ported to .NET Core. It definitely was not a painless experience, most of the problems we encountered being with solutions, projects, build dependencies and project dependencies.

Overall I do think it was worth it. In the long run, with .NET 3.1 and 5.0 it feels that all the breaking changes you decided to make are a net positive. You will always have people complaining about the short term effort they have to spend when making breaking changes, so I wanted to highlight that we do appreciate the long term benefits of doing so.

Regarding 64-bit arrays, what would you object to if they were made opt-in like nullables (with an explicit annotation for forcing the compiler to pass the array to a callee that is not 64-bit array-aware - like nullables do)? As I said before, we only care about a few dependencies to be compatible - most of them don't matter.

It feels to me like a nice way to start the transition without disrupting people who don't care while we still have time. It also opens the door to doing the same with strings and various collections.

Ultimately the 32-bit array problem is only going to get worse. While it might only be a few niche scientific and big-data apps today, I feel like they are actually acting as canaries in the mine as to where future applications are heading towards to. When I started using .NET, 2GB was considered to be a obscene about of memory to be used by an application. Not so much now.

Funny thought / analogy: if we were to compare the 32-bit array limitation to the DOS 64K segment limitation, then on this machine (R5950X 128GB) it's equivalent to only being able to allocate 64K segments on a 4MB DOS machine (if using byte[]) and 1MB DOS machine (if using float[]). In hindsight, I think it was good to move to Windows 95 and NT with full 32-bit addressing support. Even though it probably pissed off a lot of customers by breaking their stuff.

verelpode commented 3 years ago

Ultimately the 32-bit array problem is only going to get worse. While it might only be a few niche scientific and big-data apps today, I feel like they are actually acting as canaries in the mine as to where future applications are heading towards to.

Yes, that's how it happened in the past. Originally normal was 16-bit, and 32-bit was "NT" - "New Technology" - "Windows NT", but later this advanced "NT" version of Windows because the normal version of Windows, and the previous "normal" version of Windows was discontinued. Later 64-bit pointers were introduced, again this was initially only for advanced scenarios, but then later, today, 64-bit pointers and the x86-64 processor architecture are the normal version and x86-32 is obsolete and mostly discontinued. Even for small consumer-customers with small desktop computers, 64-bit Windows is installed, and all apps can and should simply run in the x86-64 mode regardless of whether they need to exceed the 2GB/4GB barrier.

Today even the tiny basic Notepad.exe runs as 64-bit:

Even though Notepad.exe doesn't need 64-bit, it runs as 64-bit anyway, and this is good and right. It's the right way to run every process in the entire computer in x86-64 mode (other than a small number of exceptions for backwards-compatibility reasons).

In the same manner as the past events, a ".NET Scientific Edition" with 64-bit array and collection lengths would be the advanced niche version only for a limited number of years. After some years, the "scientific" edition would become the normal edition, and today's "normal" edition would be discontinued. This has been the recurring pattern in the past and I see no reason why this pattern would end now.

Thus a "scientific" edition serves as the gateway to the new normal in the future. It also helps ease the transition without disrupting customers who don't need or want 64-bit array/collection lengths suddenly foisted upon them even when they have no use for it.

It feels to me like a nice way to start the transition without disrupting people who don't care

Yes, in my opinion, the transition can be started and carried out in the same manner as the pointers were transitioned from 32 to 64 bits successfully. The recommended practice with the pointers was to stop doing anything that made them incompatible with 64 bits, such as old C++ code that typecast a pointer to int32 or stuffed the pointer into a 32-bit integer storage location in RAM, etc.

The same principle can be adopted with collection ordinals (indexes, offsets, lengths, counts). Adopt the mindset of always saying/thinking, "This is a collection ordinal; it should work regardless of whether it is 32 or 64 bits; don't do anything that breaks the ordinal if it is 64-bit."

It also opens the door to doing the same with strings and various collections.

Yes, although the BigArray<T> example that I posted is nice to have in terms of something that is immediately usable today, it's still only a workaround; a temporary stop-gap measure in the meantime. Any kind of Big/LargeArray<T> class or struct isn't a proper solution in the long-term. It's not the right design for the future.

It's like you said, this issue isn't only about big arrays. Ultimately the future demands 64-bit ordinals for all types of collections, including IList<T>, List<T>, HashSet<T>, System.String, etc.

For example, today NETFW 5.0 defines:

public interface IReadOnlyCollection<out T> : IEnumerable<T>
{
    int Count { get; }
}

public interface IReadOnlyList<out T> : IReadOnlyCollection<T>
{
    T this[int index] { get; }
}

But that's no good for the future. It's not future-proof. It's not modern anymore. It needs to be changed to:

public interface IReadOnlyCollection<out T> : IEnumerable<T>
{
    OrdinalType Count { get; }
}

public interface IReadOnlyList<out T> : IReadOnlyCollection<T>
{
    T this[OrdinalType index] { get; }
}

Where OrdinalType is either:

A simple using alias like I demonstrated previously, OR
An actual runtime-visible type similar to how System.IntPtr is implemented, OR
A 64-bit-compatible redesign of the controversial (arguably defective) System.Index and System.Range.

In the same manner as how software engineers adopted the practice of making pointers work correctly regardless of whether they were compiled as 32-bit or 64-bit, there should be a practice of writing collection ordinals to work correctly regardless of whether they are compiled as 32-bit or 64-bit, and this starts with writing an OrdinalType instead of int or Int64 everywhere that ordinals are used, including the collection interfaces, not only System.Array.Length.

If OrdinalType is a simple using alias, then it remains 100% compatible with todays "normal" NETFW 5.0, and it breaks nothing, yet it also allows a 64-bit scientific edition to be compiled, thus starting the necessary transition in a non-disruptive, stable, reliable manner. Yes there is a downside of having to recompile dependencies, but there is always going to be a price to pay, and a perfect easy solution doesn't exist. Something has to be done otherwise the issue will never be solved; otherwise the transition will never start.

huoyaoyuan commented 3 years ago

Later 64-bit pointers were introduced, again this was initially only for advanced scenarios, but then later, today, 64-bit pointers and the x86-64 processor architecture are the normal version and x86-32 is obsolete and mostly discontinued. Even for small consumer-customers with small desktop computers, 64-bit Windows is installed, and all apps can and should simply run in the x86-64 mode regardless of whether they need to exceed the 2GB/4GB barrier.

You are touching the compatibility hell. For large systems, ecosystem compatibility is more painful than technical debt. Windows takes huge effort for WIn16OnWin32 and Win32OnWin64 layers.

huoyaoyuan commented 3 years ago

My consideration of migration:

public class List<T>
{
    public int Count { get; }
    public nint Count { get; }
}

The int and nint version exists simultaneously, and the compiler is told to prefer nint version. Old code using the int version will be binary compatible and just work, and throw at runtime of the collection is larger than int range.

The change breaking is still huge. .NET is always keep binary compatibility, not only source compatibility. Already compiled assemblies are a part of the ecosystem. Project for newer framework should be very easy to use binaries compiled for old framework.

tannergooding commented 3 years ago

I think there are other considerations (namely the GC) that aren't being accounted for here as well.

For example, the GC considers 85kb large today. Once you start getting into the realm of megabyte or gigabyte sized allocations, it becomes prohibitively expensive to move them and its likely better to just assume that they can't move at all and to work around them. Likewise, if these large allocations contain reference types, you are introducing many additional references for the GC to track and this does start becoming measurable in simple console apps. There would likely need to be extensive input from Maoni and the GC team before arrays with reference types that large were even considered.

However, if your data only needs to meet the unmanaged constraint (is not a reference and does not contain references), then you are likely better off bypassing the GC altogether and just using native allocations. The major downside is that Span<T> also uses int instead of nint (which was released in C# 9) and so you can't "support" larger native allocations in the same API and we didn't normalize accepting them when moving algorithms to be span based back in .NET Core 2.1.

verelpode commented 3 years ago

@huoyaoyuan

The int and nint version exists simultaneously, and the compiler is told to prefer nint version.

That's a HUGE amount of work. I don't think you'll be able to convince the .NET Team to do such a huge amount of work. You're describing a solution that requires the team to write hundreds or thousands of new methods and properties. In every place where a preexisting method or property uses an ordinal/index/count/length, you're saying that the team should write and maintain 2 versions of the same method or property:

The original method or property with int.
An overload of the same method with nint.

That's a huge number of duplicated methods and properties. For example, for starters, if you just look at System.Array alone, following is a list of all the methods that would be duplicated. That's already a whole bunch, and the list of duplicate methods grows much larger when include all the other classes and interfaces that would also require all of their ordinal-using methods and properties to be duplicated.

Even if the team successfully creates these hundreds or thousands of duplicated methods and properties, other problems still exist, such as @GrabYourPitchforks wrote: "unintended consequences throughout the GC that would hinder its efficiency, even when collecting normal fixed-size objects."

Here are the methods to be duplicated in System.Array alone, excluding List<T>, ICollection<T>, etc etc etc:

public int Length { get; }
public static int BinarySearch<T>(T[] array, int index, int length, T value);
public static int BinarySearch(Array array, int index, int length, object? value, IComparer? comparer);
public static int BinarySearch<T>(T[] array, int index, int length, T value, IComparer<T>? comparer);
public static int BinarySearch(Array array, int index, int length, object? value);
public static void Clear(Array array, int index, int length);
public static void ConstrainedCopy(Array sourceArray, int sourceIndex, Array destinationArray, int destinationIndex, int length);
public static TOutput[] ConvertAll<TInput, TOutput>(TInput[] array, Converter<TInput, TOutput> converter);
public static void Copy(Array sourceArray, int sourceIndex, Array destinationArray, int destinationIndex, int length);
public static void Copy(Array sourceArray, Array destinationArray, int length);
public static Array CreateInstance(Type elementType, int length);
public static Array CreateInstance(Type elementType, int length1, int length2);
public static Array CreateInstance(Type elementType, int length1, int length2, int length3);
public static Array CreateInstance(Type elementType, params int[] lengths);
public static Array CreateInstance(Type elementType, int[] lengths, int[] lowerBounds);
public static void Fill<T>(T[] array, T value, int startIndex, int count);
public static int FindIndex<T>(T[] array, Predicate<T> match);
public static int FindIndex<T>(T[] array, int startIndex, int count, Predicate<T> match);
public static int FindIndex<T>(T[] array, int startIndex, Predicate<T> match);
public static int FindLastIndex<T>(T[] array, int startIndex, int count, Predicate<T> match);
public static int FindLastIndex<T>(T[] array, int startIndex, Predicate<T> match);
public static int FindLastIndex<T>(T[] array, Predicate<T> match);
public static int IndexOf<T>(T[] array, T value, int startIndex, int count);
public static int IndexOf<T>(T[] array, T value, int startIndex);
public static int IndexOf(Array array, object? value);
public static int IndexOf(Array array, object? value, int startIndex, int count);
public static int IndexOf(Array array, object? value, int startIndex);
public static int IndexOf<T>(T[] array, T value);
public static int LastIndexOf(Array array, object? value, int startIndex);
public static int LastIndexOf<T>(T[] array, T value, int startIndex, int count);
public static int LastIndexOf<T>(T[] array, T value, int startIndex);
public static int LastIndexOf<T>(T[] array, T value);
public static int LastIndexOf(Array array, object? value, int startIndex, int count);
public static int LastIndexOf(Array array, object? value);
public static void Resize<T>([NotNull] ref T[]? array, int newSize);
public static void Reverse<T>(T[] array, int index, int length);
public static void Reverse(Array array, int index, int length);
public static void Sort<T>(T[] array, int index, int length, IComparer<T>? comparer);
public static void Sort(Array keys, Array? items, int index, int length);
public static void Sort(Array keys, Array? items, int index, int length, IComparer? comparer);
public static void Sort<T>(T[] array, int index, int length);
public static void Sort(Array array, int index, int length);
public static void Sort<TKey, TValue>(TKey[] keys, TValue[]? items, int index, int length);
public static void Sort<TKey, TValue>(TKey[] keys, TValue[]? items, int index, int length, IComparer<TKey>? comparer);
public static void Sort(Array array, int index, int length, IComparer? comparer);
public void CopyTo(Array array, int index);
public IEnumerator GetEnumerator();
public int GetLength(int dimension);
public long GetLongLength(int dimension);
public int GetLowerBound(int dimension);
public int GetUpperBound(int dimension);
public object? GetValue(int index);
public object? GetValue(int index1, int index2);
public object? GetValue(int index1, int index2, int index3);
public object? GetValue(params int[] indices);
public void SetValue(object? value, params int[] indices);
public void SetValue(object? value, int index1, int index2, int index3);
public void SetValue(object? value, int index1, int index2);
public void SetValue(object? value, int index);

You are touching the compatibility hell. For large systems, ecosystem compatibility is more painful than technical debt. Windows takes huge effort for WIn16OnWin32 and Win32OnWin64 layers.

Fortunately there's no need to create a "WOW" equivalent in the case of 64-bit ordinals. Although the WOW was created for the 32-to-64 pointer transition, I don't propose creating any WOW equivalent for a 32-to-64 ordinal/index transition.

tannergooding commented 3 years ago

That's a HUGE amount of work. I don't think you'll be able to convince the .NET Team to do such a huge amount of work.

Adding arrays with native sized length is a huge amount of work no matter how you look at it.

If you update the existing array type to be nint every algorithm that uses an array or 32-bit indexing now also needs to be audited. Every for (int i = 0; i < a.Length; i++) loop needs to be fixed up and every API that uses things like Span<T> or Unsafe.AddRef, etc needs additional scrutiny.

verelpode commented 3 years ago

@tannergooding

I think there are other considerations (namely the GC) that aren't being accounted for here as well. [...] There would likely need to be extensive input from Maoni and the GC team

Yes, I believe a ".NET Scientific Edition" would need to use a variation of the GC that has:

Different hard configuration constants that are set at compile-time.
A few internal methods of the GC that are implemented differently depending on whether they're being compiled for the normal edition or sci edition.

#if SCIENTIFIC_EDITION
private const int LargeObjectHeapThreshhold = 85000 * 10;
#else
private const int LargeObjectHeapThreshhold = 85000;
#endif

Thus a "scientific edition" cannot be only different DLL's/framework; it would also require a different variation of the CLR.

Every for (int i = 0; i < a.Length; i++) loop needs to be fixed up

Roslyn has many helpful analyzers that produce warnings and automated changes. I suggest that one more of these be added to Roslyn. The new warning could tell us that we should change this...

for (int i = 0; i < a.Length; i++)

...to this:

for (var i = 0; i < a.Length; i++)

and then VS would prompt us to let VS change it from int to ~~var~~ OrdinalType for us, in the same manner as how the other Roslyn analyzers work today. [This was wrong, it shouldn've been OrdinalType i not var i]

tannergooding commented 3 years ago

This code does not do what you think it does: for (var i = 0; i < a.Length; i++)

This is still int i = 0; i < a.Length; i++ and preserves the "bug" that the index is int while the length could be nint. You have to explicitly change it to be nint i = 0; i < a.Length; i++

huoyaoyuan commented 3 years ago

I have to point out that currently foreach (var x in a) is translated into for (int i = 0; i < a.Length; i++) for arrays.

tannergooding commented 3 years ago

foreach does get lowered to for for cases like array and assuming that array started having native length, I'm sure the compiler could likewise be tweaked and account for that. However, it is not suitable for all scenarios, particularly where an index is needed.

verelpode commented 3 years ago

I failed to explain that I was referring to the part where @GPSnoopy wrote "Remove LongLength properties and old indexing operators." Thus the var i would produce Int64 i in the "sci" edition whereas Int32 i in the normal edition, because in the sci edition, System.Array.Length would return Int64, whereas in the normal version, it would be unchanged and continue to return Int32 as-is.

#define SCIENTIFIC_EDITION

double[] a = ...;
for (var i = 0; i < a.Length; i++)
{
    // i is Int64 here, not Int32.
}

// In the "Array.cs" file:
#if SCIENTIFIC_EDITION
using OrdinalType = System.Int64;
#else
using OrdinalType = System.Int32;
#endif

namespace System
{
    class Array
    {
        public OrdinalType Length { get; }
    }
}

verelpode commented 3 years ago

OH, oops, you're right, the C# compiler doesn't perform the var inference there in the manner that I thought it did. It's not as clever as I thought. Therefore I was mistaken. The Roslyn analyzer would have to prompt us to change it, not to var i, but rather to:

for (OrdinalType i = 0; i < a.Length; i++)

kasthack commented 3 years ago

There's another approach.

Mark .Length and .Count as obsolete.
Use the existing .LongLength / .LongCount properties everywhere(probably a compiler flag as this could affect performance?).
Throw on .Length / .Count access if the they exceed Int32.MaxValue.
Extend existing interfaces with 64-bit properties and make the default implementation to fall back to returning .Length / .Count.

Pro:

This allows for gradual migration to 64-bit collections without breaking changes.

Con:

API surface pollution.

tannergooding commented 3 years ago

LongLength and LongCount are not a good solution either. They are "too big" on 32-bit platforms and introduce a perf hit. Likewise, they are not compatible if (for whatever reason) 128-bit lengths exist in the future.

The ask isn't strictly "I want 64-bit arrays", I see it more as "I want to be able to use my memory to the fullest" in which case you want nint as it allows you to fully address memory and access it efficiently on any platform.

verelpode commented 3 years ago

Are people definitely proposing that Array.Length return nint in all cases regardless of whether the normal or big GC is being used?

For example, tablet apps (or many ASP.NET web-apps etc) should continue to run as x86-64 / AMD64 as they do now, but they shouldn't use the big GC with big arrays. Therefore if Array.Length returns nint, then tablet apps etc would see Array.Length returning a 64-bit integer regardless of the fact that they're running with a GC and CLR that doesn't support allocating any arrays exceeding int.MaxValue. Is that OK?

If the field internal int _size; inside List<T> is changed to nint along with the property List<T>.Count, then the field is 64-bit even when the runtime and GC don't support it. Is that OK?

It might or might not be better to use an OrdinalType instead of nint, in order to support this configuration for tablet apps, normal apps, etc:

Architecture = x86-64 / AMD64
Pointer size = 64-bit
nint = 64-bit
Ordinal size = 32-bit

verelpode commented 3 years ago

Would it make the issue easier or harder if an additional "architecture" would be added? For example, look at:

The enum System.Runtime.InteropServices.Architecture returned by System.Runtime.InteropServices.RuntimeInformation.ProcessArchitecture.
The enum System.Reflection.ProcessorArchitecture returned by System.Reflection.AssemblyName.ProcessorArchitecture.

Does the work become more or less if one more architecture would be added to those enums etc? I mean not only these enums; I mean also creating everything else that constitutes an additional architecture. This is no small task.

Currently the enum has a member Architecture.X64. What if an architecture named Architecture.X64Big is created?

Architecture	size of `nint` or an "OrdinalType"	ptr size
`Architecture.X86`	32	32
`Architecture.X64` or an "X64Small"	32	64
`Architecture.X64Big`	64	64

And then Array.Length and ICollection<T>.Count etc would return nint or an "OrdinalType".

This would mean no change to the current .NET, meaning when using the current Architecture.X64, the nint or "OrdinalType" returned by ICollection<T>.Count would be Int32 same as it is now. However if an app opts-in to the new architecture Architecture.X64Big, then the nint/ordinal returned by ICollection<T>.Count etc would be Int64. Overall in total on average, would this cause the total amount of work to be more or less than the alternative ideas?

huoyaoyuan commented 3 years ago

Would it make the issue easier or harder if an additional "architecture" would be added? For example, look at:

This will mak things worse. Again, the issue is that most lengths are encoded as Int32 in metadata, which is conventionally invariant size. nint is always the suitable size, it is just not used.

verelpode commented 3 years ago

@huoyaoyuan

nint is always the suitable size, it is just not used.

Always? That's controversial. For example, if 64-bit nint is used inside a Dictionary<int,int> for ordinals/indexes (hash buckets), then its RAM consumption would increase to 160% of its current consumption, also leading to increased runtime duration and processor L1/L2 cache line misses. Likewise for HashSet<T> (150-160%).

Combined with apps needing multiple instances of these collection objects, this is a significant performance degradation in exchange for zero benefit for most programs (normal programs that don't process big-data).

For example, currently the Visual Studio team is trying to achieve a new feature that is similar to "XAML Hot Reload" but for everything non-XAML. Hot reload of anything/everything in the program, not only XAML. Repeatedly change pieces of your C# source code, and then immediately, instantly, see the results of your changes in your app while your app is still running! This feature requires instant processing of a large quantity of objects/data. This kind of responsiveness becomes more difficult to achieve when, for example, every instance of Dictionary and HashSet consumes 160% without any benefit for any programs other than big-data.

Science and big-data are very important, but it's hard to justify a plan that causes unnecessary degradations/regressions for nearly everyone for nothing in return. I agree with what @GSPP wrote:

"This must be opt-in."

For example, 1162687 items in the Dictionary, TKey=Int32, TValue=Int32 causes:

Currently (32-bit ordinals): 23253740 bytes consumed.
With 64-bit nint: 37205984 bytes consumed (including required alignment/padding bytes).

verelpode commented 3 years ago

@tannergooding

However, if your data only needs to meet the unmanaged constraint (is not a reference and does not contain references), then you are likely better off bypassing the GC altogether and just using native allocations. The major downside is that Span also uses int instead of nint (which was released in C# 9) and so you can't "support" larger native allocations in the same API

That's really interesting because, you know what that means? That means we've all been talking about big Array when actually big Span<T> is more important in terms of delivering more bang for your buck, with less work and faster delivery.

This indicates a partial solution and a step forward that would substantially ease the pain of sci/big-data, yet requires far less work than the eventual future ultimate/full solution: For now for "phase 1", until more can be done in future, for now don't implement big arrays. Only change Span<T>.Length to nint, and leave the other things (Array, List<T> etc) unchanged.

Then sci/big-data developers would be able to choose to replace most of their usages of arrays with Span<T>:

Wherever a method in the app takes an array parameter, change it to Span.
Wherever a field stores an array reference in the field, change it to System.Memory<T> (assuming Memory<T>, Span<T>, and ReadOnlySpan<T> are all upgraded to nint).
Whenever new double[x] is invoked, replace this with a method that allocates memory and returns a Span<T> instead of T[]. This method would allocate either a normal array or unmanaged memory, depending on whether the max array length is exceeded. A finalizer would release the unmanaged memory automatically (also requires a small MemoryManager<T> subclass for use with Memory<T> but seems to be no problem).

That would go a long way towards easing the pain. Clearly there are also downsides here, clearly the discussed ultimate/full solutions are better but those could take years to deliver, whereas focusing on just the Span<T>.Length could deliver a big help much sooner.

verelpode commented 3 years ago

Here are the results of my current "semi-big-data" programming task:

The good news is that NETFW 5.0 allowed me to do what I needed to do, and better than NETFW 4.8.
The bad news is that Span<T> failed me and I couldn't use it to complete my task. Ideally I wanted to use Span<T> to write the entire class using verifiable safe code, but unfortunately Span<T> didn't permit this.

I'm describing my case as "semi-big-data" because my Array.Length doesn't exceed Int32.MaxValue - 1024. Nevertheless Span<T> threw:

System.OverflowException: The System.Span.Length property of the new System.Span would exceed System.Int32.MaxValue

Because of that OverflowException, I had to write it all using System.Runtime.CompilerServices.Unsafe instead of Span<T>. This is unfortunate but it works well anyway, and it's fast, and having 100% verifiable safe code was only an ideal wish not a mandatory requirement. So overall I'm happy and appreciative of the many improvements in NETFW 5.0. It's just a bit of a sore point that Span<T> should have been usable in this "semi-big-data" situation, but wasn't.

The reason why Span<T> throws that exception is:

My Array.Length is considerably less than Int32.MaxValue, but
the total size in bytes of the array can exceed 2 gigabytes, and
I needed to use System.Runtime.InteropServices.MemoryMarshal.AsBytes<T>(Span<T>) and the size in bytes doesn't fit in Span<T>.Length, hence the OverflowException.

So unfortunately I couldn't use Span<T> for this task. The risky-but-helpful System.Runtime.CompilerServices.Unsafe.AddByteOffset<T> came to rescue because, unlike Span, its parameter is System.IntPtr/nint.

An additional sore point was that MemoryMarshal.Read<T>(ReadOnlySpan<byte>) and MemoryMarshal.Write<T> don't have overloads with an offset parameter, thus they would've required me to constantly make new Spans by invoking Span<T>.Slice every time I want to invoke MemoryMarshal.Read/Write. Arguably this critique might be refutable by mentioning runtime optimizations that might effectively eliminate the Span<T>.Slice, but I still think it would be helpful to add these method overloads to MemoryMarshal:

public static T Read<T>(System.ReadOnlySpan<byte> source, nint offset) where T : unmanaged;
public static void Write<T>(System.Span<byte> destination, nint offset, T value) where T : unmanaged;

Similarly it would be convenient to add the equivalent to System.Runtime.CompilerServices.Unsafe, as a convenience not as a performance concern:

public static T Read<T>(ref byte source, nint offset)
{
    return Unsafe.As<byte, T>(ref Unsafe.AddByteOffset(ref source, offset));
}

public static void Write<T>(ref byte destination, nint offset, T value)
{
    Unsafe.As<byte, T>(ref Unsafe.AddByteOffset(ref destination, offset)) = value;
}

So even though I'm working with less data than real big-data people, even though I'm not exceeding the array limits, I still hit the limit of Span<T> anyway and wished that Span<T>.Length was nint instead of int.

Another case where Span<T> failed me is the inability to make a ReadOnlySpan<byte> that spans the entire contents of a data file that has been read fully into RAM, when the file exceeds 2GB. The reason for holding the entire file in RAM is the dramatic speed difference of all-in-RAM versus repeatedly issuing random-access reads of blocks from the disk.

huoyaoyuan commented 3 years ago

nint is always the suitable size, it is just not used.

@verelpode I should clarify that it's suitable for sizes, like size_t.

This would mean no change to the current .NET, meaning when using the current Architecture.X64, the nint or "OrdinalType" returned by ICollection.Count would be Int32 same as it is now. However if an app opts-in to the new architecture Architecture.X64Big, then the nint/ordinal returned by ICollection.Count etc would be Int64.

Then it won't get any of the ecosystem. There are code assuming lengths = int32 almost everywhere. Even if we can recompile BCL (will need a lot of work) for it, any third party libraries will be incompatible.

huoyaoyuan commented 3 years ago

To be clear, there isn't any technical difficulty to get a solution if we can recompile everything. The hardest thing is to bring the already compiled third party assemblies - the ecosystem - to be compatible.

jkotas commented 3 years ago

It is definitely helpful to brainstorm about how we would design the arrays or Span<T> if we had a time machine and got a chance to redesign them from scratch again.

As @huoyaoyuan and others pointed out, solutions that reset the ecosystem by changing signatures of the existing APIs are not going to fly well.

It may be acceptable to have a opt-in runtime mode that enables the large array or span support. One set of managed binaries would have to be able to support both the classic mode and the opt-in large array/span mode to avoid forking the ecosystem. The libraries would still require updates and recompilation to be compatible and that may be ok as long as forking of the ecosystem is avoided.

So it means that the realistic solutions in this space should be about adding new types or methods, not about changing signatures of the existing methods.

LargeArray<T> that is independent on existing arrays is an example of such pure additive option. What are the other interesting options that maintain this invariant (no existing APIs getting removed or changing signatures)?

Frassle commented 3 years ago

It's very sad that span wasn't made to use nint, or long when it was added given how recent it is. It's hard to imagine anything that would be useful to add without just being Span2 and trying to move the ecosystem to it.

LargeArray (either by making T unmanaged and using malloc'd memory, or by the deque trick to keep all the T[] below size limits) isn't actually that useful because there's no ecosystem around that, everything takes either T[] or Span.

A lot of the IO classes are already 64bit aware (Stream uses long for length and position, UnmanagedMemoryAccessor uses longs everywhere). But people don't write libraries around using IO classes as arrays, I think LargeArray would suffer the same way, people aren't going to be duplicating all their methods to take Span or LargeArray and so even if it was in the BCL you won't really get an ecosystem around it.

I still feel that just allowing arrays to be larger and having Length throw is the best stepping stone to accomplish this. There's clearly GC questions to be asked around such large arrays, but currently the .net ecosystem just isn't 64bit collection aware and we clearly can't just fix that in one step, its going to take years of bugs to get there (see C++ and the "stop using int, use size_t" pain they had). Adding this can't break anyone's current applications because they already having to allocate arrays smaller than Int32.MaxValue, and while libraries will likely break if passed large arrays that's only going to be from new uses that have allocated large arrays and we're just going to have to do the slow grind to move the ecosystem to nint.

As for the GC questions I don't think they're that bad. For example some of the points raised by @tannergooding in https://github.com/dotnet/runtime/issues/12221#issuecomment-802937379 are true, it would be prohibitively expensive to move large arrays, and if they have reference types it is much more for the GC to scan but a lot of that is on application developers. If you want to allocate billions of objects you check the performance of that and decide if its worth it, if you want to allocate billions of floats (much more likely in scientific computing) well the GC just needs some smarts for very large unmanaged object heap which results in something similar to malloc/free behavior. Anyone who expects the GC to be magical with GB or TB sized data sets needs there expectations tuning.

As a first step we could even side step the GC issues by just working on things like Span and Memory to start. Moving them to be nint based starts allowing the ecosystem to move towards native size collections (as more and more code is using Span instead of T[] anyway) and those who need very large unmanaged allocations and just use malloc and wrap it in Span. It doesn't solve this whole issue but it would be a very good start.

TLDR: 1) Add nint NLength (name very up for suggestions) to Span, ReadOnlySpan, Memory, and ReadOnlyMemory. Probably mark Length as Obsolete to nudge people to the new world. 2) Update compilers to use nint and NLength when they emit foreach over those types 3) Ecosystem can start moving to use nint with these types 4) Add nint Nlength to T[], List etc (this could be done in step 1, and just not return sizes above int32 to start) 5) Update the GC to understand very large arrays (Maybe the LOH is enough for this and app developers just need to be aware of the costs) 6) Allow allocation of nint sized arrays.

huoyaoyuan commented 3 years ago

Add nint Nlength to T[],

Today, the Length property of array has already been nint, but C# compiler will always insert an uncheck cast when accessing it. See sharplab

Notice the conv.i4 IL for even trivial access of array.

jkotas commented 3 years ago

Today, the Length property of array has already been nint

Nit: Array Length property is int. You are talking about ldlen IL instructions that is not the same thing. ldlen is defined as returning nuint in ECMA 335.

C# compiler will always insert an uncheck cast when accessing it.

So that means it is not very straightforward to change int Length property to throwing in the big-array mode.

Add nint NLength (name very up for suggestions)

I think NativeLength would be more in line with the established naming conventions.

Update compilers to use nint and NLength when they emit foreach over those types

It would actually result into a nice micro-optimization as side-effect. Consider:

static int Sum(int[] a)
{
    int s = 0;
    foreach (int e in a)
        s += e;
    return s;
}

It generates code like this today:


xor     eax,eax
xor     edx,edx
mov     r8d,dword ptr [rcx+8]
test    r8d,r8d
jle     repro!repro.Program.Sum(Int32[])+0x1c (00007fff`21565c5c)

movsxd  r9,edx <- This is unnecessary overhead
add     eax,dword ptr [rcx+r9*4+10h]
inc     edx
cmp     r8d,edx
jg      repro!repro.Program.Sum(Int32[])+0xd (00007fff`21565c4d)

ret

movsxd r9,edx is sign-extension of the index from int32 to native int. This sign extension would not be required if the index was native int in the first place. It would be nice for JIT to be able to perform this optimization for you, but the JIT is not capable of doing it today.

NativeLength property would aid with manual implementation of this optimization for now. And even in future, for the more complex cases that the JIT is unlikely to ever pick up. I think somebody talked to me about it or we had issue on this somewhere, but I am not able to find it.

Frassle commented 3 years ago

Today, the Length property of array has already been nint

Nit: Array Length property is int. You are talking about ldlen IL instructions that is not the same thing. ldlen is defined as returning nuint in ECMA 335.

C# compiler will always insert an uncheck cast when accessing it.

So that means it is not very straightforward to change int Length property to throwing in the big-array mode.

Ah yes I had forgot that detail. I think the first part of my idea is still sound, that is moving Span and friends to be nint length. Arrays themselves were always going to be the trickiest part of this.

We can't change how old code is compiled to IL but we can change the semantics of how that IL is executed. So maybe the JIT would just have to change ldlen to have an overflow check for int32, and add a new ldlen2 instruction for when Array.NativeLength is accessed which doesn't have the overflow check.

A crazier idea (but maybe also helps with solving the array variance) is biting the bullet and just adding a new Array2 type to the runtime. Same as Array but with native int lengths and no variance support. Might need some GC magic (I'm not totally sure with how the GC tracks them) but it might be possible to have implicit casts from Array to Array2 without having to copy all the memory, and explicit casts from Array2 to Array. Slowly the ecosystem moves to Array2, but the support for two way non-copy casts allows most code to deal with that migration.

GrabYourPitchforks commented 3 years ago

A crazier idea (but maybe also helps with solving the array variance) is biting the bullet and just adding a new Array2 type to the runtime.

There has been discussion on that in this thread. (You may have to expand the responses that GitHub collapsed mid-thread.) Generally speaking, I think it would be viable to pull off something like this in the runtime. Applications which deal with large data sets could use the native array / native span types. We could also enlighten enough of the SDK's build-in libraries to understand the new NativeArray or NativeSpan types, which means we could seed the ecosystem as part of their introduction.

It's not trivial to pull this off (there's quite a bit of GC goop we'd have to figure out), but assuming we could pull it off, it should end up having no ill effect on applications which don't know about these new types.

Frassle commented 3 years ago

There has been discussion on that in this thread.

Yes I saw some of it but ideas of a LargeArray type that was segmented, or used native memory seemed like no-gos to me because you can't interop them with existing Array based code. https://github.com/dotnet/runtime/issues/12221#issuecomment-537533235 is a good description of what could work for array (+ lets fix variance while we're at it, we can always add ReadOnlyArray for safe variance like what's done in ImmutableArray)

Applications which deal with large data sets could use the native array / native span types.

This point I'm not so sure about. Its looking like the Array type might have to be duplicated but I'm not sure that means we should duplicate all the other types. Consider that we probably want to apply this to the interfaces that arrays inherit from like ICollection and IList, and once you have large sized array you could have large sized List and Stack. For these types where we can change the Length/Count property I think it would be better to do that, than to duplicate the types.

Take an example of an API that currently looks like void DoIt(Span<T> a, Span<T> b). If you want to use that API with a large span allocation you either have the choice of seeing a runtime error when calling it, raising a bug and getting the implementation fixed to access NativeLength instead of Length. Or you get a compiler error, notice that a cast from LargeSpan to Span would throw (because you know you large spans are actually large), raise a bug and get an overloaded added void DoIt(LargeSpan<T> a, LargeSpan<T> b). For external libraries that's probably not to bad, they can generally be a bit more aggressive trimming old functions and could delete the Span overload after a few releases, but applying this to the BCL leads to a lot of overloads hanging around.

Tradeoffs to think about. I think I'd prefer the duplication off methods and properties that are int based when they need to be nint based, rather than duplicating every type that has one of those methods or properties and then duplicating every method or property that uses the original type.

GrabYourPitchforks commented 3 years ago

TBH I think many of these concerns about "but now you have to enlighten everything about LargeArray / LargeSpan!" are overblown.

This is not much different from the situation a few years ago where all we had were array-based overloads of various APIs, and then we went and added Span-based overloads as priority and time permitted. The world didn't end, the ecosystem didn't become bifurcated, and applications moved on to the new APIs only if they derived significant value from doing so. We didn't bother updating many of the old collection types or interfaces because it was a paradigm shift.

There's no reason we couldn't follow the same playbook here. Not every single aspect of the runtime and library ecosystem needs to move forward simultaneously. It's possible to deliver a minimum viable product including an exchange type and basic (common) API support, then continue to improve the ecosystem over future releases as priority dictates and time permits.

Joe4evr commented 3 years ago

Take an example of an API that currently looks like void DoIt(Span<T> a, Span<T> b). If you want to use that API with a large span allocation you either have the choice of seeing a runtime error when calling it, raising a bug and getting the implementation fixed to access NativeLength instead of Length.

I imagine LargeSpan<T> would have some mitigation API on it like Span<T> GetSubSpan(nint start, int length) (note length here is specifically not nint) precisely for consuming APIs that wouldn't (yet) know about the new stuff.

Frassle commented 3 years ago

So would one of the following proposals possibly get traction?

A) Add NativeLength

Background and Motivation

See https://github.com/dotnet/runtime/issues/12221 for background discussion on this API.

Proposed API

 namespace System
 {
    public readonly ref struct Span<T>
    {
-       private readonly int _length;
+       private readonly nint _length;

+       public Span(void* pointer, nint length);
+       public nint NativeLength { get; }
+       public ref T this[nint index] {get; }
+       public Span<T> Slice(nint start, nint length);

        public int Length
        {
            [NonVersionable]
-           get => _length;
+           get {
+               if (_length > Int32.MaxValue) throw new InvalidOperationException();
+               return (int)_length;
+           }
        }
    }
 }

Similar API changes would be made to ReadOnlySpan, Memory, and ReadOnlyMemory.

Usage Examples

This could be used with native memory allocations to get .net style views on large data sets. Once libraries are updated to use NativeLength rather than Length this would allow large data sets to make use of the normal .net ecosystem.

Risks

This incurs a bounds check on every access to Span, which as a high performance type is a unwanted negative cost. Its not clear from the type system if apis are safe to call with Spans that are longer than Int32.MaxValue, thus risking unclear runtime exceptions.

B) Add NativeSpan

Background and Motivation

See https://github.com/dotnet/runtime/issues/12221 for background discussion on this API.

Proposed API

namespace System
{
    public readonly ref struct NativeSpan<T>
    {
        public NativeSpan(void* pointer, nint length);
        public ref T this[nint index] { get; }
        public nint Length { get; }
        public NativeSpan<T> Slice(nint start);
        public NativeSpan<T> Slice(nint start, nint length);

        public static implicit operator NativeSpan<T>(Span<T> span);
        public static explicit operator Span<T>(NativeSpan<T> span);

        // Plus other methods that are present on Span<T>
    }
}

Similar class would be made for NativeReadOnlySpan, NativeMemory, and NativeReadOnlyMemory.

Usage Examples

This could be used with native memory allocations to get .net style views on large data sets. Once libraries are updated to use NativeSpan rather than Span this would allow large data sets to make use of the normal .net ecosystem.

Risks

This will cause noticeable api surface duplication as methods are slowly moved to use NativeSpan instead of Span. The BCL will have to maintain both overloads for a significant time. This is even more noticeable for methods/properties that currently return Span (for example MemoryExtensions.AsSpan() will also need MemoryExtensions.AsNativeSpan()).

Both

Either way thinking about this did trigger me to think if nint is actually the right type to use here? C++/rust would use an unsigned native int here (size_t in C++, usize in rust). Historically the BCL would not use unsigned types on such a major API surface because it's not CLS compliant, is that still a going concern? If we do stick with nint it does mean that on 32bit systems a Span would still be limited to 2GB despite most OS being able to allocate a larger buffer than that in modern 32bit system. Also there's always some groups looking into C# for kernel work and a Span for the entire address space could be nice.

GrabYourPitchforks commented 3 years ago

@Frassle Thanks for posting that. I think NativeSpan is more likely to get traction than Span.NativeLength, especially since as you called out Span is supposed to be a low-level, high-performance API. It wouldn't be acceptable for the Length property accessor to incur a check on every call. It also introduces confusion for consumers: if I'm calling an API that takes Span<T>, how am I supposed to know whether the API has been long-span enlightened? Presumably the docs would mention this, but do I now need to read the docs for every single API I call? That puts significant burden on callers.

For NativeSpan, we can improve the ecosystem piecemeal. I honestly think the set of NativeSpan overloads we add within the runtime and libraries would be fairly small and that it won't lead to overload explosion like some on this thread fear. We're talking overloading really low-level APIs like MemoryExtensions.IndexOf(this ReadOnlyNativeSpan<T>, ...), plus maybe some things on MemoryMarshal. You're not going to see APIs like Stream.Write(RONS<byte>), MemoryExtensions.ToUpperInvariant(RONS<char>), NativeSpan<T>.ToArray(), etc.

To all: If we had a NativeSpan<T> or a ReadOnlyNativeSpan<T>, what overloads would you want to see?

(Also, I'd prefer NativeSpan<T>.Length to be typed as nuint instead of nint, but that's just me. It has implications for the signature of MemoryExtensions.IndexOf and friends, but we can cross that bridge when we come to it.)

Edit: I'm not sold on NativeMemory or ReadOnlyNativeMemory yet. Those both have significant implications for the runtime, GC, and friends. They also have implications for the normal Memory<T> and ReadOnlyMemory<T> types. NativeSpan and ReadOnlyNativeSpan have much reduced impact, since their changes are limited to the exchange types themselves, whatever overloads we add, and some JIT optimizations.

jkotas commented 3 years ago

if I'm calling an API that takes Span, how am I supposed to know whether the API has been long-span enlightened?

This can be solvable using analyzers, e.g. add [NativeSpanEnligtened] attribute and analyzers that disallow calling Length in any [NativeSpanEnligtened] context.

GrabYourPitchforks commented 3 years ago

@jkotas That runs into the same problems as nullability. It introduces a marker attribute which serves as an API contract. Anything that performs code generation or dynamic dispatch would need to be updated to account for it, just as if they were running a light form of the Roslyn analyzer at runtime. Though to be honest I don't think we have a significant number of components which perform codegen over span-consuming methods.

It also doesn't solve the problem of somebody passing you (perhaps as a return value) a span whose length exceeds Int32.MaxValue, which means that your own code could encounter an exception in a place where you're not expecting such exceptions to occur. Could solve it by making this attribute viral and apply both to input and output / return parameters, just like nullability. And by crafting special compiler support for it. :)

jkotas commented 3 years ago

It also doesn't solve the problem of somebody passing you (perhaps as a return value) a span whose length exceeds

This part can be solved by having a long Span support as optional runtime mode. Span.Length either throws unconditionally or be slower and throw on overflow in this mode (ie there can be two variants of this mode).

It is similar to how we are dealing with linkability. We are marking APIs and patterns that are linker friendly. If your app uses a linker unfriendly APIs or patterns, it has unspecified behavior. You have to fix the code if you want to be sure that it works well.

I understand that the analyzer approach is not without issues, but I think it is worth having it on the table. If we went with the NativeSpan approach, we would have to introduce NativeSpan overloads for all APIs that take Span in the limit. It feels like a too high price to pay.

GrabYourPitchforks commented 3 years ago

If we went with the NativeSpan approach, we would have to introduce NativeSpan overloads for all APIs that take Span in the limit. It feels like a too high price to pay.

That concern is understandable, but I don't think it's likely to come to fruition in practice. I spent a few minutes looking through System.IO, System.Runtime, and other ref assemblies for public APIs which take [ReadOnly]Span<T>, and I just don't see the value in adding NativeSpan overloads of most of them. I don't see us adding NativeSpan overloads for any of our TryParse or TryFormat methods, for instance, and those alone seemingly account for over half of all spanified APIs. To be honest, I'd be surprised if we end up enlightening more than ~5% of all span APIs. To me, that seems like an acceptable price to pay to avoid complicating normal Span<T>.

All of that said, I think network-style scenarios are going to be more common than big data scenarios. And network-style scenarios may involve lots of data being sharded across multiple buffers via ReadOnlySequence<T>. We still don't enjoy widespread ecosystem support for that type across our spanified API surface. And the fact that we haven't created many overloads for this type tells me that we in practice aren't going to create many overloads for native span.

If we were to identify which APIs we wanted to create ReadOnlySequence<T> overloads for, that would allow us to focus on both the network and the big data problem, as you can always wrap a ReadOnlySequence<T> around an arbitrarily large native memory buffer.

Joe4evr commented 3 years ago

If we went with the NativeSpan approach, we would have to introduce NativeSpan overloads for all APIs that take Span in the limit.

As I mentioned previously, that could be covered by giving NativeSpan a mitigation API that creates (up-to)-Int32.MaxValue-sized slices. I believe that can keep most existing APIs none-the-wiser and only a scant few may need new overloads because they can be slightly more optimized.

GPSnoopy commented 3 years ago

A few colleagues and I had some thoughts about how we could come up with a reasonable proposal along the lines of nullables (which seemed to be the favoured approach in our small group) and that could still be implemented as a PR by one or two developers. Since there have been many different ideas flying around in the comments, I've put this into a separate draft document.

Latest version of the proposal can be found at https://github.com/GPSnoopy/csharplang/blob/64bit-array-proposal/proposals/csharp-10.0/64bit-array-and-span.md

Expand arrays, strings and spans to native-int lengths

Summary

Expand arrays (e.g. float[]), string, Span<T> and ReadOnlySpan<T> to have 64-bit lengths on 64-bit platforms (also known as native lengths).

Motivation

Currently in C#, arrays (and the other aforementioned types) are limited to storing and addressing up to 2^31 (roughly 2 billion) elements. This limits the amount of data that can be contiguously stored in memory and forces the user to spend time and effort in implementing custom non-standard alternatives like native memory allocation or jagged arrays. Other languages such as C++ or Python do not suffer from this limitation.

With every passing years, due to the ever-continuing increases in RAM capacity, this limitation is becoming evident to an ever-increasing number of users. For example, machine learning frameworks often deal with large amount of data and dispatch their implementation to native libraries that expect a contiguous memory layout.

At the type of writing this (2021-03-23), the latest high-end smartphones have 8GB-12GB of RAM while the latest desktop CPUs can have up to 128GB of RAM. Yet when using float[], C# can only allocate up to 8GB of contiguous memory.

Proposal

The proposed solution is to change the signature of arrays, string, Span<T> and ReadOnlySpan<T> to have an nint Length property and an nint indexing operator (similar to what C++ does with ssize_t), respectively superseding and replacing the int Length and int indexing operator.

As this would break existing C# application compilation, an opt-in mechanism similar to what is done with C# 8.0 nullables is proposed. By default, assemblies are compiled with legacy lengths but new assemblies (or projects that have been ported the new native lengths) can opt-in to be compiled with native lengths support.

Future Improvements

This proposal limits itself to arrays, string, Span<T> and ReadOnlySpan<T>. The next logical step is to tackle all the standard containers such as List<T>, Dictionary<T>, etc.

Language Impacts

When an assembly is compiled with native lengths, the change of type for lengths and indexing from int to nint means that existing code will need to be ported to properly support this new feature. Specifically the following constructs will need updating.

for (int i = 0; i < array.Length; ++i) { /* ... */ } // If length is greather than what's representable with int, this will loop forever.

for (var i = 0; i < array.Length; ++i) { /* ... */ } // Same as above. A lot of code is written like this rather than using an explicit int type.

The correct version should be the following.

for (nint i = 0; i < array.Length; ++i) { /* .... */ }

Unfortunately, due to implicit conversion the C# 9.0 compiler does not complain when comparing an int to an nint. It is therefore proposed that the compiler warns when an implicit conversion occurs from int to nint for the following operations: a < b, a <= b, a > b, a >= b and a != b.

Runtime Impacts

In order to accommodate these changes, various parts of the dotnet 64-bit runtime need to be modified. Using C# nint and C++ ssize_t, these modifications are in theory backward compatible with the existing 32-bit runtime.

The proposal assumes that these are always enabled, irrespective of whether any assembly is marked as legacy lengths or native lengths.

The C++ runtime implementation of .NET arrays and strings need to use ssize_t to store their length (TODO I believe this is already the case, verify it).
The .NET runtime implementation of Span<T> and ReadOnlySpan<T> need to internally store their length as nint.
The .NET runtime implementation of System.Array needs to be updated to reflect the same type changes.
The JIT needs to be aware of native lengths when generating code that accesses arrays and strings, for both the Length property and the indexing operator.
The C# compiler and JIT compiler need to generate foreach code that is nint aware for arrays, string, Span<T> and ReadOnlySpan<T>.
The GC implementation currently assumes that containers and arrays have up to 2^31 elements and outgoing references. This limitation needs to be lifted.

Boundaries Between Assemblies

Calling Legacy Lengths Assembly From Native Lengths Assembly

Without any further change, a native length assembly passing an array (or any other types covered by this proposal) with a large length (i.e. greater than 2^31) to a legacy lengths assembly would result into an undefined behaviour. The most likely outcome would be the JIT truncating the Length property to its lowest 32-bit, causing the callee assembly to only loop on a subset of the given array.

The proposed solution is to follow C# 8.0 nullables compiler guards and warn/error when such a boundary is crossed in an incompatible way. As with nullables, the user can override this warning/error (TODO exact syntax TBD, can we reuse nullables exclamation mark? Or do we stick to nullable-like preprocessor directives? IMHO the latter is probably enough).

In practice, applications rarely need to pass large amount of data to their entire source code and dependencies domain. We foresee that most applications will only need to make a small part of the source code and dependencies native lengths compatible. Thus enabling a smoother transition to a native lengths-only C# future version.

Calling Native Lengths Assembly From Legacy Lengths Assembly

The proposed aforementioned runtime changes mean this should work as expected.

GrabYourPitchforks commented 3 years ago

@GPSnoopy That proposal seems to capture things fairly well. A few additions / changes I'd recommend:

Currently strings store their backing length as uint32, not as size_t. Changing the backing length to be native word sized will increase the size of all string instances by 4 bytes on 64-bit platforms. It also may necessitate changes to debugger tooling like sos, which I believe currently hard-code the layout of string instances.
Do you see changing [ReadOnly]Memory<T> at the same time as array, string, span?

Additionally, do you have a proposal for what we do with publicly-exposed System.* APIs? Consider the following code sample.

byte[] bytes = GetSomeBytes();
byte[] moreBytes = GetMoreBytes();
byte[] concat = bytes.Concat(moreBytes).ToArray(); // System.Linq

Presumably runtime-provided methods like Enumerable.ToArray<T>(this IEnumerable<T> @this) should be callable both from a legacy-length assembly and a native-length assembly. But the immediate caller would likely expect the runtime behavior to depend on the compilation mode of the caller. Otherwise you could end up with a native-length caller invoking this API and getting an improper OutOfMemoryException or a legacy-length caller invoking this API and getting back a too-large array. Do we need to make a modification to your proposal to account for this scenario?

GSPP commented 3 years ago

I honestly think the set of NativeSpan overloads we add within the runtime and libraries would be fairly small

I believe that is true. Currently, most collections in practical applications contain far less than 2 billion elements. Why is that? It's not because of technology limitations. It's because IT applications usually model real world concepts and the real world very rarely has more than 2 billion items for any type of object. Even Facebook has only 1 billion users.

I find it remarkable that the need for more than 2 billion elements is so exceedingly rare.

If .NET adds support for large arrays, then the resulting collections and algorithms should be considered to be rarely used, not mainstream. Large arrays are an enabling feature so that more advanced libraries can be built on top. Large arrays should not be seen as a centerpiece of the platform permeating everything.

GPSnoopy commented 3 years ago

@GrabYourPitchforks Thank you for the feedback, you raise valid points. I've tried to address them, or at least highlight the gaps in https://github.com/GPSnoopy/csharplang/commit/830a3d8d3898b9a26066eee09e3493f9691d2edf (and a small addendum in https://github.com/GPSnoopy/csharplang/commit/804df79a564e3515cda862a4afea5f513fde4d5a).

I've checked the CLR source code. You are correct. I've highlighted that the proposal is okay with extending the memory used by strings. But marked in such a way that this is not a clear cut decision, and therefore might need to be revisited. Also added the impact to debuggers and other external tools.
[ReadOnly]Memory<T> is on oversight on my part. I've added them to the proposal.
Good point about System.Linq. I realise that the proposal only considered passing arrays & co as function arguments, but not when returning them. For the moment I suggest a runtime check when crossing boundaries in that case.

GrabYourPitchforks commented 3 years ago

It is therefore proposed that the JIT compiler adds an automatic runtime length check whenever a method ... crosses the boundary from a native lengths assembly to a legacy lengths assembly.

This would not be sufficient. The problem is indirection, such as through interfaces and delegate dispatch. It's not always possible to know what behavioral qualities the caller / target of an indirection possesses. (It's one of the reasons .NET Framework's CAS had so many holes in it. Indirection was a favorite way of bypassing the system.)

In order to account for this, you'd need to insert the check at entry to every single legacy method, and on the return back to any legacy caller. It cannot be done only at legacy / native boundaries.

Edit: It also didn't address the problem I mentioned with LINQ and other array-returning methods. The caller might expect the runtime method implementation to have a specific behavior, such as succeeding or throwing, independent of any checks that occur when boundaries are crossed. For example, MemoryMarshal.AsBytes is guaranteed to throw OverflowException if the resulting span would have more than 2bn elements. Array.CreateInstance is guaranteed to throw ArgumentOutOfRangeException or OutOfMemoryException. And so on. Making all of these methods native-aware and having the check performed at the return would change the caller-observable behavior when these methods are invoked by legacy applications.

Given the above paragraph, it might be appropriate to say "this new runtime capability changes contracts for existing APIs, even if no user assembly in the application is native-aware." It's fine to call out stuff like that in the doc, since it contributes to the trade-off decisions we need to make.

Edit x2: Oh, and fields and ref arguments! Those have visible side-effects even in the middle of method execution. (Example: the ref is to a field of an object, and that field is visible to other threads running concurrently.) Detecting whether a native assembly is about to set a field in use by a legacy assembly isn't something the JIT can know at method entry or method exit. This limitation should also be captured, as it influences whether the Array.Length property could be a simple narrowing cast or whether we'd need to insert "if out of bounds" logic into it.

tannergooding commented 3 years ago

While I agree that most of the ecosystem likely won't receive large data as inputs, I think its important that the ecosystem be encouraged to move to simply handle it.

When you are talking about large data, there isn't really a difference between naively handling 500 million elements vs 2 billion elements vs 1 trillion or more elements. I'd expect most code paths are likely optimized for a few hundred to a few thousand inputs at best, with a few special cases being optimized for millions of inputs.

I'd think you'd then have a few cases:

Trivial for loops which are easy to analyze and recognize. These can simply be updated to use .NativeLength rather than .Length and to ensure x cmp y is nint cmp nint
foreach loops likely need little to no change unless you are explicitly tracking a count
LINQ methods where you use things like .Count or .ToList. These are already potentially problematic because nothing requires enumerable to be less than 2 billion or even finite in size. LINQ already handles this be having a checked context around the fallback counter.

There are also two distinct cases for moving to NativeLength. One of which is on managed types (like T[] or List<T>) where it is largely a potential perf optimization (the JIT actually needs changes here for this to work as expected) and future proofing and the other of which is Span<T> (and possibly Memory<T>) where there is an immediate use case with native data.

Without the GC revving, there is actually no change to user code allocating managed arrays. Just a recommendation to switch to NativeLength for forward compatibility. Therefore, there can be no perf hit "today" for users working with Array.Length until said changes happen.

While with Span<T> as soon as we expose it (assuming we also expose a constructor that takes nint), users can start creating incompatible versions. This means Span<T> paths need to update or may end up having a checked downcast from nint inserted into their code.

I would imagine any checking we do could be public int Length => checked((int)(NativeLength)) and/or inserted by the JIT as part of bounds checking and which just like bounds checks, could be elided in many many common scenarios.

GPSnoopy commented 3 years ago

@GrabYourPitchforks Added your concerns verbatim to the proposal document.

Personally I would like to see if we can avoid adding such a check absolutely everywhere (e.g. what about non-virtual method calls?). Or find a better alternative.

GrabYourPitchforks commented 3 years ago

Personally I would like to see if we can avoid adding such a check absolutely everywhere (e.g. what about non-virtual method calls?).

Reflection and delegate dispatch may involve indirection to non-virtual methods. And it's not always possible for the runtime to reason about the caller or the target of such indirection. (See also my earlier comments on CAS and why it was so broken.)

tfenise commented 3 years ago

It's not just about compatibility, that is, no problem would arise if we were to redesign everything from scratch. To use large arrays, indices also need to be extended to native size (64 bits), and that introduces problems.

verelpode has already pointed out:

Always? That's controversial. For example, if 64-bit nint is used inside a Dictionary<int,int> for ordinals/indexes (hash buckets), then its RAM consumption would increase to 160% of its current consumption, also leading to increased runtime duration and processor L1/L2 cache line misses. Likewise for HashSet<T> (150-160%).

Combined with apps needing multiple instances of these collection objects, this is a significant performance degradation in exchange for zero benefit for most programs (normal programs that don't process big-data).

How do we deal with this possible performance degradation in such collection types if ordinary arrays were allowed to be very large?

Just extend the indices to native size (64 bits). Ignore everyday programmers' complaints. This doesn't sound very good.
Do not extend. Keep those collections as it is, and state that they do not work with too many elements. The problem is that, all codes that use those collections may not work if given large array inputs. For example, suppose I as a library developer write:
```
public long[] Distinct(long[] array)
{
HashSet<long> set = new();
foreach (long x in array)
{
    set.Add(x);
}
return set.ToArray();
}
```
This does not work if array contains too many distinct elements. To make this code large-array-aware, I need to be aware that HashSet<> does not support too many element and correct it, probably implementing my own large-array-aware MyHashSet<>. To make sure the corrected code is really large-array-aware, I need to write tests to test it against large array inputs. Such large long[] is at least of 16GB and I need a high-end computer with 32+GB of RAM to do such tests. Probably I would feel that large arrays are so rare, and feel lazy and simply drop large array support.
Make the collections smart in a way that they use 32-bit indices for not so many elements, and automatically switch to 64-bit indices if there are so many elements. This may still degrade performance, and adds to complexity.

Even if the BCL took the third approach, third-party collection developers may still feel lazy and choose other approaches instead. They either use int or nint for indices. Arrays being indexed through nint and the inability to index arrays with int without warnings may be enough to make them choose nint and thus avoid the second approach, which looks the worst, but not always enough. There are non-array-based collections such as binary trees, and a binary tree implementation may happen to use int for indices and counts while getting no warning, which means the second and the worst approach.

All of these "support-large-arrays-or-not" troubles are not present currently, because int is the single most "natural" integral type in .NET/C#, so codes probably work given not-so-big-relative-to-int inputs. If nint became the recommended way to index arrays, then there would be two most "natural" integral types in .NET/C#: int and nint. There would not be such guarantee that codes probably work given not-so-big-relative-to-nint inputs.

Personally I'd vote for introducing new types like LargeArray<T> or NativeSpan<T>, rather than enabling the existing T[] to be large. As I said, enabling the existing T[] to be large introduces two most "natural" integral types in .NET/C# and there would not be a guarantee that codes probably work given not-so-big-relative-to-nint inputs. Keeping a single array type does not make a difference if there would be codes that support large arrays and codes that do not. Introducing new types makes it explicit on the api surface whether the library supports large arrays. What's more, large arrays are not common, and may need special (e.g. vectorized, memory-saving) measures to deal with efficiently anyway.

hxmcn commented 2 years ago

My suggestion: the next version of .net (.net 7, or .net 8 for LTS) should be 64 bit only, and make everything to 64 bit as possible as it can. And, there's no property named LongSize / LongLenght, just Size/Length but all Int64 in data type. Sure, that's breaking change, but from a long-term perspective，such change is acceptable sand intuitive.

Previous Next