Closed Korporal closed 5 years ago
Though it's probably too simplistic for your needs. That is, it's either exactly public fixed string _name[32]; and that simply generates a hidden int length field and the 32 char fixed array.
Or it's something far more complicated, that allows you to customize the character type and the length field type (or the lack of it). And the more complicated it gets, the less likely it is for it to happen.
How is your struct better
Simple. I can do it today with no difficulty. I don't need anything else because it's already available and pretty easy (both codewise and conceptually)
I mean, your question isn't that relevant. It's like asking 'why are methods better than this hyperspecialized method-like construct I'm proposing for an exceptionally niche scenario'.
You keep flipping the burden around. It's not my responsibility to explain why the status quo is better. The onus is on you to defend why the language needs to change here for your specific needs.
Namely we can't create flexible types that encapsulate the boiler plate stuff unless we create a family of them.
I don't understand this. Why is that the case? You have ReadOnlySpan<char> and Span<char>. Why can't you write helpers that work with those? I mean, lots of those helpers already exist as extensions. What are you missing? You're woefully under-specifying what you're actually looking for here.
i.e. you're saying "C# needs to provide something to help here" and when alternatives are offered you say they are insufficient, but you haven't explained why they're insufficient. But you then use that as a continued argument for why the language needs to do something. But I don't even know what it is you want from the language because the deficiencies in what's there with Span/ROS aren't explained.
@CyrusNajmabadi
I don't understand this.
That's painfully clear to us.
@Korporal Who exactly is the us? You seem to be the only person asking this. CyrusNajmabadi is one of the main contributors to Roslyn, and was a member of the LDC. He has an extremely thorough understanding of both language and compiler design. If you want your proposal accepted, you are going to need to convince people like him it's a good idea. Insults don't help your case. Perhaps instead of lashing out, you could take some of the feedback he offers into consideration. He has often disagreed with me in the short time I've participated in this repo and in Roslyn, and he is almost always right.
@YairHalberstadt
Mikedn's replies to me make it clear that he understands the problem. Cyrus genuinely seems not to. This is not an insult; he himself said he doesn't understand, and I agree with that.
Incidentally, I've dealt with these kinds of issues in C# for over a decade and I'm no lightweight myself, I've also developed compilers. In fact I was the first to report this C# compiler bug a month ago, precisely because of high performance C# work.
@CyrusNajmabadi - Alright I will try yet again to explain to you the problem I am describing.
How is your struct better
Simple. I can do it today with no difficulty. I don't need anything else because it's already available and pretty easy (both codewise and conceptually)
Yours and mine are both something we can "do today", so that is hardly "better", Cyrus. Now "need" is subjective and no doubt a common theme when discussing programming languages; since it's subjective we have no formal definition.
Now here is my struct followed by your proposed struct:
// This is a single contiguous block of bytes and contains no reference types.
public struct LoginMessage
{
AString_32 UserName;
AString_16 Password;
AString_8 OtherStuff;
long MoreOtherStuff;
DateTime SomeDate;
}
LoginMessage msg = new LoginMessage();
msg.UserName = "Charlie"; // current implementation is null-terminated text
msg.OtherStuff = "Other";
byte[] bytes = RuntimeSupport.Serialize(ref msg); // often less than a microsecond on an i7-3960
and your proposal:
public unsafe ref struct LoginMessage2
{
public fixed char _userName[32];
public fixed char _password[16];
public fixed char _otherStuff[8];
public long MoreOtherStuff;
public DateTime SomeDate;
public ReadOnlySpan<char> UserName()
{
fixed (LoginMessage2* c = &this)
{
return CreateSpan(c->_userName, 32);
}
}
public ReadOnlySpan<char> Password()
{
fixed (LoginMessage2* c = &this)
{
return CreateSpan(c->_password, 16);
}
}
public ReadOnlySpan<char> OtherStuff()
{
fixed (LoginMessage2* c = &this)
{
return CreateSpan(c->_otherStuff, 8);
}
}
private static ReadOnlySpan<char> CreateSpan(char* pointer, int charCount)
// The span length counts char elements, not bytes, so no "* 2" here.
=> new ReadOnlySpan<char>(pointer, charCount);
}
If you think that LoginMessage2 is "better" than LoginMessage then we're at an impasse and I cannot force you to adjust your view.
Yours requires the developer of LoginMessage2 to write a set of properties, and this number increases as the number of (buffer) string fields increases. It also exposes Spans rather than strings, making the manipulation of a LoginMessage2 all the more verbose.
Yours requires the developer to ensure that the 32 or the 16 (buffer sizes) is repeated correctly in both the buffer declaration and the property that manipulates the buffer; mine does not.
Your code would break if the developer altered a buffer declaration length but forgot to alter the property too, mine has no such shortcoming.
As is clear from what I've said so far, my LoginMessage works fine, it runs - "today" - which is why I'm puzzled you would use "do it today" as some form of differentiator between what we do "today" and what you can write "today"; it isn't.
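For readers following the size-duplication point, here is a minimal sketch of one way to keep the capacity in a single place with today's C#. It is not code from either participant: LoginMessage3 and UserNameCapacity are hypothetical names, and the buffer is assumed to hold null-terminated UTF-16 text. It relies only on the fact that a fixed buffer's size may be any constant expression.

```csharp
using System;

// Hypothetical variant of LoginMessage2; all names are illustrative only.
public unsafe struct LoginMessage3
{
    // Single source of truth for the capacity: used by the declaration
    // and by the accessor, so the two cannot drift apart.
    private const int UserNameCapacity = 32;
    private fixed char _userName[UserNameCapacity];

    public string UserName
    {
        get
        {
            fixed (char* p = _userName)
            {
                // Assumption: text is null-terminated unless it fills the buffer.
                var chars = new ReadOnlySpan<char>(p, UserNameCapacity);
                int length = chars.IndexOf('\0');
                if (length < 0) length = UserNameCapacity;
                return new string(p, 0, length);
            }
        }
        set
        {
            string text = value ?? string.Empty;
            fixed (char* p = _userName)
            {
                var destination = new Span<char>(p, UserNameCapacity);
                destination.Clear(); // zero-fill so shorter text stays terminated
                // Longer text is silently truncated to the capacity.
                text.AsSpan(0, Math.Min(text.Length, UserNameCapacity)).CopyTo(destination);
            }
        }
    }
}
```

The property copies to and from a string, so it still pays the allocation cost discussed elsewhere in the thread, and each field still needs its own property; the sketch only removes the repeated literal.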
I mean, your question isn't that relevant. It's like asking 'why are methods better than this hyperspecialized method-like construct I'm proposing for an exceptionally niche scenario'.
Yet I said no such thing Cyrus.
You keep flipping the burden around. It's not my responsibility to explain why the status quo better. The onus is on you to defend why the language needs to change here for your specific needs.
I'm suggesting that formal serious consideration be given to enabling the creation of code like LoginMessage without the need for me to create a large set of types (AString_32 etc). The simplicity of LoginMessage should be crystal clear and enabling this at a language level is what I'm discussing, as I said @mikedn clearly understands what I'm discussing so perhaps you should read some of his replies.
I really cannot help you understand any further and I have no idea why you cannot understand my position here. I refuse to repeat myself any further and if that means I earn your disfavor and the issue gets closed - so be it.
Thank you.
@Korporal What you can't do though, is use your AString_n types in methods that accept a string. And you would need huge code duplication to get a method to work for all AString_n types.
However the ecosystem already supports Span in a lot of places. That is the advantage.
@YairHalberstadt
Thanks, if the language (or CLR) cannot be changed to provide this then that's fine - I am simply seeking to examine alternatives. If we can't add this to the language then fine, our current strategy of generating a family of AString_XX types works well but is not ideal; it certainly offers far more than Cyrus's proposed approach - IMHO.
Can you think of any way the language could be changed to support this, excluding duplicating every method that accepts a string to accept all sizes of AString_n types?
And the CLR is not going to be changed to support this. Changes to the CLR API only ever occur when there is an overwhelming benefit to do so, and the .Net Framework API looks like it's not going to be updated at all.
Can you think of any way the language could be changed to support this, excluding duplicating every method that accepts a string to accept all sizes of AString_n types?
@YairHalberstadt - That's a great question, I had hoped there'd be suggestions from the gurus here but clearly this is not straightforward. I will post some ideas with more detail for you guys to consider/critique.
Thx
I think the only way to do so would be using ReadOnlySpan and/or ReadOnlyMemory.
But once you are doing that I believe your string types could be generated once using CodeGen, and you're sorted. No need to add a language feature to do CodeGen for you, unless it's a seriously common use case.
@YairHalberstadt @CyrusNajmabadi @mikedn
One idea is to leverage the upcoming support for interfaces with default implementations. Then we could write:
public struct UserName : IValueTypeString<UserName>
{
private fixed char text[32];
}
where we have
public interface IValueTypeString<T> where T : unmanaged
{
// Stuff to get at and manipulate the "text" field in "this" instance.
// Also need to get at the length of "text" too.
// Ideally include static members so we can cache details from reflection.
public string Text
{
get {...}
set {...}
}
}
I'm unfamiliar with the new interface type's rule so this may not even work. But if it did then this would be a step forward because a developer could create an inline string quite easily, eg. our AString_32 would become:
public struct AString_32 : IValueTypeString<AString_32>
{
private fixed char text[32];
}
Although the developer does need to define the type it is very easy for them to do so, the underlying interface would do most of the manipulation/conversion in a general purpose way.
That would require boxing the struct every time a method is called on it
@YairHalberstadt
What about some variant of the String type then?
I can envisage a type - like String (call it VString for now) - in which the type is a value type that contains a fixed buffer along with an actual instance of (slightly modified) String in which the string's buffer pointer is the address of the fixed buffer rather than some block allocated from the managed heap...
In principle all the data would be inline in the declaring outer struct...
Basically this amounts to an ability to allocate a String object and its text buffer inline - in the struct's memory block - rather than on the managed heap.
These are just thoughts and no doubt bad!
What's wrong with this as a code-generated API?
using System;
public unsafe struct AString_32
{
public fixed char chars[32];
public ReadOnlySpan<Char> AsSpan()
{
fixed (char* c = chars)
{
return new ReadOnlySpan<Char>(c, 32); // length is in chars, not bytes
}
}
}
@YairHalberstadt
What's wrong with this as a code-generated API?
using System;
public unsafe struct AString_32
{
public fixed char chars[32];
public ReadOnlySpan<Char> AsSpan()
{
fixed (char* c = chars)
{
return new ReadOnlySpan<Char>(c, 32); // length is in chars, not bytes
}
}
}
Nothing wrong, but it's less powerful than what we generate already:
public interface INativeString
{
string ToString();
}
public unsafe struct AString_8 : INativeString, IComparable
{
public const int MaxLength = 8;
public fixed Byte buffer[9];
public AString_8(String InitialText)
{
fixed (Byte * p = buffer) {p[0] = 0;}
if (InitialText == null)
return;
Text = InitialText;
}
public override string ToString()
{
fixed (Byte* p = buffer) return (StringWrapper.ANSIPtrToString(p, sizeof(AString_8)));
}
private string Text
{
set{fixed (Byte* p = buffer) StringWrapper.StringToANSIPtr(value, p, sizeof(AString_8));}
get{return(ToString());}
}
public static implicit operator AString_8(string SourceText)
{
return new AString_8(SourceText);
}
public static implicit operator string (AString_8 SourceText)
{
return(SourceText.ToString());
}
public int Length
{
get{return(Text.Length);}
}
int IComparable.CompareTo(object obj)
{
return(Text.CompareTo(obj));
}
}
Looking at String, it's pretty complex; being able to leverage this logic or clone it in some way so the instance and its buffer are both allocated inline (within a struct's field block) would be interesting.
Here's one place where the buffer is accessed - the String code may be largely agnostic to where the buffer actually is.
I can envisage a type - like String (call it VString for now) - in which the type is a value type that contains a fixed buffer along with an actual instance of (slightly modified) String in which the string's buffer pointer is the address of the fixed buffer rather than some block allocated from the managed heap...
What will happen when you do something like this:
string M()
{
var vString = new VString("HelloWorld");
return vString.String;
}
Then you would have a pointer to invalid memory in your string
Essentially this is impossible without an ownership model.
What you're suggesting is doable in C++, and idiomatic in Rust. It is however impossible in C#.
Your current code generated API requires boxing the struct, allocating a new string, and copying over the chars into the new string every time you want to call a string method on it. Using a ReadOnlySpan solves that problem.
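To illustrate the allocation point, here is a hedged sketch of a comparison that reads the fixed buffer through a ReadOnlySpan<char> and never creates a string. EqualsOrdinal and Capacity are hypothetical names, and the buffer is assumed to hold null-terminated UTF-16 chars, which differs from the ANSI byte buffers in the generated AString_8 shown earlier.

```csharp
using System;

public unsafe struct AString_32
{
    public const int Capacity = 32;
    public fixed char chars[Capacity];

    // Compares the stored text with a string without allocating a new string.
    public bool EqualsOrdinal(string other)
    {
        fixed (char* p = chars)
        {
            // The span is only used inside the fixed block, so the pointer
            // stays valid for as long as the span is alive.
            var stored = new ReadOnlySpan<char>(p, Capacity);
            int end = stored.IndexOf('\0');
            if (end >= 0) stored = stored.Slice(0, end);
            return stored.SequenceEqual(other.AsSpan());
        }
    }
}
```

Keeping the span inside the fixed block sidesteps the lifetime question that arises when a span over a fixed buffer is returned to the caller.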
@YairHalberstadt
I can envisage a type - like String (call it VString for now) - in which the type is a value type that contains a fixed buffer along with an actual instance of (slightly modified) String in which the string's buffer pointer is the address of the fixed buffer rather than some block allocated from the managed heap...
What will happen when you do something like this:
string M()
{
var vString = new VString("HelloWorld");
return vString.String;
}
Then you would have a pointer to invalid memory in your string
Essentially this is impossible without an ownership model.
What you're suggesting is doable in C++, and idiomatic in Rust. It is however impossible in C#.
We could impose a rule similar to that used for fixed buffers - only valid within a struct...
Your current code generated API requires boxing the struct, allocating a new string, and copying over the chars into the new string every time you want to call a string method on it. Using a ReadOnlySpan solves that problem.
Yes, the code is dated, however, and I think it could be improved by using the recently enhanced ref support, but I'd have to dive in to get more on that.
Anyway the main goal is to have the raw data inline - that's what enables very fast serialization; the overhead of setting and getting the string is secondary.
For example we can write a stream of messages to a disk file very rapidly (and read from a file) because the serialization support includes length and type data. The runtime cost of getting at this or that string property isn't a big concern.
We can (for example) get at message 124,236 in a file and deserialize it very rapidly indeed.
@YairHalberstadt
What I find interesting (and this is not a criticism of anyone, the team or the language) is that something that seems on the surface straightforward actually presents such a big challenge.
We could impose a rule similar to that used for fixed buffers - only valid within a struct...
So how would you ever use it? You can't pass it into a method which accepts a string, as maybe the method stores the string.
So if you have a codegened API that works for you, what exactly do you need from the language?
@YairHalberstadt
So if you have a codegened API that works for you, what exactly do you need from the language?
Simply because the pre-generated code cannot include every conceivable buffer capacity: we gen AString_8, AString_16, AString_24 up to something like AString_10240, with in-between sizes unavailable.
This doesn't kill us but I wanted to explore (with the experts) possible options for making this a first class language feature, if this is truly very challenging and costly then that's fine but I am not the best judge - you guys are.
My frustration with Cyrus is that he didn't seem to know what I was trying to explain and that was becoming an impasse.
If all you want is some codegened types which someone else is responsible for maintaining, then why don't you suggest they add them in CoreFX?
As far as I can see your current proposal has two parts.
A) provide a shorthand syntax for declaring these fixed size strings (string(32) instead of string_32). Not going to happen - no upside to this.
B) make a string_32 usable as a normal string. This is impossible given the programming model of the CLR. The best you can do is use ReadOnlySpan, but that doesn't require any changes from the language.
So what exactly are you asking for?
If all you want is some codegened types which someone else is responsible for maintaining, then why don't you suggest they add them in CoreFX?
As far as I can see your current proposal has two parts.
A) provide a shorthand syntax for declaring these fixed size strings (string(32) instead of string_32). Not going to happen - no upside to this.
B) make a string_32 usable as a normal string. This is impossible given the programming model of the CLR. The best you can do is use ReadOnlySpan, but that doesn't require any changes from the language.
So what exactly are you asking for?
@YairHalberstadt - The starting problem statement is asking if it would be possible for C# to support a mutable, inline, fixed capacity, variable length string "type" so that we can create pure value type structs that contain text values as well as primitive values.
Currently pure (as in "unmanaged" generic constraint) structs can only be composed of primitive types or other structs composed of primitive types none of which have any text/string like capabilities.
Recognizing that inline fixed buffers are already supported I wanted to see if that support could be enhanced or built upon these as a possible means of doing this. Being able to assign these from and to a conventional string is the primary goal.
The answer is no.
That is fundamentally not how the .Net programming model works.
@YairHalberstadt - What about some additional operators then, for example tofixed and tostring:
public struct SomeMessage
{
private fixed char username[32];
public string Username
{
get {return tostring(username);}
set { tofixed(value,username);}
}
}
These operators being confined to working with fixed buffers? This would be better overall than having to generate the code we do; despite the fact the developer must define the property, it's very easy to do - with some kind of "operator" like this.
Note that we can't write (e.g. static) helper methods like this now because getting the capacity of an arbitrary fixed buffer is very hard to do, unless we jump through hoops (as I show in a different thread).
Being able to assign these from and to a conventional string is the primary goal.
This is impossible without CLR support, and that's unlikely to happen as System.String can be passed around arbitrarily and doing such with stack space is inherently dangerous, hence the strict rules that C# has around ref locals/returns. As it stands System.String is always heap allocated*. Your current APIs don't avoid these allocations or their costs, they just defer them. And that I think would greatly impact what you consider your deserialization performance if you're not also taking into account the cost of negotiating the string properties of those structs.
* I want to say that I've seen hacks that would allow you to treat stack space as a managed heap object, but you'd have to allocate that memory to match what the reference type expects. For System.String that would be a length and a pointer to the actual string data, so you'd be forced to rewrite the buffer to match that format with the pointer pointing to a location in the buffer. You wouldn't be able to deserialize any blob of bytes as-is.
Why not just write a function toString and toFixed?
The general consensus among C# language wonks, is that C# has too many operators to start off with. An operator just adds complexity to the language with very little benefit. Especially for such a rare scenario as yours.
@YairHalberstadt - see recent edit:
Note that we can't write (e.g. static) helper methods like this now because getting the capacity of an arbitrary fixed buffer is very hard to do, unless we jump through hoops (as I show in a different thread).
Then the size of operator for fixed size buffers is the relevant addition to the language, not these operators.
Besides, you could currently just cache the length and pass that in. The effort of doing so is not worth a language feature.
@HaloFour
Being able to assign these from and to a conventional string is the primary goal.
This is impossible without CLR support, and that's unlikely to happen as System.String can be passed around arbitrarily and doing such with stack space is inherently dangerous, hence the strict rules that C# has around ref locals/returns. As it stands System.String is always heap allocated*. Your current APIs don't avoid these allocations or their costs, they just defer them. And that I think would greatly impact what you consider your deserialization performance if you're not also taking into account the cost of negotiating the string properties of those structs.
We're not too concerned about the conversion costs, the alternative is a different form of serialization where String is fully supported in our message types. But that immediately becomes a far greater cost than what we do now (we compared this) and prevents us from passing pointers to these structs around, this is another point (and why we did some of this) is that we can create structs that contain text fields yet we can get their address - not possible when struct contains reference types.
A key cost in high performance system like trading systems and so on is needlessly moving data, the less data you move and the faster you can move it the better. Particularly when you make heavy use of IPC as we do.
* I want to say that I've seen hacks that would allow you to treat stack space as a managed heap object, but you'd have to allocate that memory to match what the reference type expects. For System.String that would be a length and a pointer to the actual string data, so you'd be forced to rewrite the buffer to match that format with the pointer pointing to a location in the buffer. You wouldn't be able to deserialize any blob of bytes as-is.
@YairHalberstadt
Then the size of operator for fixed size buffers is the relevant addition to the language, not these operators.
Yes this is probably a better request.
Besides, you could currently just cache the length and pass that in. The effort of doing so is not worth a language feature.
I'm inclined to agree but the caching incurs a runtime cost (even after being cached to a dictionary) all to get a simple integer constant that was known at compile time. The more types and buffer sizes one has the greater that cost becomes too as the dictionary grows.
Getting the physical size of a fixed buffer (which is always wholly composed of 'n' fixed size primitive types) should I argue, not require user code, caches etc and the associated cost - this is a compile time constant don't forget.
How complex would it be to enable sizeof to accept an identifier that is a fixed buffer declaration, which simply returns n * sizeof(buffer_type) - a compile time constant?
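As a point of comparison, the capacity can already be recovered without a cache when the buffer is wrapped in its own struct, because sizeof works on the wrapper type. The sketch below is an illustration under assumptions, not the code used in this thread: FixedText and its members are hypothetical names, the wrapper is assumed to contain nothing but the fixed char buffer, and MemoryMarshal.CreateSpan requires .NET Core 2.1 / .NET Standard 2.1 or later.

```csharp
using System;
using System.Runtime.InteropServices;

public unsafe struct AString_32 { public fixed char chars[32]; }
public unsafe struct AString_16 { public fixed char chars[16]; }

// Hypothetical helpers shared by every wrapper size.
public static class FixedText
{
    // Capacity in chars, derived from the size of the wrapper struct.
    // Assumption: the wrapper holds only the fixed char buffer.
    public static unsafe int Capacity<T>() where T : unmanaged
        => sizeof(T) / sizeof(char);

    // Views the wrapper's storage as a span of chars without copying.
    public static Span<char> AsChars<T>(ref T value) where T : unmanaged
        => MemoryMarshal.Cast<T, char>(MemoryMarshal.CreateSpan(ref value, 1));

    // Copies text into the buffer, zero-filling the rest and truncating
    // anything beyond the capacity.
    public static void Set<T>(ref T target, string text) where T : unmanaged
    {
        Span<char> chars = AsChars(ref target);
        chars.Clear();
        text.AsSpan(0, Math.Min(text.Length, chars.Length)).CopyTo(chars);
    }

    // Reads the text back, stopping at the first null char if there is one.
    public static string Get<T>(ref T source) where T : unmanaged
    {
        Span<char> chars = AsChars(ref source);
        int end = chars.IndexOf('\0');
        return (end < 0 ? chars : chars.Slice(0, end)).ToString();
    }
}
```

This does not answer the request for sizeof on a bare fixed buffer field, and a family of wrapper types is still needed; it only shows that the conversion helpers themselves need not be duplicated per size.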
public interface IFixedBuffer
{
int FixedBufferLength { get; }
}
public static class FixedBufferExtensions
{
public static string ToString<T>(this T buffer) where T : IFixedBuffer
{
var length = buffer.FixedBufferLength;
...
}
}
using System;
public unsafe struct AString_32 : IFixedBuffer
{
public fixed char chars[32];
public ReadOnlySpan<Char> AsSpan()
{
fixed (char* c = chars)
{
return new ReadOnlySpan<Char>(c, 32); // length is in chars, not bytes
}
}
public int FixedBufferLength => 32;
}
@Korporal
I think the main problem with your arguments is that you're asking for a language feature that would specifically benefit your codebase. I don't think that's going to happen, not without demonstrating how that feature would benefit many other codebases.
Specifically:
Anyway the main goal is to have the raw data inline - that's what enables very fast serialization; the overhead of setting and getting the string is secondary.
I would like to see some evidence for that. It seems to me that you're not eliminating costs, you're just moving them around. That can be beneficial in some cases (e.g. when you're working with a single property on a large type), but are those cases widespread enough?
We can (for example) get at message 124,236 in a file and deserialize it very rapidly indeed.
That's an argument for fixed-width serialized format, but not necessarily fixed-width in-memory format. Also, a similar effect could be achieved by using a variable-width format along with an index, or even a database.
A key cost in high performance system like trading systems and so on is needlessly moving data, the less data you move and the faster you can move it the better.
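That's what confuses me about your approach: you are needlessly moving data, when compared with simple string fields:
- When you write a property, you copy the whole string, instead of just a pointer.
- When you read a property, you always allocate the string (which includes a copy), instead of allocating it only once at deserialization. (And you could probably do even better if you used Span<char> instead.)
- When you copy the struct, you copy all the strings, instead of just a few pointers.
@svick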
@Korporal
I think the main problem with your arguments is that you're asking for a language feature that would specifically benefit your codebase. I don't think that's going to happen, not without demonstrating how that feature would benefit many other codebases.
It seems that you're correct here, also from what others say even wide appeal features stand only a small chance of getting included.
Specifically:
Anyway the main goal is to have the raw data inline - that's what enables very fast serialization; the overhead of setting and getting the string is secondary.
I would like to see some evidence for that. It seems to me that you're not eliminating costs, you're just moving them around. That can be beneficial in some cases (e.g. when you're working with a single property on a large type), but are those cases widespread enough?
Consider updating say an option price, we can do it pretty much like this:
Option * option_ptr = datastore.GetItem<Option>(key); // can be updated soon to use new "ref" support.
option_ptr->bid_price = new_price;
This is a tiny cost (including the GetItem()) and enables updates to data at a very high rate and very low CPU cost, perhaps just 8 bytes change (e.g. a Decimal) despite the fact the Option might have many fields (including text fields like name, exchange etc).
The datastore incidentally is rather specialized and proprietary and local to the machine running the update operations, we can write to the store like this for example:
Option some_new_option = ...;
datastore.Write(ref some_new_option);
Because the code (a bit dated now but we can convert a ref to a ptr and vice versa with support code) can serialize very rapidly (using what I'm calling "memcpy" for ease of discussion) this too is very fast and low CPU.
We can (for example) get at message 124,236 in a file and deserialize it very rapidly indeed.
That's an argument for fixed-width serialized format, but not necessarily fixed-width in-memory format. Also, a similar effect could be achieved by using a variable-width format along with an index, or even a database.
As soon as the format begins to deviate from its in-memory layout you begin to incur significant costs. Nothing comes close to a single "memcpy" (e.g. CopyBlock). We can do this and have "strings" because of the AString_XX stuff we have.
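A rough sketch of what such a single block copy can look like for any struct without reference fields follows. It is not the RuntimeSupport.Serialize shown earlier, whose implementation is not given in this thread; BlittableSerializer and its methods are hypothetical names, and unlike the serializer described above it writes no length or type header.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper; works for any struct that satisfies "unmanaged".
public static class BlittableSerializer
{
    // Copies the struct's in-memory bytes into a new array in one step.
    public static byte[] Serialize<T>(ref T value) where T : unmanaged
        => MemoryMarshal.AsBytes(MemoryMarshal.CreateSpan(ref value, 1)).ToArray();

    // Reinterprets the leading bytes as a T (throws if the span is too short).
    public static T Deserialize<T>(ReadOnlySpan<byte> bytes) where T : unmanaged
        => MemoryMarshal.Read<T>(bytes);

    // Reads record number `index` from a buffer of back-to-back records.
    // Because every record has the same size, the offset is a multiplication.
    public static unsafe T ReadAt<T>(ReadOnlySpan<byte> records, int index) where T : unmanaged
        => MemoryMarshal.Read<T>(records.Slice(index * sizeof(T), sizeof(T)));
}
```

ReadAt shows why a fixed-size layout makes it cheap to jump to an arbitrary record: no index structure is needed. The bytes reflect the platform's layout and endianness, so the sketch assumes reader and writer agree on both.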
Furthermore because the data is stored in an identical structure to its managed memory layout we can use managed code (via pointers, but we could use ref more now since it's been extended) to update the data because the layout is identical.
A key cost in high performance system like trading systems and so on is needlessly moving data, the less data you move and the faster you can move it the better.
That's what confuses me about your approach: you are needlessly moving data, when compared with simple string fields:
Not really, most of the work is updates and most of it to non-string fields.
- When you write a property, you copy the whole string, instead of just a pointer.
- When you read a property, you always allocate the string (which includes a copy), instead of allocating it only once at deserialization. (And you could probably do even better if you used Span<char> instead.)
- When you copy the struct, you copy all the strings, instead of just a few pointers.
This is true but as I've said earlier we don't update the "string" stuff much at all, these may be part of a lookup key or data that's used when reports are pulled for example. But 85% of the work is perhaps updating primitive numeric fields and 15% perhaps writing new items both of which operations are very fast.
Bear in mind that the datastore is part of the update service's (a Windows service) address space but not part of the AppDomain; this is a specialized datastore technology (with much of it written in C and Win32 as a native API), and without knowing that, some of what I've said in this thread may not appear to make huge sense.
@Korporal
Check this proposal, which would add int parameter(s) to generics: #749
If that proposal gets implemented, you could do something like this:
struct ValueString<CH, const int SZ> {
fixed CH chars[SZ]; // fixed size inline array with SZ elements of type CH
// misc functions, properties, operators, ...
}
ValueString<char, 16> MyStringU16; // fixed size string-like value type with 16 Unicode chars
ValueString<byte, 64> MyStringA64; // fixed size string-like value type with 64 Ansi/Ascii chars
To reduce code bloat, the functions of ValueString could be implemented in an inner private empty (field-less) static class / struct. These inner helper functions would take a Span<> (which contains size and address of the fixed array) as parameter. The functions of the outer ValueString struct would be simple (and therefore maybe inline-able) wrappers around the inner work functions ...
Note, that proposal 749 is not limited to chars, bytes, strings, one-dimensional arrays, fixed arrays ...
@Korporal
Check this proposal, which would add int parameter(s) to generics: #749
If that proposal gets implemented, you could do something like this:
struct ValueString<CH, const int SZ> {
fixed CH chars[SZ]; // fixed size inline array with SZ elements of type CH
// misc functions, properties, operators, ...
}
ValueString<char, 16> MyStringU16; // fixed size string-like value type with 16 Unicode chars
ValueString<byte, 64> MyStringA64; // fixed size string-like value type with 64 Ansi/Ascii chars
To reduce code bloat, the functions of ValueString could be implemented in an inner private empty (field-less) static class / struct. These inner helper functions would take a Span<> (which contains size and address of the fixed array) as parameter. The functions of the outer ValueString struct would be simple (and therefore maybe inline-able) wrappers around the inner work functions ...
Note, that proposal 749 is not limited to chars, bytes, strings, one-dimensional arrays, fixed arrays ...
@MillKaDe - Good lord, how did I miss that (I think someone else mentioned it and I glossed over it - inexcusable).
You are absolutely right, that is exactly what's called for. I think this would work for me, very glad you mentioned this!
Thanks
If you think that LoginMessage2 is "better" than LoginMessage then we're at an impasse and I cannot force you to adjust your view.
I definitely think it's better. It's something you can do today. It uses the well-supported and understood 'Span/ReadOnlySpan' types. It's really simple (though it does require some boilerplate in a few places). It will interoperate with the rest of the high-perf, low-overhead side of C#/.NET.
Creating something new for this niche case seems pretty objectively worse. It would take years to get it. Would likely need an entirely new way of working with it. Would have to have a design around how it could work in the ref/span world, etc. etc.
What I find interesting (and this is not a criticism of anyone, the team or the language) is that something that seems on the surface straightforward actually presents such big challenge.
You're proposing something that wants to introduce a very different programming model than the one that C# has had since 1.0, while also interoperating seamlessly with 20+ years of existing APIs. That's non-trivial.
It's equivalent to me coming to Rust and asking it to have a totally different ownership model than what it has today. Or going to C++ and wanting lexical scoping to work entirely differently. It may be 'something that seems on the surface straightforward', but it only is that way because it can ignore the deep design decisions and history involved here.
@CyrusNajmabadi - All I can say in response to your most recent remarks is that it seems to me you've ultimately designed yourselves into a corner. If inline string data types cannot be supported (and this is a rather trivial concept just look at strings in Pascal or PL/1) without the Herculean effort you claim, then that has to tell us something about how you've all designed this.
I can see now why you've been so critical, it's not that what I asked for is some huge piece of functionality, it's because your design and model is too restrictive, too inflexible.
@Korporal
If inline string data types cannot be supported (and this is a rather trivial concept just look at strings in Pascal or PL/1) without the Herculean effort you claim, then that has to tell us something about how you've all designed this.
Indeed it does. It tells us that C# is a safe, garbage collected language without an ownership model.
You might as well say that if Prolog cannot support object oriented programming without herculean effort then that has to tell us something about how they've all designed it.
This is how the .Net programming model works. End of story. If you need to do something the programming model doesn't support, use a different language.
@YairHalberstadt @CyrusNajmabadi - These analogies don't really help nor do I regard them as valid to be frank. Creating a supposed analogy (make Prolog more OO or change the scoping rules in C++) and then discrediting the analogy is referred to as a strawman argument in philosophy and logic, it has no place in a serious technical discussion.
@Korporal
If inline string data types cannot be supported
Span<char> inline = stackalloc char[100];
And it seems that there may be interest in treating fixed buffers as spans, which eliminates some boilerplate as you can use them in an expanding ecosystem of APIs.
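To make the stackalloc suggestion a bit more concrete, here is a small sketch of building text in a stack buffer using only span-based APIs that already exist. The class and method names (StackBufferDemo, BuildTag) and the 64-char cap are made up for illustration, and it assumes a runtime with int.TryFormat (.NET Core 2.1 or later).

```csharp
using System;

public static class StackBufferDemo // hypothetical name
{
    // Formats "<user>#<id>" in a stack buffer, allocating only the final string.
    public static string BuildTag(string user, int id)
    {
        Span<char> buffer = stackalloc char[100];

        // Copy at most 64 chars of the name so the number below always fits.
        ReadOnlySpan<char> name = user.AsSpan();
        if (name.Length > 64)
        {
            name = name.Slice(0, 64);
        }
        name.CopyTo(buffer);
        int pos = name.Length;

        buffer[pos++] = '#';

        // int.TryFormat writes the digits straight into the remaining span.
        if (id.TryFormat(buffer.Slice(pos), out int written))
        {
            pos += written;
        }

        // Span<char>.ToString() materializes the used portion as a string.
        return buffer.Slice(0, pos).ToString();
    }
}
```

The buffer lives only for the duration of the call, which is the restriction being debated here: a stackalloc span cannot be stored in a field of an ordinary struct or class.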
These analogies don't really help nor do I regard them as valid to be frank.
Every language is a tradeoff of different philosophical concerns. Languages that allow arbitrary stack allocation and reinterpretation are inherently much less safe than C#, especially if they don't have an ownership model. C# and the CLR never have to concern themselves with whether or not the memory backing a string has gone out of scope. This is why the compiler is so strict when it comes to ref locals/returns.
It is not a strawman argument. You're arguing that something which is easy in a language with a completely different programming model is difficult in C#. Hence C# is badly designed.
We're pointing out that this is an obviously nonsensical argument, and giving some examples of the sort of nonsense conclusions you would come to if you applied that argument.
@Korporal
If inline string data types cannot be supported
Span<char> inline = stackalloc char[100];
And it seems that there may be interest in treating fixed buffers as spans, which eliminates some boilerplate as you can use them in an expanding ecosystem of APIs.
These analogies don't really help nor do I regard them as valid to be frank.
Every language is a tradeoff of different philosophical concerns. Languages that allow arbitrary stack allocation and reinterpretation are inherently much less safe than C#, especially if they don't have an ownership model. C# and the CLR never have to concern themselves with whether or not the memory backing a string has gone out of scope. This is why the compiler is so strict when it comes to ref locals/returns.
Clearly there is no prospect of getting what I sought and that's fine, if the experts see this as a huge challenge then I respect that. But I never asked for arbitrary stack allocation or reinterpretation! What I did seek was a mutable, fixed-capacity, string-like value type which could be used in struct fields in much the same way as primitive types or fixed buffers.
Strings in C# are perceived as buffers with an (to all intents and purposes) unlimited capacity and for this reason cannot be stored inline as primitive types are. I'm proposing that consideration be given to introducing an additional string type which has a capacity declared at runtime, and thus a maximum possible length.
This then makes it possible to define classes or structs which contain strings yet have these strings appear inline, within the datum's memory, much as primitive types are.
This is a problem that came up in a sophisticated very high performance client server design in which we got huge benefits by being able to define fixed length messages that contained strings. In our case we simulated fixed capacity strings as properties that encapsulated fixed buffers (char or byte). This worked well but was messy because the language offers no way for us to 'pass' (at compile time) a length into a fixed buffer declaration, one must actually declare the fixed buffer explicitly with a constant.
As a result we created a huge family of types named like this: ANativeString_64 and UNativeString_128 (ANSI and Unicode variants) and so on. As I say, this worked but was messy.
Each type was a pure struct (as in the new generic constraint 'unmanaged'), so when used as member fields in other structs they left the containing struct pure, giving us contiguous chunks of memory that contained strings.
As I say this worked very well but was messy under the hood and challenging to maintain.
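For readers who haven't seen the pattern, here is a rough sketch of what one such property-over-fixed-buffer looks like. This is not our actual code; LoginPacket, SessionId and the 64-byte capacity are invented for illustration. It assumes unsafe code is enabled and a runtime with the span-based Encoding overloads (.NET Core 2.1 or later).

```csharp
using System;
using System.Text;

// Hypothetical message type: the name bytes live inline, so the struct stays 'unmanaged'.
public unsafe struct LoginPacket
{
    // The capacity has to be a literal constant here; it cannot be passed in from outside.
    public const int UserNameCapacity = 64;

    public int SessionId;
    private fixed byte _userName[UserNameCapacity];

    public string UserName
    {
        get
        {
            fixed (byte* p = _userName)
            {
                // The value ends at the first zero byte, or fills the whole buffer.
                var bytes = new ReadOnlySpan<byte>(p, UserNameCapacity);
                int end = bytes.IndexOf((byte)0);
                if (end < 0)
                {
                    end = UserNameCapacity;
                }
                return Encoding.ASCII.GetString(p, end);
            }
        }
        set
        {
            fixed (byte* p = _userName)
            {
                var destination = new Span<byte>(p, UserNameCapacity);
                destination.Clear(); // zero-fill so shorter values terminate cleanly

                string text = value ?? string.Empty;
                int count = Math.Min(text.Length, UserNameCapacity);
                // ASCII is one byte per char, so 'count' chars always fit.
                Encoding.ASCII.GetBytes(text.AsSpan(0, count), destination);
            }
        }
    }
}
```

Every other capacity or character width needs another copy of this with a different constant, which is where the family of near-identical types comes from.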
So could we consider a new primitive type:
string(64) user_name;
for example?
Such strings could be declared locally resulting in a simple stack allocated chunk, or as members within classes/structs in which case they appear inline just like fixed buffers do...
(just to be clear I'm not seeking the capacity to be defined at runtime but at compile time, and I know my syntax won't work but wanted to convey the idea).