dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

[API Proposal]: Add explicit conversion operator from `Rune` to `char` #91508

Open Neme12 opened 1 year ago

Neme12 commented 1 year ago

Background and motivation

The other day, I needed to convert a Rune to char when it fits within a single char. I tried using an explicit conversion, expecting that to work, but to my surprise, such a conversion doesn't exist. I later found out that this can be achieved by (char)rune.Value, but this isn't obvious, and I think the explicit conversion should exist given that a conversion from char to Rune exists as well. If it works in one direction, I think it should work in the other direction as well, so this is what I'm proposing.
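To illustrate why the `(char)rune.Value` workaround is not obvious, here is a minimal sketch. The class and method names are hypothetical; only `Rune.Value` and `Rune.IsBmp` are existing members. It shows that the raw cast succeeds even when the scalar does not fit in a `char`, so the caller has to check `IsBmp` separately.

```csharp
using System;
using System.Text;

public static class RuneToCharWorkaround
{
    // The workaround from the issue, with `unchecked` spelled out:
    // any bits above the low 16 of the scalar value are discarded.
    public static char CastValue(Rune rune) => unchecked((char)rune.Value);

    // Rune.IsBmp is true when the scalar fits in a single UTF-16 code unit.
    public static bool FitsInChar(Rune rune) => rune.IsBmp;
}
```

For `new Rune(0x1F600)`, `FitsInChar` returns `false`, yet `CastValue` still returns a value (`'\uF600'`) rather than failing.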

API Proposal

Because conversions from int and uint already exist, I added the reverse conversions to the proposal for symmetry. They could be implicit, since they always succeed, but I made them explicit: implicit conversions invite bugs, as the existing implicit conversion from char to int often does, so it's better to make the developer spell the conversion out.

 namespace System.Text;

 public readonly partial struct Rune
 {
     // Existing conversions:
     public static explicit operator Rune(char ch);
     public static explicit operator Rune(int value);
     [CLSCompliant(false)]
     public static explicit operator Rune(uint value);

     // Proposed conversions:
+    public static explicit operator char(Rune rune);
+    public static explicit operator int(Rune rune);
+    [CLSCompliant(false)]
+    public static explicit operator uint(Rune rune);
 }
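As one illustration of the kind of bug the implicit char-to-int conversion can cause, consider this small sketch (the class and method names are hypothetical). It passes a `char` to a `StringBuilder` constructor, which compiles without complaint but does not do what it appears to do.

```csharp
using System.Text;

public static class ImplicitCharToIntPitfall
{
    public static int LengthAfterCharArgument()
    {
        // StringBuilder has no constructor taking a char, so 'x' is implicitly
        // converted to the int 120 and binds to StringBuilder(int capacity).
        var sb = new StringBuilder('x');

        // The builder is empty; 'x' only set the initial capacity.
        return sb.Length;
    }
}
```

The method returns 0 because the character was consumed as a capacity rather than as content. An explicit-only conversion would have forced the author to write the cast and notice the mistake.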

API Usage

// Fancy the value
var ch = (char)rune;

Alternative Designs

No response

Risks

No response

alexrp commented 1 year ago

Do you envision the Rune to char operator throwing if the Unicode scalar is out of range for char?

Neme12 commented 1 year ago

@alexrp Yes, of course. That's how explicit casting usually works - if the value can't be converted to that type, an InvalidCastException is thrown. Just like the existing cast from char to Rune throws if the char is a surrogate or invalid Unicode scalar.
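The semantics described here can be approximated today with a helper. This is only a sketch under stated assumptions: `ToCharOrThrow` is a hypothetical name, and `InvalidCastException` is used because this comment names it. The type an actual operator would throw is not settled in this thread, and the documentation for the existing char-to-Rune conversion lists `ArgumentOutOfRangeException`.

```csharp
using System;
using System.Text;

public static class RuneCharConversion
{
    // Hypothetical helper, not part of System.Text.Rune.
    // Returns the single UTF-16 code unit for a BMP scalar and throws otherwise.
    public static char ToCharOrThrow(Rune rune)
    {
        if (!rune.IsBmp)
        {
            // Exception type is an assumption taken from the comment above.
            throw new InvalidCastException(
                $"U+{rune.Value:X4} does not fit in a single char.");
        }

        return (char)rune.Value;
    }
}
```

With this helper, `RuneCharConversion.ToCharOrThrow(new Rune('A'))` returns `'A'`, while passing `new Rune(0x1F600)` throws instead of silently truncating.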

alexrp commented 1 year ago

Sounds fine. I can see the usefulness of the Rune to char operator with those semantics.

I'm personally less convinced about the int/uint operators. But I also happen to think that the int/uint to Rune operators probably shouldn't have been added in the first place, and instead only the Rune(int) constructor should have existed... so maybe that's just my bias.

Neme12 commented 1 year ago

I don't really need those, but I added them to the proposal for symmetry since the opposite conversions exist as well. I think conversions should generally be symmetrical whenever possible - if you can convert A to B, you should be able to convert B back to A. (This is also probably why I tried casting to char and expected that to Just Work™, because I knew that the opposite conversion exists.)

DaZombieKiller commented 1 year ago

I don't feel like the int/uint conversion operators are a good idea, nothing about an int/uint explicitly says "this is a character/codepoint" unlike char. Doing that conversion via rune.Value is unambiguous and is something you can already do today.

Neme12 commented 1 year ago

I don't really need those, but I added them to the proposal for symmetry since the opposite conversions exist as well. I think conversions should generally be symmetrical whenever possible - if you can convert A to B, you should be able to convert B back to A. (This is also probably why I tried casting to char and expected that to Just Work™, because I knew that the opposite conversion exists.)

(That said, I don't feel strongly about that. I'm mainly interested in the char one).

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-system-runtime
See info in area-owners.md if you want to be subscribed.

Issue Details
Author: Neme12
Assignees: -
Labels: `api-suggestion`, `area-System.Runtime`, `untriaged`
Milestone: -