dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Add snake_case support for System.Text.Json #782

Closed hez2010 closed 2 years ago

hez2010 commented 5 years ago

Contents below were taken from @khellang, thanks!

API Proposal

Add additional casing support for System.Text.Json, based on this comment.

namespace System.Text.Json;

public partial class JsonNamingPolicy
{
    public static JsonNamingPolicy CamelCase { get; }
+    public static JsonNamingPolicy SnakeLowerCase { get; }
+    public static JsonNamingPolicy SnakeUpperCase { get; }
+    public static JsonNamingPolicy KebabLowerCase { get; }
+    public static JsonNamingPolicy KebabUpperCase { get; }
}

public enum JsonKnownNamingPolicy
{
    Unspecified = 0,
    CamelCase = 1,
+    SnakeLowerCase = 2,
+    SnakeUpperCase = 3,
+    KebabLowerCase = 4,
+    KebabUpperCase = 5,
}

Behavior Proposal

I propose the same behavior as Newtonsoft.Json, just like the existing camel case behavior. The implementation is here and the tests are here.

Other Comments

I think snake case naming is pretty common, especially in the Ruby world, and should be supported out of the box. GitHub's API is probably the most popular example using this naming convention off the top of my head.

Implements

See: dotnet/corefx#40003

ericstj commented 3 years ago

Adding a couple others with text expertise: @GrabYourPitchforks @tarekgh @eiriktsarpalis

The project uses Unicode text segmentation, which allows it to handle almost any case (no-space languages like Japanese might be an issue, even though it works according to Annex 29), but it looks too heavy. My thought is to drop the dependency on the Unicode segmentation library and ignore any non-letter or non-digit characters. These characters should be treated as word

That sounds reasonable to me, but I'm not an expert. Perhaps when properties are expressed in no-space languages it's undesirable to insert _s even when applying this policy, so that would make it reasonable to not find word boundaries in no-space languages. I'm not sure how many other cases are covered by that Unicode text segmentation (tr29) that would be interesting in the constrained value space of CLI-metadata-identifiers (defined in ECMA-335 1.8.5.1 as https://www.unicode.org/reports/tr15/tr15-18.html#Programming%20Language%20Identifiers) + JSON property names. Perhaps after applying all constraints then only a smaller set of things needs to be considered?

@YohDeadfall are you still looking for feedback on all the questions raised here? https://github.com/dotnet/corefx/pull/41354#issuecomment-554958476

GrabYourPitchforks commented 3 years ago

Generally speaking, I think the lowercase / uppercase transitions are an appropriate insertion point. There are some exceptions like proper nouns ("iPhone" -> "iphone", not "i_phone") and cases which have single-letter words like 'I' and 'a' ("DoILikeIceCream" -> "do_i_like_ice_cream"). But those should be rare enough and can be overridden with an attribute at the property.
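
The per-property override mentioned above can be sketched with the existing [JsonPropertyName] attribute, which takes precedence over whatever naming policy is configured. The Device type and its property names are hypothetical, and JsonNamingPolicy.SnakeCaseLower assumes a runtime that ships the policies approved later in this thread (.NET 8 or later, as far as I know).

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical model, used only for illustration.
public sealed class Device
{
    // A casing-based policy would likely split the leading "I" from "Phone";
    // the attribute pins the exact JSON name instead.
    [JsonPropertyName("iphone_model")]
    public string IPhoneModel { get; set; } = "";

    // No attribute here, so the configured policy decides the name.
    public int StorageGb { get; set; }
}

public static class AttributeOverrideDemo
{
    public static string Serialize(Device device)
    {
        var options = new JsonSerializerOptions
        {
            // Assumption: SnakeCaseLower is available (.NET 8 or later).
            PropertyNamingPolicy = JsonNamingPolicy.SnakeCaseLower,
        };
        return JsonSerializer.Serialize(device, options);
    }
}
```

Only the rare exceptions need the attribute; every other property still goes through the policy.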

YohDeadfall commented 3 years ago

@YohDeadfall are you still looking for feedback on all the questions raised here?

The implementation covers the Unicode-related part but not the last three questions; I tried my best to align the results with the existing approaches.

There are some exceptions like proper nouns ("iPhone" -> "iphone", not "i_phone") and cases which have single-letter words like 'I' and 'a' ("DoILikeIceCream" -> "do_i_like_ice_cream").

iPhone really is an exception, but DoILikeIceCream will be translated exactly as you mentioned, as do_i_like_ice_cream. If a lowercase letter follows a sequence of uppercase letters, then the last uppercase letter belongs to the word with the lowercase letter. That's how XMLHttpRequest gets translated to xml_http_request.
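
A minimal sketch of the boundary rule described above, written as a custom JsonNamingPolicy. This is not the implementation from the pull request or from Yoh.Text.Json.NamingPolicies; the class name SeparatorNamingPolicySketch, its constructor parameters, and the handling of digits and existing delimiters are assumptions made for illustration.

```csharp
using System.Text;
using System.Text.Json;

// Hypothetical policy: splits on case transitions and joins words with a separator.
public sealed class SeparatorNamingPolicySketch : JsonNamingPolicy
{
    private readonly char _separator;
    private readonly bool _upper;

    public SeparatorNamingPolicySketch(char separator, bool upper = false)
    {
        _separator = separator;
        _upper = upper;
    }

    public override string ConvertName(string name)
    {
        var sb = new StringBuilder(name.Length + 4);
        bool pendingBreak = false;

        for (int i = 0; i < name.Length; i++)
        {
            char c = name[i];

            if (!char.IsLetterOrDigit(c))
            {
                // Assumption: any non-letter, non-digit character ends the current word.
                pendingBreak = true;
                continue;
            }

            if (char.IsUpper(c) && i > 0)
            {
                char prev = name[i - 1];
                // lower/digit -> upper transition, e.g. "Http|Request".
                bool afterLowerOrDigit = char.IsLower(prev) || char.IsDigit(prev);
                // Last letter of an uppercase run followed by lowercase, e.g. "XML|Http".
                bool endsUpperRun = char.IsUpper(prev)
                    && i + 1 < name.Length
                    && char.IsLower(name[i + 1]);
                if (afterLowerOrDigit || endsUpperRun) pendingBreak = true;
            }

            // Never emit a leading separator.
            if (pendingBreak && sb.Length > 0) sb.Append(_separator);
            pendingBreak = false;

            sb.Append(_upper ? char.ToUpperInvariant(c) : char.ToLowerInvariant(c));
        }

        return sb.ToString();
    }
}
```

With new SeparatorNamingPolicySketch('_'), XMLHttpRequest becomes xml_http_request and DoILikeIceCream becomes do_i_like_ice_cream, while iPhone becomes i_phone, which is the proper-noun exception discussed above.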

GrabYourPitchforks commented 3 years ago

Fine, then we'll do an example from French. :) "IlYAUnChat" should go to "il_y_a_un_chat", not "il_ya_un_chat".

ForNeVeR commented 3 years ago

I'd suggest doing what JSON.NET does, and that's it. In my experience, its behavior is good enough for most practical purposes.

YohDeadfall commented 3 years ago

Since I feel burned out and have not been active lately, I think it would be better to remove my assignment to allow others to take this and finish what's done. Currently there's a prototype I mentioned earlier, from which Unicode segmentation should be removed after checking that the set of characters used for names in the CLR isn't handled by segmentation. There are tests already too, so it shouldn't be too hard to work on.

@huoyaoyuan, previously you tried to work on this issue. If you're still interested please take it.

layomia commented 3 years ago

Triage - we didn't get around to this in .NET 6.0, but there remains an easy workaround of implementing a custom naming policy with the desired behavior. Moving to .NET 7.

John0King commented 3 years ago

A suggestion for snake_case:

ABC_DEF => abc_def
aBcDeF_gHIJ => abcdef_ghij
Abc_Def => abc_def
_Abc_Def => abc_def // not sure this is expected
Abc-Def => abc_def
abc def => abc_def

Basically, it stops applying case-based splitting if the name contains one of the following delimiters: _, -, or <space>, and it removes any leading delimiter where possible.

May I ask why this existing feature has been moved to 7.0.0 (pushed back by two years)?
As for the question of how many users use snake case: almost 40% of users/companies in China still use snake_case JSON, for example Alibaba, Tencent, and a large group of PHP web services.
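
The delimiter rule suggested in this comment could be sketched as below. The names DelimiterAwareSnakeCase and TryConvert are hypothetical, and collapsing a run of delimiters into a single underscore is an assumption that the comment does not spell out.

```csharp
using System;
using System.Text;

// Hypothetical helper illustrating the suggested rule; not part of System.Text.Json.
public static class DelimiterAwareSnakeCase
{
    private static readonly char[] Delimiters = { '_', '-', ' ' };

    // Returns false when the name has no delimiter, so the caller can
    // fall back to casing-based word splitting.
    public static bool TryConvert(string name, out string result)
    {
        if (name.IndexOfAny(Delimiters) < 0)
        {
            result = name;
            return false;
        }

        // Drop leading delimiters, as in "_Abc_Def => abc_def".
        string trimmed = name.TrimStart(Delimiters);
        var sb = new StringBuilder(trimmed.Length);

        foreach (char c in trimmed)
        {
            if (Array.IndexOf(Delimiters, c) >= 0)
            {
                // Assumption: consecutive delimiters collapse into one underscore.
                if (sb.Length > 0 && sb[sb.Length - 1] != '_') sb.Append('_');
            }
            else
            {
                // Casing is ignored once a delimiter is present; just lowercase.
                sb.Append(char.ToLowerInvariant(c));
            }
        }

        result = sb.ToString();
        return true;
    }
}
```

Under these assumptions, aBcDeF_gHIJ becomes abcdef_ghij and Abc-Def becomes abc_def, matching the examples listed in the comment.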

LukeTOBrien commented 3 years ago

I have come here after searching for a solution for my project.
I am also following the issue: https://github.com/dotnet/runtime/issues/29975

My issue is that the third party API that I am consuming uses mixed naming policy, some of the JSON properties are snake_case and some are camelCase.
Is there a way to map .NET property names to different policies, i.e. a Func<string, JsonNamingPolicy>?

options = new JsonSerializerOptions
{
     PropertyNamingPolicy = (propertyName) => propertyName == nameof(SnakeCaseProperty) ? JsonNamingPolicy.SnakeCase : JsonNamingPolicy.CamelCase,
};

eiriktsarpalis commented 3 years ago

@LukeTOBrien it should be possible to achieve by inheriting the JsonNamingPolicy class. In pseudocode:

public override string ConvertName(string name) => name == nameof(SnakeCaseProperty) ? JsonNamingPolicy.SnakeCase.ConvertName(name) : JsonNamingPolicy.CamelCase.ConvertName(name);
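
A fuller sketch of the same idea, assuming a runtime where JsonNamingPolicy.SnakeCaseLower exists (.NET 8 or later, as far as I know). PerPropertyNamingPolicy, ApiPayload, and MixedNamingDemo are hypothetical names introduced for illustration.

```csharp
using System;
using System.Text.Json;

// Hypothetical policy that delegates to another policy chosen per property name.
public sealed class PerPropertyNamingPolicy : JsonNamingPolicy
{
    private readonly Func<string, JsonNamingPolicy> _selector;

    public PerPropertyNamingPolicy(Func<string, JsonNamingPolicy> selector)
        => _selector = selector ?? throw new ArgumentNullException(nameof(selector));

    public override string ConvertName(string name) => _selector(name).ConvertName(name);
}

// Hypothetical payload type for a third-party API with mixed conventions.
public sealed class ApiPayload
{
    public string SnakeCaseProperty { get; set; } = "";
    public string CamelCaseProperty { get; set; } = "";
}

public static class MixedNamingDemo
{
    public static JsonSerializerOptions CreateOptions() => new JsonSerializerOptions
    {
        PropertyNamingPolicy = new PerPropertyNamingPolicy(name =>
            name == nameof(ApiPayload.SnakeCaseProperty)
                ? JsonNamingPolicy.SnakeCaseLower // assumption: .NET 8 or later
                : JsonNamingPolicy.CamelCase),
    };
}
```

ConvertName only receives the CLR property name, not the declaring type, so a selector like this cannot tell apart two properties with the same name on different types. For a handful of properties, [JsonPropertyName] on the exceptions may be simpler.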

benfoster commented 3 years ago

For the interim, since I got fed up of copying the provided implementation around, I've published the policy to https://www.nuget.org/packages/O9d.Json.Formatting.

FiniteReality commented 3 years ago

@layomia Any updates on this? I can rebase my branch onto main and open a new PR if that's ideal.

GiorgioG commented 3 years ago

Stop dragging your feet Microsoft. Support a well-established naming policy out of the box.

huoyaoyuan commented 3 years ago

It can simply throw on any questionable name to avoid future breaking change. At least we need support for most common scenarios (well-defined ASCII).

thomas-darling commented 2 years ago

Yeah, we really need this too, but for kebab case, i.e. hello-world.

The entire world does not follow the same naming conventions, and systems need to communicate, so this absolutely needs to be customizable - and quite frankly, I'm pretty shocked that something this basic isn't already supported.

Regarding the statements in https://github.com/dotnet/runtime/issues/782#issuecomment-553723786 about the lack of precedent for kebab case support:

It doesn't matter what was supported in Newtonsoft JSON - it matters what is needed in the real world. The fact that we were forced to waste time writing custom converters before, does not justify leaving out the support now. Implementing support for snake case, while telling developers to implement custom converters if they need kebab case, would in my opinion be quite disappointing - the difference is literally just which separator character to use.


Edit: Oops, sorry, came here from https://github.com/dotnet/runtime/issues/31619, and somehow missed that this issue is not just related to Enums. Using kebab case for property names would definitely be uncommon - but it's not that uncommon to see Enum values using it.
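
For the enum scenario, a naming policy can be passed to JsonStringEnumConverter. The sketch below assumes a runtime where JsonNamingPolicy.KebabCaseLower exists (.NET 8 or later, as far as I know); on earlier versions a custom policy would have to be passed instead. OrderStatus and KebabEnumDemo are hypothetical names.

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical enum, used only for illustration.
public enum OrderStatus { Pending, InProgress, Shipped }

public static class KebabEnumDemo
{
    public static string Serialize(OrderStatus status)
    {
        var options = new JsonSerializerOptions();
        // The converter writes enum values as strings and applies the
        // given naming policy to each value name.
        options.Converters.Add(new JsonStringEnumConverter(JsonNamingPolicy.KebabCaseLower));
        return JsonSerializer.Serialize(status, options);
    }
}
```

Here KebabEnumDemo.Serialize(OrderStatus.InProgress) produces the JSON string "in-progress".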

khellang commented 2 years ago

@YohDeadfall Are you up for this again? We really need to get this into .NET 7.

IMO, this should base its implementation on a common word-splitting algorithm that future conventions (not covered in this API proposal), like kebab-case, as mentioned in https://github.com/dotnet/corefx/pull/41354#issuecomment-554965858, should use. There's also an opinion that the snake_case behavior should be compatible with Newtonsoft.Json (which I think is a mistake, given its bugs).

Anyway, that's the main question to get answered before an implementation can happen.

YohDeadfall commented 2 years ago

Hi, @khellang! I made Yoh.Text.Json.NamingPolicies a while ago as research before burning out, and it was published on NuGet. It uses word segmentation as specified by Unicode and then splits each word into parts by analyzing character casing and kind, as you can see here. The first part can be removed to improve performance since we need to support property names only. So if you're okay with the cases listed in the tests, I can move the source code to .NET.

deeprobin commented 2 years ago

@khellang Are you okay with the test cases listed by @YohDeadfall?

khellang commented 2 years ago

Yeah, I think they match my expectations 👍🏻

jeffhandley commented 2 years ago

I'm updating this feature's milestone to Future. We understand snake_case, kebab-case, and other common naming conventions are very important and that it's disappointing we don't have it implemented yet. But we want to set expectations appropriately that it is not likely to make it into .NET 7.

Please refer to #63762 to see our revised set of System.Text.Json work planned for .NET 7. We are also doing a lot of incremental refactoring during .NET 7 that makes it easier for us to add features like this one and ensure developers have a more consistent experience when using System.Text.Json #63918. The list of PRs in System.Text.Json illustrates the progress we've been making on that effort.

YohDeadfall commented 2 years ago

I updated my project to exclude text segmentation, and the existing tests pass. So I think I will try to push the changes to System.Text.Json soon.

teo-tsirpanis commented 2 years ago

@dotnet/area-system-text-json, @terrajobst this API is marked as approved but I can't find an API review video, or an entry in dotnet/apireviews. Did a mistake happen?

YohDeadfall commented 2 years ago

Hey there, I just opened PR #69613 adding new policies to the libraries, but to complete it I need another round of API review. As @sveinfid mentioned, kebab case is widely used too and is worth including for better interoperability. My implementation is simple, so it shares the same code for the snake and kebab conventions, and in addition it allows upper and lower casing.

namespace System.Text.Json
{
    public abstract class JsonNamingPolicy
    {
        protected JsonNamingPolicy() { }
        public static JsonNamingPolicy CamelCase { get { throw null; } }
+       public static JsonNamingPolicy SnakeLowerCase { get { throw null; } }
+       public static JsonNamingPolicy SnakeUpperCase { get { throw null; } }
+       public static JsonNamingPolicy KebabLowerCase { get { throw null; } }
+       public static JsonNamingPolicy KebabUpperCase { get { throw null; } }
        internal static JsonNamingPolicy Default { get { throw null; } }
        public abstract string ConvertName(string name);
    }
}

Names are chosen to explain which separator and which casing are used, and to keep the set symmetrical. While upper-case conventions are much less popular, they still exist, so I included them too for the reason mentioned above. It puts the .NET serializer on par with serializers for other platforms, though those ship as separate packages. Usually, all possibilities are shipped in serializers or name-converting packages to make them full-featured and to help the user by providing a single dependency maintained by a single owner, so the same set of rules applies to all policies.

In case this is still too much for the review team, it can be shortened to:

namespace System.Text.Json
{
    public abstract class JsonNamingPolicy
    {
        protected JsonNamingPolicy() { }
        public static JsonNamingPolicy CamelCase { get { throw null; } }
+       public static JsonNamingPolicy SnakeCase { get { throw null; } }
+       public static JsonNamingPolicy KebabCase { get { throw null; } }
        internal static JsonNamingPolicy Default { get { throw null; } }
        public abstract string ConvertName(string name);
    }
}

Then the snake and kebab policies will be lower case, and upper-case variants with Shouting in the name can be added in the future.

/cc @terrajobst @jeffhandley

terrajobst commented 2 years ago

@teo-tsirpanis

@dotnet/area-system-text-json, @terrajobst this API is marked as approved but I can't find an API review video, or an entry in dotnet/apireviews. Did a mistake happen?

Likely. There are rare cases where we approve APIs offline via email, in a pinch. Those don't go via GitHub/YouTube and hence aren't picked up by my tool that archives them in dotnet/apireviews.

Any particular questions you're trying to answer?

terrajobst commented 2 years ago

@dotnet/area-system-text-json based on the comments from @teo-tsirpanis and the latest change request from @YohDeadfall I have pushed the API back to api-ready-for-review.

teo-tsirpanis commented 2 years ago

Any particular questions you're trying to answer?

My question was what the approved API shape was. Well, forget it, I found https://github.com/dotnet/runtime/issues/782#issuecomment-532341171 (😳), but it might still be useful to re-review it for the additional casings.

JsonKnownNamingPolicy might also need to be updated.

terrajobst commented 2 years ago

Well, forget it, I found #782 (comment) (😳),

Ah yes, that's why I didn't find the notes -- we approved it but didn't apply the label. Oops. But yeah, we'll take another quick look tomorrow.

OnamChilwan commented 2 years ago

Is there any update on this, when can we expect this to be added? I am targeting .NET 6 and can only see CamelCase as an option..

YohDeadfall commented 2 years ago

@OnamChilwan, you can currently use the package mentioned in https://github.com/dotnet/runtime/issues/782#issuecomment-1018535660. The pull request I made uses its code with style changes only.

@terrajobst, by what date must the pull request be merged to appear in .NET 7? If the deadline is too close and there is no time left to approve the public API changes, we can go with option 2. It allows trimming the PR down to SnakeCase only and adding the other options later, but with Shouting in the names (less preferred naming, in my opinion).

J0rgeSerran0 commented 2 years ago

@OnamChilwan you can also use this code I wrote 3 years ago. It has come up in some threads about this issue, and I shared it for people who are looking for a solution to this.

https://github.com/J0rgeSerran0/JsonNamingPolicy

wardboumans commented 2 years ago

Incredible this is still not in .NET7. This is why people still stick to Newtonsoft.

GiorgioG commented 2 years ago

This is pathetic. I can't advocate for System.Text.Json anymore if it's going to remain a worse option than Newtonsoft.

georgiosd commented 2 years ago

Really @wardboumans @GiorgioG? Using a custom naming policy that already exists out there is that despicable that you'd drop the library? Wow.

bartonjs commented 2 years ago

Video

namespace System.Text.Json;

public partial class JsonNamingPolicy
{
    public static JsonNamingPolicy SnakeCaseLower { get; }
    public static JsonNamingPolicy SnakeCaseUpper { get; }
    public static JsonNamingPolicy KebabCaseLower { get; }
    public static JsonNamingPolicy KebabCaseUpper { get; }
}

public enum JsonKnownNamingPolicy
{
    SnakeCaseLower = 2,
    SnakeCaseUpper = 3,
    KebabCaseLower = 4,
    KebabCaseUpper = 5,
}

YohDeadfall commented 2 years ago

Cool, thanks! Then I will align the pull request according to it and address requests.

GiorgioG commented 2 years ago

@georgiosd Yes, really. I can't tell you how many times I've advocated for using System.Text.Json and then had to defend these kinds of basic omissions. JSON.NET has had this since at least 2016: https://github.com/JamesNK/Newtonsoft.Json/commits/master/Src/Newtonsoft.Json/Serialization/SnakeCaseNamingStrategy.cs

georgiosd commented 2 years ago

Well, maybe they should not have optimized the framework and included a snake case naming strategy instead.

Really, if you're deciding on the basis of having to write a naming policy implementation (that is what, 5 lines?) and not the overall improvements perhaps you have bigger problems in the organization.

J0rgeSerran0 commented 2 years ago

Totally agree with @georgiosd

I thought that my implementation would be in use for about 6 months. 3 years later... I see the end of this discussion is near.

Speaking of naming: in my case, the classes are JsonKebabCaseNamingPolicy and JsonSnakeCaseNamingPolicy (KebabCasexxx) (SnakeCasexxx).

As you can see, what Georgios says is a fact