Closed hez2010 closed 2 years ago
Adding a couple others with text expertise: @GrabYourPitchforks @tarekgh @eiriktsarpalis
The project uses Unicode text segmentation, which handles almost any case (no-space languages like Japanese might be an issue even though it works according to Annex 29), but it looks too heavy. My thought is to drop the dependency on the Unicode segmentation library and ignore any non-letter or non-digit characters. These characters should be treated as word
That sounds reasonable to me, but I'm not an expert. Perhaps when properties are expressed in no-space languages it's undesirable to insert _ characters even when applying this policy, so that would make it reasonable to not find word boundaries in no-space languages. I'm not sure how many other cases are covered by that Unicode text segmentation (tr29) that would be interesting in the constrained value space of CLI-metadata-identifiers (defined in ECMA-335 1.8.5.1 as https://www.unicode.org/reports/tr15/tr15-18.html#Programming%20Language%20Identifiers) + JSON property names. Perhaps after applying all constraints only a smaller set of things needs to be considered?
@YohDeadfall are you still looking for feedback on all the questions raised here? https://github.com/dotnet/corefx/pull/41354#issuecomment-554958476
Generally speaking, I think the lowercase / uppercase transitions are an appropriate insertion point. There are some exceptions like proper nouns ("iPhone" -> "iphone", not "i_phone") and cases which have single-letter words like 'I' and 'a' ("DoILikeIceCream" -> "do_i_like_ice_cream"). But those should be rare enough and can be overridden with an attribute at the property.
@YohDeadfall are you still looking for feedback on all the questions raised here?
The implementation covers the Unicode-related part, but not the last three questions; I tried my best to align the results with the existing approaches.
There are some exceptions like proper nouns ("iPhone" -> "iphone", not "i_phone") and cases which have single-letter words like 'I' and 'a' ("DoILikeIceCream" -> "do_i_like_ice_cream").
iPhone is really an exception, but DoILikeIceCream will be translated exactly as you mentioned, as do_i_like_ice_cream. If there is a lowercase letter following a sequence of uppercase letters, then the last uppercase letter belongs to the word with the lowercase letter. That's how XMLHttpRequest gets translated to xml_http_request.
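To make the rule above concrete, here is a minimal sketch of a casing-based splitter. The class and method names (SnakeCaseSketch, ToSnakeCase) are hypothetical, and this is not the code from the prototype or from System.Text.Json; it only assumes the two rules described in this thread: a word starts at a lowercase-to-uppercase transition, and the last letter of an uppercase run starts a new word when a lowercase letter follows it.

```csharp
using System.Text;

// Hypothetical helper for illustration only; not the actual prototype code.
public static class SnakeCaseSketch
{
    public static string ToSnakeCase(string name)
    {
        var builder = new StringBuilder(name.Length + 4);

        for (int i = 0; i < name.Length; i++)
        {
            char current = name[i];

            if (i > 0 && char.IsUpper(current))
            {
                // Rule 1: lowercase -> uppercase transition starts a new word.
                bool afterLower = char.IsLower(name[i - 1]);

                // Rule 2: the last uppercase letter of an uppercase run
                // belongs to the following lowercase letters.
                bool endsUpperRun = char.IsUpper(name[i - 1])
                    && i + 1 < name.Length
                    && char.IsLower(name[i + 1]);

                if (afterLower || endsUpperRun)
                {
                    builder.Append('_');
                }
            }

            builder.Append(char.ToLowerInvariant(current));
        }

        return builder.ToString();
    }
}
```

With these two rules alone, "iPhone" becomes "i_phone" and "IlYAUnChat" becomes "il_ya_un_chat", which are exactly the kinds of exceptions raised in this discussion; a casing-only heuristic cannot tell single-letter words apart from acronyms.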
Fine, then we'll do an example from French. :) "IlYAUnChat" should go to "il_y_a_un_chat", not "il_ya_un_chat".
I'd suggest doing what JSON.NET does, and that's it. In my experience, its behavior is good enough for most practical purposes.
Since I feel burned out and not active lately, I think it would be better to remove my assignment to allow others to take this and finish what's done. Currently there's a prototype I mentioned earlier from which Unicode segmentation should be removed after checking that the set of characters used for names in CLR isn't handled by segmentation. There're tests already too, so it shouldn't be so hard to work on it.
@huoyaoyuan, previously you tried to work on this issue. If you're still interested please take it.
Triage - we didn't get around to this in .NET 6.0, but there remains an easy workaround of implementing a custom naming policy with the desired behavior. Moving to .NET 7.
A suggestion for snake_case:
ABC_DEF => abc_def
aBcDeF_gHIJ => abcdef_ghij
Abc_Def => abc_def
_Abc_Def => abc_def
// not sure this is expected
Abc-Def => abc_def
abc def => abc_def
Basically, it stops case-based handling if the name contains one of the following delimiters: _, - or <space>, and it removes any leading delimiters where possible.
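The delimiter rule suggested above could be sketched roughly as follows. DelimitedNameSketch and TryConvert are hypothetical names, and the behavior for inputs not listed in the suggestion (for example, consecutive delimiters, which are not collapsed here) is an assumption rather than something the comment specifies.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only.
public static class DelimitedNameSketch
{
    private static readonly char[] Delimiters = { '_', '-', ' ' };

    // Returns false when the name contains no delimiter, so the caller
    // can fall back to a casing-based conversion instead.
    public static bool TryConvert(string name, out string result)
    {
        result = name;

        if (name.IndexOfAny(Delimiters) < 0)
        {
            return false;
        }

        var builder = new StringBuilder(name.Length);

        // Leading delimiters are dropped; every other delimiter becomes '_'.
        foreach (char c in name.TrimStart(Delimiters))
        {
            bool isDelimiter = Array.IndexOf(Delimiters, c) >= 0;
            builder.Append(isDelimiter ? '_' : char.ToLowerInvariant(c));
        }

        result = builder.ToString();
        return true;
    }
}
```

Under this sketch the casing inside each delimited part is ignored, which is why "aBcDeF_gHIJ" maps to "abcdef_ghij" instead of being split at every uppercase letter.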
And may I ask why this feature has been moved to 7.0.0 (pushed back by 2 years)?
As for the question of how many users use snake case: almost 40% of users/companies in China still use snake_case JSON, for example Alibaba, Tencent, and a large group of PHP web services.
I have come here after searching for a solution for my project.
I am also following the issue: https://github.com/dotnet/runtime/issues/29975
My issue is that the third-party API that I am consuming uses a mixed naming policy: some of the JSON properties are snake_case and some are camelCase.
Is there a way to map .NET property names to use a different policy? i.e. a Func<string, JsonNamingPolicy>:
options = new JsonSerializerOptions
{
    PropertyNamingPolicy = (propertyName) => propertyName == nameof(SnakeCaseProperty) ? JsonNamingPolicy.SnakeCase : JsonNamingPolicy.CamelCase,
};
@LukeTOBrien it should be possible to achieve by inheriting the JsonNamingPolicy class. In pseudocode:
public override string ConvertName(string name) => name == nameof(SnakeCaseProperty) ? JsonNamingPolicy.SnakeCase.ConvertName(name) : JsonNamingPolicy.CamelCase.ConvertName(name);
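A fuller sketch of that idea, written so it does not depend on a built-in snake case policy: MixedNamingPolicy and its private ToSnakeCase helper are hypothetical names, and the snake case conversion is a deliberately simple assumption (an underscore before each uppercase letter that follows a lowercase letter). Only JsonNamingPolicy, JsonNamingPolicy.CamelCase, and ConvertName come from System.Text.Json.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// Hypothetical policy: snake_case for the listed property names,
// camelCase for everything else.
public sealed class MixedNamingPolicy : JsonNamingPolicy
{
    private readonly HashSet<string> _snakeCaseNames;

    public MixedNamingPolicy(IEnumerable<string> snakeCaseNames)
    {
        _snakeCaseNames = new HashSet<string>(snakeCaseNames, StringComparer.Ordinal);
    }

    public override string ConvertName(string name) =>
        _snakeCaseNames.Contains(name)
            ? ToSnakeCase(name)
            : JsonNamingPolicy.CamelCase.ConvertName(name);

    // Simplified conversion: handles PascalCase names without acronyms.
    private static string ToSnakeCase(string name)
    {
        var builder = new StringBuilder(name.Length + 4);
        for (int i = 0; i < name.Length; i++)
        {
            if (i > 0 && char.IsUpper(name[i]) && char.IsLower(name[i - 1]))
            {
                builder.Append('_');
            }
            builder.Append(char.ToLowerInvariant(name[i]));
        }
        return builder.ToString();
    }
}
```

It would be plugged in through JsonSerializerOptions.PropertyNamingPolicy, for example `new JsonSerializerOptions { PropertyNamingPolicy = new MixedNamingPolicy(new[] { "SnakeCaseProperty" }) }`. For a handful of exceptions, a [JsonPropertyName] attribute on the individual properties may be simpler.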
For the interim, since I got fed up of copying the provided implementation around, I've published the policy to https://www.nuget.org/packages/O9d.Json.Formatting.
@layomia Any updates on this? I can rebase my branch onto main and open a new PR if that's ideal.
Stop dragging your feet Microsoft. Support a well-established naming policy out of the box.
It can simply throw on any questionable name to avoid future breaking change. At least we need support for most common scenarios (well-defined ASCII).
Yeah, we really need this too, but for kebab case, i.e. hello-world.
The entire world does not follow the same naming conventions, and systems need to communicate, so this absolutely needs to be customizable - and quite frankly, I'm pretty shocked that something this basic isn't already supported.
Regarding the statements in https://github.com/dotnet/runtime/issues/782#issuecomment-553723786 about the lack of precedent for kebab case support:
It doesn't matter what was supported in Newtonsoft JSON - it matters what is needed in the real world. The fact that we were forced to waste time writing custom converters before, does not justify leaving out the support now. Implementing support for snake case, while telling developers to implement custom converters if they need kebab case, would in my opinion be quite disappointing - the difference is literally just which separator character to use.
Edit: Oops, sorry, I came here from https://github.com/dotnet/runtime/issues/31619, and somehow missed that this issue is not just related to enums. Using kebab case for property names would definitely be uncommon, but it's not that uncommon to see enum values using it.
@YohDeadfall Are you up for this again? We really need to get this into .NET 7.
IMO, this should base its implementation on a common word-splitting algorithm that future conventions (not covered in this API proposal), like kebab-case, as mentioned in https://github.com/dotnet/corefx/pull/41354#issuecomment-554965858, should use. There's also an opinion that the snake_case behavior should be compatible with Newtonsoft.Json (which I think is a mistake, given its bugs).
Anyway, that's the main question to get answered before an implementation can happen.
Hi, @khellang! I made Yoh.Text.Json.NamingPolicies a while ago as research before burning out, and it was published on NuGet. It uses word segmentation as specified by Unicode and then splits each word into parts by analyzing character casing and kind, as you can see here. The first part can be removed to improve performance since we need to support property names only. So if you're okay with the cases listed in the tests, I can move the source code to .NET.
@khellang Are you okay with the test cases listed by @YohDeadfall?
Yeah, I think they match my expectations 👍🏻
I'm updating this feature's milestone to Future. We understand snake_case, kebab-case, and other common naming conventions are very important and that it's disappointing we don't have it implemented yet. But we want to set expectations appropriately that it is not likely to make it into .NET 7.
Please refer to #63762 to see our revised set of System.Text.Json work planned for .NET 7. We are also doing a lot of incremental refactoring during .NET 7 that makes it easier for us to add features like this one and ensure developers have a more consistent experience when using System.Text.Json #63918. The list of PRs in System.Text.Json illustrates the progress we've been making on that effort.
I updated my project to exclude text segmentation from it, and the existing tests pass. So I think I will soon try to push the changes to System.Text.Json.
@dotnet/area-system-text-json, @terrajobst this API is marked as approved but I can't find an API review video, or an entry in dotnet/apireviews. Did a mistake happen?
Hey there, I just opened PR #69613 adding new policies to the libraries, but to complete it I need another round of API review. As @sveinfid mentioned, kebab case is widely used too and is worth including for better interoperability. My implementation is simple, so it shares the same code for the snake and kebab conventions, and in addition to that it allows upper and lower casing.
namespace System.Text.Json
{
    public abstract class JsonNamingPolicy
    {
        protected JsonNamingPolicy() { }
        public static JsonNamingPolicy CamelCase { get { throw null; } }
+       public static JsonNamingPolicy SnakeLowerCase { get { throw null; } }
+       public static JsonNamingPolicy SnakeUpperCase { get { throw null; } }
+       public static JsonNamingPolicy KebabLowerCase { get { throw null; } }
+       public static JsonNamingPolicy KebabUpperCase { get { throw null; } }
        internal static JsonNamingPolicy Default { get { throw null; } }
        public abstract string ConvertName(string name);
    }
}
Names are chosen to explain which separator and which casing are used, and to keep them symmetrical. While upper-case conventions are much less popular, they still exist, so I included them too for the reason mentioned above. This puts the .NET serializer on par with serializers for other platforms, which are shipped as separate packages. Usually, all possibilities are shipped in serializers or name-converting packages to make them full-featured and to help the user by providing a single dependency maintained by a single owner, so the same set of rules applies to policies.
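As an illustration of how one code path can serve all four policies, here is a minimal sketch parameterized by separator and casing. SeparatorNamingPolicy is a hypothetical name and the word-boundary rules are simplified assumptions; this is not the code from the pull request.

```csharp
using System.Text;
using System.Text.Json;

// Hypothetical policy: one implementation, configured by separator and casing.
public sealed class SeparatorNamingPolicy : JsonNamingPolicy
{
    private readonly char _separator;
    private readonly bool _upperCase;

    public SeparatorNamingPolicy(char separator, bool upperCase)
    {
        _separator = separator;
        _upperCase = upperCase;
    }

    public override string ConvertName(string name)
    {
        var builder = new StringBuilder(name.Length + 4);

        for (int i = 0; i < name.Length; i++)
        {
            char current = name[i];

            if (i > 0 && char.IsUpper(current))
            {
                // A word starts after a lowercase letter, or at the last
                // letter of an uppercase run that is followed by lowercase.
                bool afterLower = char.IsLower(name[i - 1]);
                bool endsUpperRun = char.IsUpper(name[i - 1])
                    && i + 1 < name.Length
                    && char.IsLower(name[i + 1]);

                if (afterLower || endsUpperRun)
                {
                    builder.Append(_separator);
                }
            }

            builder.Append(_upperCase
                ? char.ToUpperInvariant(current)
                : char.ToLowerInvariant(current));
        }

        return builder.ToString();
    }
}
```

In this sketch the four proposed policies would correspond to `new SeparatorNamingPolicy('_', false)`, `('_', true)`, `('-', false)`, and `('-', true)`.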
In case this is still too much for the reviewing team, it can be shortened to:
namespace System.Text.Json
{
    public abstract class JsonNamingPolicy
    {
        protected JsonNamingPolicy() { }
        public static JsonNamingPolicy CamelCase { get { throw null; } }
+       public static JsonNamingPolicy SnakeCase { get { throw null; } }
+       public static JsonNamingPolicy KebabCase { get { throw null; } }
        internal static JsonNamingPolicy Default { get { throw null; } }
        public abstract string ConvertName(string name);
    }
}
Then the snake and kebab policies will be lower case, and in the future upper-case variants with Shouting in the name can be added.
/cc @terrajobst @jeffhandley
@teo-tsirpanis
@dotnet/area-system-text-json, @terrajobst this API is marked as approved but I can't find an API review video, or an entry in dotnet/apireviews. Did a mistake happen?
Likely. There are rare cases where we approve APIs offline via email, in a pinch. Those don't go via GitHub/YouTube and hence aren't picked up by my tool that archives them in dotnet/apireviews.
Any particular questions you're trying to answer?
@dotnet/area-system-text-json based on the comments from @teo-tsirpanis and the latest change request from @YohDeadfall I have pushed the API back to api-ready-for-review.
Any particular questions you're trying to answer?
My question was what the approved API shape was. Well, forget it, I found https://github.com/dotnet/runtime/issues/782#issuecomment-532341171 (😳), but it might still be useful to re-review it for the additional casings.
JsonKnownNamingPolicy might also need to be updated.
Well, forget it, I found #782 (comment) (😳),
Ah yes, that's why I didn't find the notes -- we approved it but didn't apply the label. Oops. But yeah, we'll take another quick look tomorrow.
Is there any update on this? When can we expect this to be added? I am targeting .NET 6 and can only see CamelCase as an option.
@OnamChilwan, you can currently use a package mentioned in https://github.com/dotnet/runtime/issues/782#issuecomment-1018535660. The pull request I made uses its code with style changes only.
@terrajobst, until what date can the pull request be merged and still appear in .NET 7? If the deadline is too close and there's no time left to approve the public API changes, we can go with option 2. It allows trimming the PR down to only SnakeCase and adding other options later, but with Shouting (a less preferred name, in my opinion).
@OnamChilwan you can also use this code I wrote 3 years ago; it's in some threads about this issue, and I shared it for people who are looking for a solution to this.
Incredible this is still not in .NET7. This is why people still stick to Newtonsoft.
This is pathetic. I can't advocate for System.Text.Json anymore if it's going to remain a worse option than Newtonsoft.
Really @wardboumans @GiorgioG? Using a custom naming policy that already exists out there is that despicable that you'd drop the library? Wow.
namespace System.Text.Json;
public partial class JsonNamingPolicy
{
    public static JsonNamingPolicy SnakeCaseLower { get; }
    public static JsonNamingPolicy SnakeCaseUpper { get; }
    public static JsonNamingPolicy KebabCaseLower { get; }
    public static JsonNamingPolicy KebabCaseUpper { get; }
}
public enum JsonKnownNamingPolicy
{
    SnakeCaseLower = 2,
    SnakeCaseUpper = 3,
    KebabCaseLower = 4,
    KebabCaseUpper = 5,
}
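For readers wondering how this shape is meant to be consumed, a short usage sketch follows. It assumes a runtime where these members are available (they shipped in .NET 8, as far as I know); the Order type and the ApprovedPolicyUsage class are hypothetical and exist only for the example.

```csharp
using System.Text.Json;

// Hypothetical sample type and helpers; requires a runtime that ships
// JsonNamingPolicy.SnakeCaseLower and friends (assumed: .NET 8 or later).
public static class ApprovedPolicyUsage
{
    public sealed class Order
    {
        public int OrderId { get; set; }
        public string CustomerName { get; set; } = "";
    }

    // Serializes property names as lower snake_case.
    public static string ToSnakeCaseJson(Order order) =>
        JsonSerializer.Serialize(order, new JsonSerializerOptions
        {
            PropertyNamingPolicy = JsonNamingPolicy.SnakeCaseLower,
        });

    // Serializes property names as upper KEBAB-CASE.
    public static string ToUpperKebabCaseJson(Order order) =>
        JsonSerializer.Serialize(order, new JsonSerializerOptions
        {
            PropertyNamingPolicy = JsonNamingPolicy.KebabCaseUpper,
        });
}
```

The same policies also apply during deserialization when the options instance is passed to JsonSerializer.Deserialize.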
Cool, thanks! Then I will align the pull request according to it and address requests.
@georgiosd Yes really. I can't tell you how many times I've advocated to use System.Text.Json and then have to defend these basic kind of omissions. JSON.NET has had this since at least 2016: https://github.com/JamesNK/Newtonsoft.Json/commits/master/Src/Newtonsoft.Json/Serialization/SnakeCaseNamingStrategy.cs
Well, maybe they should not have optimized the framework and included a snake case naming strategy instead.
Really, if you're deciding on the basis of having to write a naming policy implementation (that is what, 5 lines?) and not the overall improvements perhaps you have bigger problems in the organization.
Totally agree with @georgiosd
I thought that my implementation would be up and running for 6 months (more or less)? 3 years later... I see the end of this discussion is near
Speaking of naming and in my case, the classes JsonKebabCaseNamingPolicy and JsonSnakeCaseNamingPolicy (KebabCasexxx) (SnakeCasexxx)
As you can see, what Georgios says is a fact
Contents below were taken from @khellang, thanks!
API Proposal
Add additional casing support for System.Text.Json, based on this comment.
Behavior Proposal
I propose the same behavior as Newtonsoft.Json, just like the existing camel case behavior. The implementation is here and the tests are here.
Other Comments
I think snake case naming is pretty common, especially in the Ruby world, and should be supported out of the box. GitHub's API is probably the most popular example using this naming convention off the top of my head.
Implements
See: dotnet/corefx#40003