aaubry / YamlDotNet

YamlDotNet is a .NET library for YAML
MIT License
2.54k stars 474 forks source link

C#9 records support #571

Open marhoily opened 3 years ago

marhoily commented 3 years ago

I'm always frustrated when I try to deserialize C#9 record types and get "no default constructor" exception. Now that immutable records and nullable reference types are C# features, no decent deserializer can ignore the problem.

Possible solutions:

  1. Figure out it is a record type using attributes C# compiler must generate
  2. If there's only one constructor use it; match argument names
  3. Introduce [YamlConstructor] attribute; match argument names
aaubry commented 3 years ago

The difficulty is that the current architecture makes it difficult to call a constructor. I am working on a refactoring that, among other things, will support this use case.

marhoily commented 3 years ago

Can I contribute? As far as I understood the immediate problem is that the deserializer is trying to instantiate an instance of an object first. Is the refactoring you have mentioned going to go deep first? Do you have a step-by-step plan and a branch you could share or is it in one big lump? Have you investigated if some unsafe CLR tricks could preserve the existing architecture?

aaubry commented 3 years ago

I'm sorry. I wanted to give a better answer, but with the pandemics, I don't have any free time left. I'm not able to share any work on this part right now.

AlseinX commented 3 years ago

As a temporary solution, GetUninitializedObject could be used to create a new instance that is allocated and zeroed, without calling the constructor. Though it requires that all properties with default values should be specified via DefaultValueAttribute, since assignation on the properties won't work with GetUninitializedObject.

visose commented 2 years ago

@marhoily my solution to this has been to use YamlDotNet to deserialize the yaml to object, then serialize to json, then use System.Text.Json to deserialize to the record types.

Droxx commented 1 year ago

I'd like to check if there are any plans to address this? It's still an issue unfortunately.

Or do there exist any known workarounds to this? (other than using an intermediate object 'builder' step)

FyiurAmron commented 1 year ago

@Droxx if you create a 0-arg c-tor calling the main c-tor with default values, it will work. Yes, it's ugly, but it works. You can add a validator method checking for those defaults to see if the record has been deserialized fully. See https://stackoverflow.com/questions/74183086/why-can-i-not-deserialize-yaml-into-a-record/76852663 for my take on it.

Alternatively, do what @visose proposed (although that is even worse IMVHO)

@aaubry any ETA or suggestions for this? I agree that the simplest fix would be to change the code so that GetUninitializedObject or similar feature would be used by default to create a "default" instance of the record if 0-arg c-tor is not found for it (use the 0-arg c-tor if present, though). The rest seems to be working well enough ATM.

I can try doing a PR for this if no-one else has the spare resources for it, but I'm no C# expert, mind me :D

visose commented 1 year ago

although that is even worse IMVHO

Both are bad, but have different trade-offs. If you're using record types and yaml to save key strokes and performance is not an issue, which is very likely, it's not worse.

FyiurAmron commented 1 year ago

@visose

Both are bad, but have different trade-offs.

Well, yeah, kind of. That's why I said "IMVHO" - for me, introducing a potential performance killer (for people e.g. loading large amounts of external data into a game, which many people, especially those using YamlDotNet in Unity, are doing) into an app and e.g. hiding it behind a util helper is a big no-no. Without the helpers, the "saved keystrokes" you mention are not there.

Adding a zero-arg with deprecation warning and doc is usually enough to make it fool-proof. That c-tor would be unusable without reflection anyway, you could only create dummy records through it.

I worked with big teams, I know how those kind of things usually end. For 1-person pet project, I agree it's manageable, though.

If you're using record types and yaml to save key strokes and performance is not an issue, which is very likely, it's not worse.

Key point there, "which is very likely". How would you measure that? What kind of metrics do you have to back it up? Is it the "I think it's likely, therefore it surely is" type of confidence, or can you really prove that most of the critical users (e.g. current and future contributors, consumers maintaining actual applications etc.) of YamlDotNet are only concerned with length of (trivially generated, in this case) code and unconcerned with performance at all?

You might be right on that, but please provide some data before fuelling this discussion further. To-and-from JSON serialization is not trivial in terms of performance, and for non-trivial data it requires e.g. a sizeable string allocation to be done (not to mention CPU time to parse it). Mobile environments are really susceptible to stuttering due to allocations, even if it's just a warmup/startup thing.

Also, I've seen YAMLs that are literally thousands of lines long (in e.g. k8s, pipelines and other DevOps configs, Spring Boot apps, games' entity properties and moddable props, to mention a few). Many of those contexts are irrelevant to YamlDotNet. Some are.

visose commented 1 year ago

I don't believe in data, I'm using my well honed gut feeling.

If you care about performance you already understand the implications of de/serializing twice, but don't use your intuition here, profile it first, you might be surprised. Wait maybe data is better than instinct.

FyiurAmron commented 1 year ago

@visose I explicitly and honestly requested any kind of data from you. If I wanted you to act like a smartarse, I would have told you so. Yet, I don't recall anyone here asking for that, me included.

Still, for reference for those curious about it:

image

image

I got those kind of results repeatedly, whether in real-life scenarios or in microbenchmarks, and regardless of the data size and structure, as long as the processed data itself is big enough in total. JSON to-and-fro adds in the order of ~ +20% execution time here, and that's mostly because YamlDotNet is relatively slow by comparison. Kudos to dotnet team for a performant solution, BTW. Still, no surprise here.

No LOH allocation and about 50% smaller LOH+POH allocation total, not to mention the GC jitter generated by constant re-allocs. About +5-10% total memory footprint here. I could probably find datasets where this would go higher.

I think think this particular discussion has run its course. I don't find it related to the main subject at hand, and I honestly see no way in which it would improve this thread.

visose commented 1 year ago

That's a lot less overhead than anyone expected. If performance is an issue, it's not really because of this workaround. Best thing would be to avoid YamlDotNet if possible.

nev-21 commented 4 months ago

related SO question: https://stackoverflow.com/a/76852663

EdwardCooke commented 2 months ago

I’ll be spending some concentrated time on yamldotnet in the next weeks and plan on working on the required properties.

EdwardCooke commented 2 months ago

Required property enforcement is done. However calling a constructor with arguments is not. I’ll be working on that next. I do have a couple of ideas for it.

I believe you can use a protected constructor with no arguments so it’s hidden. I also tested a basic record class with required properties and fields with no constructor specified and it worked.