dotnet / efcore

EF Core is a modern object-database mapper for .NET. It supports LINQ queries, change tracking, updates, and schema migrations.
https://docs.microsoft.com/ef/
MIT License
13.61k stars 3.14k forks source link

Cosmos: Allow to use a custom JSON serializer #17306

Open AndriySvyryd opened 5 years ago

AndriySvyryd commented 5 years ago

Allow value converters to output a raw object (e.g. JToken)

Also provide access to unmapped properties (through a JSON shadow property or dynamic property bag).

Make sure that FromSqlRaw parameters are properly serialized.

Test with STJ source generators

Related to #14570, #30677

ajcvickers commented 3 years ago

Note from triage: this is currently blocked on the new Cosmos SDK, tracked by #18753.

bjorn-ali-goransson commented 2 years ago

Yes please! This way I will be able to serialize an IList<ISomeCustomInterface> - there and back again ...

vyarymovych commented 2 years ago

Has anything been done to be able to use custom JSON serializer? We've built a lot of functionality using EF Cosmos DB provider, it's a pain now to change document schema. If we want to add new property (required) and it's not nullable, EF queries just crash (it works great when using SDK). We really need custom JSON serializer to control that and return default value in case that property doesn't exist in existing cosmos document

roji commented 2 years ago

@vyarymovych as a workaround specifically for your case, you could have a private nullable backing field (which EF Core will read and write from), and have a non-nullable property wrapping that and coalescing nulls coming from Cosmos to whatever default value you want. It seems similar to what you're looking for, but doing the coalescing in your property instead of in the JSON serializer.

vyarymovych commented 2 years ago

Thanks @roji! This is really helpful. I think I've seen that approach somewhere on the internet, it seemed a bit old-school to me and just wanted to check whether other solutions exist. Just one more related question, is it going to work if the requirement is to implement server side filtering by that property? For example, I want to implement soft-delete by addingbool IsDeleted property and all the queries must apply .Where(x => !x.IsDeleted) condition?

roji commented 2 years ago

I'm not 100% sure what Cosmos does for the SQL we generate for Where(x => !x.IsDeleted), in the case where the property doesn't exist - that should be easy to check though. Note that in this respect, coalescing with the JSON serializer or with a property are the same.

athinton commented 9 months ago

If I understand this correctly, PR #24334 was not merged because a better solution is to upgrade to a newer version of the SDK i.e. #18753. Unfortunately, the comments on this issue indicate that V4 has been abandoned.

Is my understanding correct? What does this mean for the possibility of customising the serialisation options?

AndriySvyryd commented 9 months ago

We will add some way of customizing the serializer, but for now it's still undecided how that will look.

klemmchr commented 7 months ago

Is there any progress on this? Would love to use System.Text.Json to reduce the memory footprint. Newtonsoft.Json is hitting really hard.

ajcvickers commented 7 months ago

@klemmchr Make sure to vote (👍) for the issue. The more votes, the higher priority is gets. This issue currently has 16 votes, which means 193 other issues have more votes and are above it in the priority list.

klemmchr commented 7 months ago

So prioritization is solely based on some upvotes? You're currently forcing users to use a third party dependency with horrible performance while you're having a Microsoft based serializer in place. It's really not understandable how this situation is accepted.

ajcvickers commented 7 months ago

So prioritization is solely based on some upvotes?

No, it's one input into the planning process.

roji commented 7 months ago

@klemmchr note that Newtonsoft.Json is a dependency of the Cosmos SDK itself, so not something EF itself controls. I also doubt that the difference between Newtonsoft.Json vs. System.Text.Json would result in "horrible performance" specifically when working against Cosmos - is that statement backed by actual benchmarking?

klemmchr commented 7 months ago

note that Newtonsoft.Json is a dependency of the Cosmos SDK itself, so not something EF itself controls.

The Cosmos SDK does not force me to utilize Newtonsoft.Json. I can plug in any deserializer I want to. Plus, I can retrieve data as a stream directly.

I also doubt that the difference between Newtonsoft.Json vs. System.Text.Json would result in "horrible performance" specifically when working against Cosmos - is that statement backed by actual benchmarking?

There are tons of benchmarks out there. Microsoft is bragging about their performance gain compared to Newtonsoft due to new API that is used, reducing both compute time and memory allocation. The quoted benchmark is five years old, the gap only got bigger since. I did a memory analysis on one of our applications and did find out that de-/serialization takes up tons of memory using Newtonsoft. That footprint would be way lower when I could use a different serializer.

roji commented 7 months ago

There are tons of benchmarks out there.

My comment above was whether the performance difference here would be actually meaninful when working with Cosmos specifically, given that you're transferring data over the network to a cloud service that's performing I/O to search and execute queries. I'd recommend avoiding assumptions about perf without actually checking.

klemmchr commented 7 months ago

You're confusing latency with resource utilization. I'm explicitly talking about CPU and RAM usage and not about latency issues (that are common for all cloud services). I'm not making assumptions, I'm citing benchmarks done by Microsoft. There are tons of other benchmarks out there too. I don't see where this is going given that you're just ignoring the reason why Newtonsoft is bad as a hardcoded serializer.

I just don't get this argument. You're basically saying that all other performance factors are completely irrelevant once I/O and network is involved. This would mean that we don't need any more performance improvements for ASP.NET or Microsoft.Data.SqlClient. This is just weird to argue about.

roji commented 7 months ago

@klemmchr I did not mention latency vs. resource utilization - the question is whether the resource utilization you're referring to, e.g. the extra allocations, are meaningful to the overall performance specifically when interacting with Cosmos.

In other words, you may very well find out that switching from Newtonsoft.Json to System.Text.Json considerably reduces memory allocations, but does nothing to your general application performance, since that improvement is negligible compared to everything else that has to happen to service a Cosmos request. If that's the case, then spending time on this feature for performance reasons isn't worth it (it may be worth it for other reasons, e.g. functionality differences between the two serializers).

I just don't get this argument. You're basically saying that all other performance factors are completely irrelevant once I/O and network is involved. This would mean that we don't need any more performance improvements for ASP.NET or Microsoft.Data.SqlClient.

I'm not. It's good to be concerned with performance, but it's important to always have a holistic view of performance when optimizing things - especially when considering priority (i.e. where we invest our time). If reducing a few allocations in a path where those are negligible doesn't actually get you anything, then that's not something we should prioritize. Otherwise we can spend a lot of time on meaningless micro-optimizations which technically improve performance, but not in any way that matters to anyone. I hope that's clear.

To summarize, I think this is something we should do in general, but I'd be surprised if it would lead to a meaningful improvement perf-wise when working with Cosmos. I'd be happy to see actual figures to the contrary though.