dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.96k stars 4.65k forks source link

System.Text.Json Add a feature that allows gathering all exceptions encountered when parsing json #38049

Open stevendarby opened 4 years ago

stevendarby commented 4 years ago

Newtonsoft.Json has a ErrorHandler call back that lets you inspect and continue on error. MVC uses this to report all the errors during JSON deserialization

The ask is to consider doing something similar for S.T.J


Original issue text

Describe the bug

If my request model has a DateTime but an invalid date string is supplied in the JSON, an error is added to the ModelState, e.g. "The JSON value could not be converted to System.DateTime. Path: $.dateProperty | LineNumber: 1 | BytePositionInLine: 43."

No other validation messages are added because model binding appears to completely halt at this step. If I disable the automatic 400 response I can see the model is null.

This differs from Newtonsoft - invalid dates would add an error message but the model binding and validation process would continue. This allows you to catch and report all validation errors to the user. With STJ, if there are two invalid dates, the user can only find out about one at a time.

As the JSON as a whole is not malformed, it's just that the format of one of the values is wrong, I really think the model binding/validation should continue.

Further technical details

Runtime Environment: OS Name: Windows OS Version: 10.0.18362 OS Platform: Windows RID: win10-x64 Base Path: C:\Program Files\dotnet\sdk\3.1.101\

Host (useful for support): Version: 3.1.1 Commit: a1388f194c

.NET Core SDKs installed: 2.0.3 [C:\Program Files\dotnet\sdk] 2.1.102 [C:\Program Files\dotnet\sdk] 2.1.104 [C:\Program Files\dotnet\sdk] 2.1.200 [C:\Program Files\dotnet\sdk] 2.1.202 [C:\Program Files\dotnet\sdk] 2.1.402 [C:\Program Files\dotnet\sdk] 2.1.508 [C:\Program Files\dotnet\sdk] 2.1.701 [C:\Program Files\dotnet\sdk] 2.2.108 [C:\Program Files\dotnet\sdk] 2.2.301 [C:\Program Files\dotnet\sdk] 3.0.100 [C:\Program Files\dotnet\sdk] 3.1.101 [C:\Program Files\dotnet\sdk]

.NET Core runtimes installed: Microsoft.AspNetCore.All 2.1.4 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All] Microsoft.AspNetCore.All 2.1.12 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All] Microsoft.AspNetCore.All 2.1.15 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All] Microsoft.AspNetCore.All 2.2.6 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All] Microsoft.AspNetCore.All 2.2.8 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All] Microsoft.AspNetCore.App 2.1.4 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 2.1.12 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 2.1.15 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 2.2.6 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 2.2.8 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 3.0.0 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 3.1.1 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App] Microsoft.NETCore.App 2.0.3 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 2.0.6 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 2.0.7 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 2.0.9 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 2.1.4 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 2.1.12 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 2.1.15 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 2.2.6 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 2.2.8 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 3.0.0 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 3.1.1 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App] Microsoft.WindowsDesktop.App 3.0.0 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App] Microsoft.WindowsDesktop.App 3.1.1 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]

stevendarby commented 4 years ago

Digging around the source code a bit, the Newtonsoft deserializer doesn't stop on exceptions and it allows you to attach an error handler to deal with exceptions that occur during deserialization. The input formatter makes use of that error handler to add errors to the ModelState while the deserialization process continues to the end.

The STJ deserializer stops on the first exception, and while that exception is added to the ModelState (if an appropriate exception type) it means you cannot build up a complete ModelState of all the errors.

I would really like to see the Newtonsoft feature come to STJ / the STJ input formatter.

pranavkm commented 4 years ago

@ericstj would you consider a feature like this for System.Text.Json?

Newtonsoft.Json has a ErrorHandler call back that lets you inspect and continue on error. MVC uses this to report all the errors during JSON deserialization

ericstj commented 4 years ago

I'm not sure we've considered such a feature before. @sharter @layomia what do you think?

steveharter commented 4 years ago

This was discussed early on when we evaluated Newtonsoft features for priority and this feature was discouraged, although the discussion did say that MVC uses it. @JamesNK @glennc

JamesNK commented 4 years ago

I'm not a fan of ErrorHandler. Attempting to resume after an error is too unreliable.

stevendarby commented 4 years ago

Surely one can conceive of there existing a set of errors that can be considered safe to resume from, making it then a case of deciding which errors fall in this set. Newtonsoft may have tried to resume from too many different types of error, making it unreliable. Could STJ implement this feature more conservatively and therefore more reliably?

The particular DateTime example I've given here seems a good example of a resumable error. If I wanted to safely convert a string to a DateTime I would just use one of the TryParse methods - can't the deserialization process take a similar approach?

This extends to JsonConverters too, where there is no obvious direct means of reporting what I would class as a validation error. Ideally, as you're using a strongly typed language, you want a date and time property on your model to be a DateTime, therefore you rely on the string in the JSON to be converted at some point. But that means the point of conversion is really the only point at which you can validate the string format.

You might want to enforce a particular ISO 8601 compliant representation in requests, so you create a JsonConverter to parse the string with an exact format. If it fails, there is no real option other than to let the FormatException be caught by the MVC input formatter so it can be added to the model state. If the user has made multiple date format mistakes it seems reasonable that these all be reported at once, which isn't possible with the current approach of bombing out entirely.

A workaround might be to have this as a string property on your model and validate it after model minding, but not using a DateTime just to get around this doesn't seem like a reasonable workaround.

I'm considering DateTime in particular because it's the example I've come across but I think it could extend to other sorts of conversions too.

stevendarby commented 4 years ago

Just wondering if this came up in “next sprint planning”?

pranavkm commented 4 years ago

@snappyfoo not as yet. That said, it's fairly unlikely MVC could \ would do anything if the feature does not exist in the formatter.

stevendarby commented 4 years ago

@pranavkm Thanks for the update. Is there a more appropriate repo to raise the issue with the formatter, and do you think it’s worth doing that?

pranavkm commented 4 years ago

Let me transfer the issue to the runtime repo with a rewording. If they decide to add a feature, we could take advantage of it in MVC.

Dotnet-GitSync-Bot commented 4 years ago

I couldn't figure out the best area label to add to this issue. Please help me learn by adding exactly one area label.

chrisjmccrum commented 2 years ago

+1 Looking for this feature - I now need to import Newtonsoft into my project for one edge case where the json maybe malformed. Having the option to continue to serialization and default the properties with errors would be super-helpful. I found this issue when searching SO ( https://stackoverflow.com/questions/26107656/ignore-parsing-errors-during-json-net-data-parsing)

penguinawesome commented 2 years ago

+1 for this feature. This feature is very important to determine all the fields that are invalid.

ysiivan commented 2 years ago

+1 The inability to act (ignore or otherwise) on an exception thrown when serializing a property is a show stopper for me.

pilouk commented 2 years ago

+1

stevendarby commented 2 years ago

Can everyone who has or wants to reply with +1 please (also?) thumbs up the original post as it influences priority.

See https://github.com/dotnet/runtime/blob/main/CONTRIBUTING.md#finding-existing-issues

Thanks

aaronegger commented 2 years ago

+1

akovanev commented 2 years ago

A must have feature for api calls when some data may come inconsistent. It's a shame that it's still not resolved.

clane2812 commented 1 year ago

A must have: we have a postgres with some 'invalid' json on a special entity - when deserialzing the data from the database an exception is stored on the entity and thrown on the first access to the object. We can handle the exception in the use case of reading from the database, but what is the way to overwrite the invalid data with some valid json? When accessing the database for writing, the above exception occurs directly. Perhaps i have missed something, but actually it seems to me that i cannot write and fix my database without ignoring this deserialization-exception...

hosseinitabar commented 1 year ago

It's a shame that after 2 years of this issue, Microsoft has not implemented this. It's a MUST requirement. It's not a choice. Of course in any data serialization or deserialization or any batch operation one must have the capability to get a report.

gokhanabatay commented 1 year ago

+1 I agree others https://github.com/dotnet/runtime/issues/44390

lancer1977 commented 1 year ago

+1

mpdelbuono commented 1 year ago

I'm not a fan of ErrorHandler. Attempting to resume after an error is too unreliable.

@JamesNK I'm hoping that this can be reconsidered - there are scenarios beyond "attempting to resume" where an error handler is important. Consider the case where you're reading from a read-once stream (quite common in web applications). If deserialization fails, you now get an exception saying what happened, but you have no way of knowing what the data was that caused the failure. Even if you catch the exception, there's no way for you to see the data because the stream is already consumed.

A proper error handling hook would provide that information saying "I couldn't deserialize this, you figure it out". It doesn't have to enable continuing from this point, but it would at least enable logging to saying "here's the thing that couldn't be deserialized" so you can figure out what's wrong.

kampilan commented 1 year ago

I use this in my logging framework to ignore properties that throw an Exception simply for being read. Timeout on MemoryStream for example.

@JamesNK Not being a fan as a reason to not implement is paternalism run amok. You were a fan of it enough to have originally implemented it.

+1

JamesNK commented 1 year ago

I was a fan of this feature. That's why I added it. Then I discovered how many problems it has.

Error handling created dozens of bugs that needed to be fixed because handling errors would leave the deserializer or reader in an unexpected state. There are still bugs in Newtonsoft.Json from this, and the feature is 10 years old.

Another problem was handling errors causing the deserializer to go into an infinite loop. For example, a deserializer is configured to handle errors, but the error comes from the underlying stream being closed. The deserializer will DOS itself, ignoring the stream-closed errors infinitely. There are many situations like this when parsing, and each one needs to be thought about carefully or you've created a really easy way to take down a server.

akovanev commented 1 year ago

@JamesNK

It may have problems, that's fine, as you don't force developers to use this feature. However, there are many different scenarios where it is difficult to underestimate it.

Let me give you a real use case. One day you have to import more than 100 thousand items with lots of properties. The import procedure takes up to 3h on you local environment and a little faster on prod. You expect that you know all about the format but you actually don't. Of course you don't deserialize 100K objects with one operation but still you struggle to achieve as less operations as possible. So you deserialize collections. And then it appears that about 500 items of 100K have the unexpected format for you. If there was no error handling feature like this in Newtonsoft, I don't know how much effort and time it would take to just log and find all the issues. And then after the big update you continue catch some nice portions of data for some period after. And no, you cannot update the datasource explicitly, so the only way is desereialize and process.

That would be really helpful if System.Text.Json could somehow give this option. It is still a framework feature.

kampilan commented 1 year ago

Thanks for the clarification James.

All great reasons not to use it. But I have none of those problems. It works exactly the way I need it to work. If I choose to do something stupid and bring down a server that's my problem. Let me make that choice. If I ask for 6 feet of rope I don't expect the clerk to ask me if I am going to hang myself before selling it to me.. This is not unreasonable functionality. And it's totally opt in. You have to ask for it. Buyer beware.

Thanks Jim

osexpert commented 1 year ago

For logging, it would be nice to have option to ignore fields/properties that can not be serialized due too not allowed to (eg. Type, MethodInfo) or because no formatter exist for the type. Could choose to ignore via a delegate where we can choose what to do (eg. Omit field/property, blank it/to null, use ToString etc)

Edit: but Maybe this should be split into 2 issues (one for serialize and one for deserialize, it is completely different issues, where the deserialize invalid data problem is probably 100x more work than the serialize unsuported data problem)

ultra133 commented 1 year ago

+1 from me It's really annoying when Deserialize just returns null without giving any indication of the error. This makes System.Text.Json unusable for real-world development, where you mostly don't have any influence on the producer of the json you have to consume. Error handling is a must have. I don't understand why this issue is being put on the back burner here.

Abbossbek commented 11 months ago

In my case, this is very necessary. I have a list of files in the object, 100 thousand files are parsed at once. In one file list, the files became a nested list, because the back end is written in node js. So I can't parse anything as a single error. These files are not important, but it's a pity that I can't get anything. Please add this feature.

alimoh1372 commented 7 months ago

+1 I need it in my logging System.I want serialize the object passed to the logging system.when error occurred on one or more field, I need to continue serializing and ignore that field. just like Newtonsoft.Json it can Chooseable,and default value false for this feature.

ultra133 commented 7 months ago

It doesn't look like the people behind System.Text.Json will ever rise to the challenge of integrating a working error handler. They may close the ticket with a simple piece of advice: Use Newtonsoft.

colindawson commented 4 months ago

I've just hit this same problem. I'm trying to write some generic error handle, which needs to be able to serialise anything. I want to have a general error handler, so that anything that any property that cannot be serialised because of say a PlatformSupportException is simply ignored completely, as I don't need 100% covarage, it's more important that it doesn't throw an exception or run out of memory (which is what netwonsoft does.

Demmon98 commented 3 months ago

+1 from me

tomasz-majewski commented 3 months ago

+1

daniulian94 commented 2 months ago

+1

Muthu2627 commented 3 weeks ago

+1

dlidstrom commented 2 weeks ago

I too came looking for error handling support when deserializing. However I now realize that it may be possible to combine STJ with System.ComponentModel.DataAnnotations and do a post-deserialize validation. That's a good enough solution in many cases, IMO.