Open SteveKCheng opened 3 years ago
Tagging subscribers to this area: @eiriktsarpalis, @layomia See info in area-owners.md if you want to be subscribed.
Author: | SteveKCheng |
---|---|
Assignees: | - |
Labels: | `api-suggestion`, `area-System.Text.Json`, `untriaged` |
Milestone: | - |
I don't have any feedback on the customizing newline scenario - will leave that for the JSON team. But I wanted to focus on this comment:
It is essential when the JSON output bytes are to be compared directly or through a hash, e.g. as a digital signature or for regression-testing expected outputs.
Please be advised that different major versions of System.Text.Json are not guaranteed to have byte-for-byte identical output with one another. This could be because the .NET runtime itself changes how some values are encoded (like how .NET Framework and .NET 5 have different string representations for double), it could be because characters like '€' may or may not be escaped by default, or it could be some other factor.
If you rely on the contents having a very specific byte-for-byte representation, please always test each new version of System.Text.Json (or its dependencies) or each new version of the .NET runtime in an integration environment before deploying to production. This will allow you to catch such discrepancies early.
Hope this helps!
@GrabYourPitchforks Just want to comment: I am fine with the caveat that byte-for-byte comparison only works with the same version of the run-time & libraries. The use case I have right now is regression-testing by comparing expected outputs where some developers might run the program on Windows while the automated periodic process is running on Linux.
@SteveKCheng We know that we aren't going to get to this feature during .NET 7, and compared to many of the other issues/features in our System.Text.Json backlog, I don't expect we'll get to this for the foreseeable future. To reflect this, I'm going to close the issue.
For the regression testing scenario you described, I recommend normalizing the newlines before comparison. As was previously called out, other direct comparisons or consumption for a hash, a digital signature, or other purpose would not be reliable without normalization either.
Reopening based on additional feedback from @damieng that there are scenarios where additional control of the written content is valuable.
@damieng if you could, please add a comment citing your scenarios so that we can consider them and if we proceed on this use them as acceptance criteria.
Background and Motivation
Currently the newline sequence used by
Utf8JsonWriter
depends on the running platform, obtained fromEnvironment.NewLine
.It would be useful to be able to set it explicitly so that JSON output can be reproduced exactly no matter what platform the .NET program is running under. It is essential when the JSON output bytes are to be compared directly or through a hash, e.g. as a digital signature or for regression-testing expected outputs.
Proposed API
The default value is
null
(backwards compatible), which means to take the value fromEnvironment.NewLine
. Otherwise the specified string is used as the newline sequence, as in theTextWriter.NewLine
property.Usage Examples
Alternatives
Currently as
Utf8JsonWriter
does not offer this functionality, I have to use Newtonsoft's JSON writer instead --- which relies onTextWriter
that has a customizableNewLine
property. It seems fairly difficult, and obviously inefficient, to write a "filter" on aStream
to convert the platform-specific newline byte sequence.Risks
I think it is low risk for the most common cases. There is no change to behavior of existing code that does not set the
JsonWriterOptions.NewLine
property.Utf8JsonWriter
's code is already parameterized onEnvironment.NewLine
and this proposal is just asking we stop hard-coding the use of a global property.What if the arbitrary newline sequence contains non-ASCII characters? Or it results in invalid JSON?
My first thought is that such characters should still be supported for completeness (otherwise we'd have to define what the behavior would be) even they might cause slightly slower execution. And it would be consistent with
TextWriter
's API with no friction for the user.But perhaps an enumeration (or boolean?) allowing only for the standard choices of newline sequences would work too. Currently,
Utf8JsonWriter
's code assumesEnvironment.NewLine
is either"\r\n"
or"\n"
. Supporting the choice between these two would be good enough to satisfy the motivation above.