dotnet / orleans

Cloud Native application framework for .NET
https://docs.microsoft.com/dotnet/orleans
MIT License
10.07k stars 2.03k forks source link

DateTime Serialization loses Kind information #552

Closed brinkrob closed 9 years ago

brinkrob commented 9 years ago

When serializing a DateTime object between silos/clients Orleans stores just the Ticks value losing the kind/timezone information.

Currently only the ticks are written to the serialization stream.

If the source DateTime object is of kind "Local" the ticks are local as well. On the other side of the serialization, a DateTime object is created without the kind information resulting in a kind of "Unspecified" which is usually treated as UTC. Thus ticks in local time go in, and ticks in unspecified/UTC time come out.

My proposal:

When serialization is taking place, convert the DateTime to UTC and write the ticks as UTC. Also write the kind attribute to the stream.

On the Deserialize side, use the UTC ticks and the kind to recreate a DateTime object that is similar to the original. The kind (local/utc) would be the same, it would refer to the same UTC time, but the ticks/timezone may be different and therefore a basic ToString of the time object would output a different time.

brinkrob commented 9 years ago

I will try to make a pull request for this issue soon.

sergeybykov commented 9 years ago

Thank you, Rob!

We discussed this on Gitter with Rob. Seems like a sensible plan to me that shouldn't break existing code.

raybooysen commented 9 years ago

What are your thoughts on an ISO 8601 string? Or is the serialisation cost to a string too high? This means we don't lose any timezone info and everything is serialised.

raybooysen commented 9 years ago

"The kind (local/utc) would be the same, it would refer to the same UTC time, but the ticks/timezone may be different and therefore a basic ToString of the time object would output a different time"

This is an interesting statement. We shouldn't ever be losing data in the serialisation/deserialisation steps.

This is in .NET 4.5 https://msdn.microsoft.com/en-us/library/system.datetime.frombinary(v=vs.110).aspx. Very interesting that it also can misrepresent DateTimes when serialising/deserialising.

Liversage commented 9 years ago

I am not convinced that converting the DateTime to UTC before serializing the ticks is a good idea. While I cannot come up with a good use case for using a local DateTime across a serialization boundary I would still expect the deserialized DateTime to be exactly the same on both sides of the boundary.

When the time zones are different on the system where serialization takes place compared to the system where deserialization takes place a local DateTime will change as a result of serialization if the proposed algorithm is used.

While using a local DateTime across systems in different time zones can be seen as asking for trouble an additional "hidden" conversion taking place in the serialization layer may make this setup even more error prone.

Why not simply serialize the ticks and the kind and reconstruct the DateTime from this information and leave any local time zone issues to be handled outside the framework?

attilah commented 9 years ago

As others also wrote, serializing ticks is not enought since you miss out TZ information.

My preference is that within the system only use and persist UTC date time values, and I preferred to use DateTimeOffset. When I need a local friendly datetime on the client, then I will convert it during the outbound mapping (DTO conversion for example).

Then later on I discovered NodaTime (http://nodatime.org) which has a serialization friendly type Instant, and you can use that everywhere. It is using TZDB (the official Timezone Database) to handle timezone support.

I know that it is a 3rd party component, but Newtonsoft.Json is that too ;-)

One of the benefits of using NodaTime that it is support mocking of IClock, so time travel tests are possible automagically ;-)

What do you think on integration of NodaTime?

raybooysen commented 9 years ago

Sounds like a big dependency for just a serialisation requirement. Why not serialise as an ISO 8601 string? Everything done then.

attilah commented 9 years ago

I think that standardizing such library for DateTime handling in a distributed system like Orleans is not just affecting serialization, but can solve a very crucial problem.

Yes, in this point of view it solves "just" the serialization.

Define "big" as you refer to it ;-) it is just 1 standalone lib.

Liversage commented 9 years ago

The ".NET" solution to the problem of DateTime only being able to represent "local" time without knowing the offset of "local" and UTC time is to use DateTimeOffset. While NodaTime may have some additional functionality DateTimeOffset contains enough information to unambigously represent an absolute point in time.

attilah commented 9 years ago

DateTimeOffset CAN be used, that is what I've used before, but the serialization of it need to be handled manually, that's it.

And it is not mockable.

brinkrob commented 9 years ago

The built in DateTime.ToBinary and DateTime.FromBinary mentioned in the link from @raybooysen appear to implement my proposed solution, and I'd probably use them if we stick with that original idea.

From the docs:

For example, consider a DateTime object that represents a local time of 3 P.M. An application that is executing in the U.S. Pacific Time zone uses the ToBinary method to convert that DateTime object to a binary value. Another application that is executing in the U.S. Eastern Time zone uses the FromBinary method to convert the binary value to a new DateTime object. The value of the new DateTime object is 6 P.M., which represents the same point in time as the original 3 P.M. value, but is adjusted to local time in the Eastern Time zone.

Are we OK with serialization having this behavior? I personally am. The DateTime object would always refer to the same instant of time. The source timezone information would be lost, but I don't think that is the essence of the data anyways. Even if we used a DateTimeOffset method for serialization, we'd still lose the timezone information (we'd just have a offset). And I'm not even sure how you'd convert from a DateTime -> DateTimeOffset -> DateTime anyways that does not result in the same behavior as just using the ToBinary/FromBinary.

You could make the case that the clients should just be using DateTimeOffset in the first place, and maybe they should be. But in the case that they use DateTime I think ToBinary/FromBinary gives them correct behavior.

They do mention on that page the wrinkle of Daylight Savings where a certain time may not exist in all timezones. I'm not sure I understand the implications of that yet. I tend to think that it's OK since the local object does not represent a real time.

veikkoeeva commented 9 years ago

I'm all right with having the semantics of DateTime.FromBinary, though it should be mentioned somewhere visibly enough. Serializing DateTime, I think, is notoriously difficult in any event as a quick run through a seach engine tells (e.g. DateTime and xs:date, WCF and whatnot). Even DateTimeOffset is not problematic. For some food for thoughts (to have a more round discussion):

Maybe an thing to implement is an efficient serializer for Yoda types. What do you think @attilah?

attilah commented 9 years ago

FromBinary. ToBinary will stick us with DateTime class and when Orleans will have support for custom serializers, this problem will be rise again ;-)

attilah commented 9 years ago

@veikkoeeva NodaTime already has support for Binary, Json and even Xml serialization ;-)

http://nodatime.org/unstable/userguide/serialization.html

brinkrob commented 9 years ago

Regardless of if Orleans itself starts using NodaTime, this still needs to be fixed for DateTime somehow. We found this problem when using a DateTime object being passed from our web front end into Orleans.

Orleans can't "ban" clients from using DateTime in their grain code, but it should at least serialize it properly for those that do.

attilah commented 9 years ago

@brinkrob in that sense I totally agree, but this way orleans already HAS its place for serializing DateTime type, since somewhere it need to now that only the tick part is serialized (in BinaryTokenStreamWriter.Write (DateTime) method.

Since THIS kind of problem only for orleans builtin native serialization and does not affect how other serializers will work, they just need to make sure that handling the same situation.

The fix is not that easy because of how DateTime local time works. The local time conversion which is automatic by ToBinary and FromBinary methods may have undesired side effects since those methods are adding behavior beside serialiation, which is "wrong".

What do you think of this:

if datetime is not local -> use ToBinary, FromBinary if datetime is local -> ticks and minute offset to utc (date time offset) serialization if datetime is unspecified -> ticks and kind serialization

Maybe Orleans should remove DateTime from the builtin supported types and just provide a default implementation for DateTime and DateTimeOffset too.

I know that this is a breaking change, but the usage of DateTime class can not happen in such cases without sideeffects and Orleans itself can not be a decision maker on the developer's behalf.

brinkrob commented 9 years ago

Does the BinaryTokenStreamWriter and BinaryTokenSreamReader classes not get used for something where a grain invocation with an argument of type DateTime?

The problem with the suggestion by @attilah above is when the datetime is local. Since a serialization/deserialization cycle must start and end at a DateTime, there is no intermediate format that can properly translate that in a loss less way. The DateTime type is just busted in this respect. In fact what you are suggesting is impossible since there is no way to represent something in a different timezone with a DateTime object. If a value of 1pm ET goes in on a machine in the eastern timezone, it is impossible to get 1pm ET out on a machine in the Pacific timezone. The best we can do would be 10am PT, which is what ToBinary/FromBinary does.

gabikliot commented 9 years ago

Why would not we just implement something that is logically equivalent to what default binary .NET serializer is doing? Any time that we were in doubt in the past regarding how to serialize, we just mimicked what .NET serialization is doing, of course with our optimized implementation.

attilah commented 9 years ago

If we agree on that local time conversion default behavior of .Net then it is quite simple to implement I think. And from Orleans point of view this is what it should do as a framework.

On the other hand, whoever is using DateTime is their bad :D

gabikliot commented 9 years ago

@attilah , I think whomever uses time in general in a distributed systems - it is up to them to understand the implications. The framework should try to do as less magic as possible in that regards. Therefore, I am suggesting we do what .NET serializer is doing, and just stay out of business of solving that problem (since there is no one, correct solution). Internally, within Orleans, when ever we use DateTime we do the same - reason about it and make sure we understand the implications of distribution.

To summarize - if default .NET serializer is doing what @brinkrob suggested - behaves like `DateTime.FromBinary , than that is what I suggest we implement.

attilah commented 9 years ago

Ok, that's the same what I've said, but I think that the DateTime serializer method needs a remark in form of comments, to immediately direct the interest of the developer to the behavior of DateTime ;-)

Meanwhile someone is doing this modification, the addition of DateTimeOffset support as builtin type would be awesome too ;-)

gabikliot commented 9 years ago

Agree with all you said @attilah.

ElanHasson commented 9 years ago

I'm in agreement as well.

veikkoeeva commented 9 years ago

:+1: On what @attilah is suggesting: print a warning on using DateTime. It would jive well with the sprit of Orleans guiding away from problems and being easy to adopt.

It would be awesome if there were a few word summary of potential problems and/or suggested alternatives or permanent sources for more information, but that may be pushing it. :)