dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Need a way to parse a date string that contains optional date (but has time) and an offset #1179

Open ChadNedzlek opened 4 years ago

ChadNedzlek commented 4 years ago

Right now, I don't think there is any way in the framework to parse a string that may or may not contain a date and may or may not contain an offset, such as "16:23:00+700" or "2019-12-25T16:23:00+700", and preserve that information.

We can use DateTime.Parse with DateTimeStyles.AdjustToUniversal | DateTimeStyles.NoCurrentDateDefault, but that loses the existing timezone information (and can't tell us if nothing was there).

For some reason, DateTimeOffset.Parse throws an exception if "NoCurrentDateDefault" is set. I don't know why it can't do what DateTime does and just return 1/1/1 as the date, which we can infer means "no date specified", because writing down 1/1/1 as a date doesn't make any sense, since no such date existed.
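A minimal sketch of the difference described above, assuming the invariant culture and a time-only input with no offset. The variable names are illustrative only.

```csharp
using System;
using System.Globalization;

var culture = CultureInfo.InvariantCulture;

// DateTime accepts NoCurrentDateDefault: a time-only string gets year 1, month 1, day 1.
DateTime timeOnly = DateTime.Parse("16:23:00", culture, DateTimeStyles.NoCurrentDateDefault);
bool dateWasDefaulted = timeOnly.Date == DateTime.MinValue.Date;
Console.WriteLine($"DateTime date defaulted to 0001-01-01: {dateWasDefaulted}");

// DateTimeOffset rejects the same flag with an ArgumentException.
bool threw = false;
try
{
    DateTimeOffset.Parse("16:23:00", culture, DateTimeStyles.NoCurrentDateDefault);
}
catch (ArgumentException)
{
    threw = true;
}
Console.WriteLine($"DateTimeOffset.Parse threw ArgumentException: {threw}");
```

The exception comes from validating the style flags, so DateTimeOffset.TryParse behaves the same way with this flag instead of returning false.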

Or maybe we need a whole new method that returns the 3 parts in isolation (the date, the time, and the offset), so that we can take proper action based on which parts are missing. It looks like the internal type "DateTimeParse" has much more powerful parsing options, but those aren't exposed.

Our scenario is that we are trying to read human-entered timestamps from a dated record. If the human typed "1 PM", we need to assume they meant the same day as the existing record. I think we could do this now with a huge array of "TryParseExact" calls over a few dozen built-in formats, but it would be nice to get the magic of TryParse without losing the information about which parts of the date were present, and which parts were inferred without our knowledge.

Clockwork-Muse commented 4 years ago

Right now, I don't think there is any way in the framework to parse a string that may or may not contain a date and may or may not contain an offset, such as "16:23:00+700" or "2019-12-25T16:23:00+700", and preserve that information.

We don't have a dedicated Time (or Date) class, no, although there's been some debate around them. You might have better luck with NodaTime, although even there you'd likely have to build a custom parser.

We can use DateTime.Parse with DateTimeStyles.AdjustToUniversal | DateTimeStyles.NoCurrentDateDefault, but that loses the existing timezone information

This loses the offset information, not the timezone information (which is way more important).

because writing down 1/1/1 in a date doesn't make any sense, since no such date existed.

While it's true that the Gregorian calendar wasn't in effect at the time the equivalent date occurred, it's not quite correct to say that that date doesn't exist. DateTime(Offset) is proleptic, that is, it posits the rules going forwards and backwards for all time. Some of this is for simplicity in calculation, and a bunch of it is to simplify the history that was the adoption of the Gregorian calendar (which took multiple centuries). It makes talking about the past much easier.

I think we could do this now with a huge array of "TryParseExact" of a few dozen built in formats,

... you'd want some sort of natural language processor, not a straight list of formats. If this is truly freeform, people are going to be writing all sorts of wacky things, like:

I'm assuming you don't have the option to avoid the problem by doing something like supplying a form that would allow you to specify other points in time.

If the human typed "1 PM", we need to assume they meant the same day as the existing record.

... and then they type "2:30 AM" on a day with DST, and you're even more toast than you were before.

ChadNedzlek commented 4 years ago

Unfortunately, no. For the scenario I ran into where I hit small walls at every turn, we are literally parsing free form text, looking for a pattern of "Started: [some date/time]". But humans being humans, we were hoping to be able to handle if they typed "Started: 15:30" and be helpful. It was just... odd... that DateTime loses the timezone/offset, but DateTimeOffset would preserve the offset, but, for some reason, throws if you pass "NoCurrentDateDefault", so of my two options, both fail me in small, but blocking, ways.

Mostly I was hoping this issue would get resolved by just removing that exception from DateTimeOffset.Parse and having the same behavior as DateTime (just set 1/1/1 as the date portion). But I was open to there being some good reason that's not possible (or a better solution being proposed). It's very frustrating that I either have to call exact and force a format, or have no way of knowing if the date was even present, or was invented out of whole cloth.

Heck, I'd even be fine if the behavior was "NoCurrentDateDefault" caused it to throw if no date was given. So then I could at least provide a useful error message while still accepting a wide range of easily parsed formats (and then I could also just parse it again without that parameter, and replace the date myself). Right now I need to do parsedValue.Date == DateTimeOffset.Now.Date, and hope that doesn't produce incorrect values (which I 100% know it will for many, many scenarios, but I don't have a choice other than to just... hope).
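The two-pass workaround described above could look roughly like the sketch below. ParseWithRecordDate is a hypothetical helper, not a framework API, and it makes several assumptions: the invariant culture, AssumeUniversal for inputs with no offset, and "year 1 means no date was typed". It is only exercised here on inputs without an offset; how DateTime.TryParse treats a time-only string that also carries an offset under NoCurrentDateDefault should be checked before relying on it.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: returns the parsed value and whether the date came from the input.
static (DateTimeOffset Parsed, bool HadDate)? ParseWithRecordDate(string text, DateTime recordDate)
{
    var culture = CultureInfo.InvariantCulture;

    // Pass 1: DateTime with NoCurrentDateDefault yields year 1 when no date was typed.
    if (!DateTime.TryParse(text, culture, DateTimeStyles.NoCurrentDateDefault, out DateTime probe))
        return null;
    bool hadDate = probe.Year != 1;

    // Pass 2: DateTimeOffset keeps the offset. AssumeUniversal is this sketch's choice
    // for inputs that carry no offset at all.
    if (!DateTimeOffset.TryParse(text, culture, DateTimeStyles.AssumeUniversal, out DateTimeOffset parsed))
        return null;
    if (hadDate)
        return (parsed, true);

    // No date in the input: keep the time and offset, take the calendar date from the record.
    var combined = new DateTimeOffset(recordDate.Year, recordDate.Month, recordDate.Day,
                                      parsed.Hour, parsed.Minute, parsed.Second, parsed.Offset);
    return (combined, false);
}

var record = new DateTime(2019, 12, 25);
Console.WriteLine(ParseWithRecordDate("15:30", record));
Console.WriteLine(ParseWithRecordDate("2019-12-24T16:23:00", record));
```

This avoids comparing against DateTimeOffset.Now, at the cost of parsing every input twice and of misreading a date that really is in year 1.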

We certainly can just fail to parse times alone and gently scold users and force them to enter the current date in the 99% of the cases where they are talking about 'today' (where "today" is "the day I typed this text", which is recorded in the medium we are parsing), but that is error prone (I'm personally terrible at remembering today's date, so I'm likely to type the wrong thing) and mostly it's just a little sad that the functionality seems to exist in the source code, but I can't get at it because of the exception. I'd be more than happy to assume that in my scenario, people aren't talking about the reverse-calculated date of January 1st, in the year 1. I'm willing to bet that in 99.99% of usages of DateTimeOffset, no one is meaningfully trying to represent that date in history while parsing date times... and even if they were, my proposed change wouldn't change their scenario (since right now it just throws an exception anyway), so that value is basically the same as "the null date". With that, combined with the ability to choose to assume local/UTC, my scenario would be met for all but the most ill-formed customer data, and I'm more than willing to tell my users that we have a huge array of options, but "one hour ago" isn't one of them.

bartonjs commented 4 years ago

@tarekgh Do you know why DateTimeOffset doesn't allow NoCurrentDateDefault? https://github.com/dotnet/runtime/blob/72b871dac31e8d5dc4f4b4a96948afa0e681474c/src/libraries/System.Private.CoreLib/src/System/DateTimeOffset.cs#L809-L812

tarekgh commented 4 years ago

Do you know why DateTimeOffset doesn't allow NoCurrentDateDefault?

I am not sure, but this may be to avoid going before 1/1/1 0:0:0 if the UTC offset is negative.
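A minimal sketch of the range limit in question, using the DateTimeOffset constructor rather than the parser (what the parser would do with a 0001-01-01 default is not defined, so this only shows the underlying constraint). DateTimeOffset requires its UTC instant, computed as local time minus offset, to be no earlier than 0001-01-01T00:00:00. With that arithmetic, the case that goes out of range in this sketch is a positive offset larger than the time of day.

```csharp
using System;

// 0001-01-01T04:23:00 with Kind = Unspecified, so any offset may be attached.
var timeOnYearOne = new DateTime(1, 1, 1, 4, 23, 0);

// Offset -07:00: UTC = 04:23 + 7h = 0001-01-01T11:23:00, which is in range.
var west = new DateTimeOffset(timeOnYearOne, TimeSpan.FromHours(-7));
Console.WriteLine(west.UtcDateTime.ToString("s"));

// Offset +07:00: UTC = 04:23 - 7h, which falls on the day before 0001-01-01.
bool outOfRange = false;
try
{
    _ = new DateTimeOffset(timeOnYearOne, TimeSpan.FromHours(7));
}
catch (ArgumentOutOfRangeException)
{
    outOfRange = true;
}
Console.WriteLine($"+07:00 out of range: {outOfRange}");
```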

VladimirRybalko commented 2 years ago

Hi @joperezr, could you please have a look at this and include it in the next public release? It looks ridiculous that the method throws an exception instead of simply returning 1/1/1. It will be just a one-line change, so please don't put it at the tail of the backlog.

Thank you in advance.

tarekgh commented 2 years ago

It looks ridiculous that the method throws an exception instead of simply returning 1/1/1.

I assume you meant when passing the DateTimeStyles.NoCurrentDateDefault option, right?

It will be just a one-line change, so please don't put it at the tail of the backlog.

I don't think it is as trivial as you think. What happens when I give you a string like 4:23:00-700? Do you expect we'll throw at that time? If yes, this means most of the negative offsets will throw, which I don't think would be a good idea.

The behavior needs to be defined first.