HowardHinnant / date

A date and time library based on the C++11/14/17 <chrono> header

date::parse() how to handle different date-time formats. #832

Closed TrueWodzu closed 6 hours ago

TrueWodzu commented 3 days ago

Hi,

As we know, different languages and libraries emit different date-time strings. I have a few examples here, and none of them parse successfully when done the standard way:

    std::chrono::time_point<std::chrono::high_resolution_clock, std::chrono::milliseconds> tp;
    std::istringstream in1{"2024-06-28 06:18:26.028111Z"};
    in1 >> date::parse("%F %T %z", tp);
    if (in1.fail())
        std::cout << "failed 1" << std::endl;

    std::istringstream in2{"2024-06-28 06:18:26.028Z"};
    in2 >> date::parse("%F %T %z", tp);
    if (in2.fail())
        std::cout << "failed 2"  << std::endl;

    std::istringstream in3{"2024-06-28T06:18:26.028+02:00"};
    in3 >> date::parse("%F %T %z", tp);
    if (in3.fail())
        std::cout << "failed 3"  << std::endl;

All of the above seem to follow the ISO standard. Is it possible to develop a nice one-size-fits-all solution for the above? The third case was nicely handled here https://github.com/HowardHinnant/date/issues/824 but how can that be married up with cases 1 and 2, while still being able to parse a date-time like this one: "2023-10-23 12:38:40.555+02:00"?

HowardHinnant commented 3 days ago

high_resolution_clock has no portable relationship to the civil calendar. Thus this code is not guaranteed to even compile for you. If it is compiling, that means that high_resolution_clock is a type alias for system_clock on your platform. And this is only true when using gcc. Note that the latest gcc has fully implemented this library in C++20 chrono and using that is my recommendation.
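
To make the point above concrete, here is a minimal sketch using only the standard library. The names kHighResIsSystemClock and SysMillis are made up for illustration and are not part of the date library. It checks at compile time whether high_resolution_clock happens to be system_clock on a given platform, and declares a millisecond time point on system_clock, which avoids depending on that alias.

```cpp
#include <chrono>
#include <type_traits>

// True only where the standard library defines high_resolution_clock
// as an alias for system_clock; the standard does not require this.
constexpr bool kHighResIsSystemClock =
    std::is_same<std::chrono::high_resolution_clock,
                 std::chrono::system_clock>::value;

// Hypothetical alias: a millisecond-precision time point on system_clock,
// the clock whose time points can be related to the civil calendar.
using SysMillis = std::chrono::time_point<std::chrono::system_clock,
                                          std::chrono::milliseconds>;
```

SysMillis is the same type as date::sys_time<std::chrono::milliseconds>, so declaring tp with it instead of the high_resolution_clock time point should keep the parsing code portable across standard libraries.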

Here is how you could wrap these three choices up into a single function.

To do so there are a few observations:

  1. The first two options will successfully parse with the format "%F %TZ" if the time point has microseconds resolution. As long as the fractional seconds field ends with EOF or a non-digit, one can parse a lower-precision stream into a higher-precision time point or duration. And you need just a bare Z to match the trailing Z in both of these examples.
  2. The difference between the first two and third examples is demarcated by a ` ` vs a `T` after the date. One can use that to switch formats mid-stream.
#include "date/date.h"
#include <chrono>
#include <iostream>
#include <sstream>
#include <stdexcept>

using TimePoint = date::sys_time<std::chrono::microseconds>;

TimePoint
parseISO8601(std::istream& is)
{
    date::sys_days td;
    TimePoint::duration tod;
    std::chrono::minutes offset{};
    is >> date::parse("%F", td);
    if (is.fail())
        throw std::runtime_error("failed to parse date");
    auto T = static_cast<char>(is.get());
    if (T == ' ')
    {
        is >> date::parse("%TZ", tod);
    }
    else if (T == 'T')
    {
        is >> date::parse("%T%Ez", tod, offset);
    }
    else
        throw std::runtime_error("Failed to parse ' ' or 'T' after date");
    if (is.fail())
        throw std::runtime_error("failed to parse time of day");
    return td + tod - offset;
}

int
main()
{
    using date::operator<<;
    std::istringstream in1{"2024-06-28 06:18:26.028111Z"};
    std::cout << parseISO8601(in1) << '\n';
    std::istringstream in2{"2024-06-28 06:18:26.028Z"};
    std::cout << parseISO8601(in2) << '\n';
    std::istringstream in3{"2024-06-28T06:18:26.028+02:00"};
    std::cout << parseISO8601(in3) << '\n';
}

This outputs for me:

2024-06-28 06:18:26.028111
2024-06-28 06:18:26.028000
2024-06-28 04:18:26.028000

Note the third timestamp is 2h earlier as the input is a local_time+offset and the output is system time (UTC).
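
The arithmetic behind that 2h shift can be sketched with plain std::chrono durations. The helper name toUtc is hypothetical and exists only for this illustration. It mirrors the `td + tod - offset` step: an offset of +02:00 means the local clock reads 2h ahead of UTC, so the offset is subtracted.

```cpp
#include <chrono>

// Hypothetical helper: convert a local time-of-day plus its UTC offset
// into a UTC time-of-day. A positive offset means local time is ahead
// of UTC, so the offset is subtracted; a negative one is effectively added.
inline std::chrono::microseconds
toUtc(std::chrono::microseconds local, std::chrono::minutes offset)
{
    return local - offset;
}
```

For the third input, toUtc applied to 06:18:26.028 with an offset of 120 minutes gives 04:18:26.028. The result of this helper can go negative or exceed 24h when the shift crosses midnight; in parseISO8601 that is harmless because the time of day is added to a sys_days value, which rolls the date over.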

TrueWodzu commented 6 hours ago

@HowardHinnant thank you for taking the time to respond to me. Much appreciated!