Yuri6037 / time-tz

Implementation of tz database (IANA) for time Rust crate.
BSD 3-Clause "New" or "Revised" License
22 stars 7 forks source link

Factorize automatic generation from IANA database #10

Open woshilapin opened 2 years ago

woshilapin commented 2 years ago

Sorry for the not so short text below, but I thought some context was needed before exposing what I'd like to do.

Context

In the beginning of the year 2022, reacting to CVE-2020-26235, I migrated some of the repositories I was working on, from chrono to time, until I stumbled on some functionalities that needed to parse timezones. Then, I opened https://github.com/chronotope/chrono-tz/pull/93.

That said, the PR I proposed is not really satisfying:

  1. with the name chrono_tz, you would expect to have chrono in it (so making chrono an optional feature sounds dumb)
  2. it looks like the extract from the IANA Database is a functionality by itself that might live in its own repository

It's the second item that made me opened this issue (see below).

What I'd like to do?

I only found out today about your crate time-tz. I'm pretty happy about it because I'm pretty sure I'm gonna use it:smile:

I quickly parsed the code from build.rs and it seems mostly similar to chrono-tz-build. I was actually already trying to extract that part into an independent crate (when working on https://github.com/chronotope/chrono-tz/pull/93). Would you be interested in such a modification? I'd be interested to actually do the work but I'd like to know first if the idea is sound and acceptable in your project?

Yuri6037 commented 2 years ago

I'm not sure I understand what you're trying to do. In my understanding you want to avoid download and build of the entire IANA database by moving this part to it's own crate. Is that what you're trying to do?

I understand that this part can be very slow if you don't need it but currently there's no way to avoid download of the IANA database. In the current version there's already a feature which enables the IANA database (which is enabled by default).

The build.rs script is indeed very similar to chrono-tz, that's because I based my code on it except that I've included windows timezone name database (windows uses a less-precise database than IANA) and I've preferred to respect rust naming conventions.

Now if we agree on what you're trying to do that could be indeed better. However, there are several points to be aware of in the current setup:

All things considered, if we do end up extracting this to it's own crate I'd like it to be time crate agnostic, meaning it could be used with any time crate.

woshilapin commented 2 years ago

I'm not sure I understand what you're trying to do. In my understanding you want to avoid download and build of the entire IANA database by moving this part to it's own crate. Is that what you're trying to do?

I'm sorry if I was not precise enough, I'll try to answer your questions below.

I understand that this part can be very slow if you don't need it but currently there's no way to avoid download of the IANA database. In the current version there's already a feature which enables the IANA database (which is enabled by default).

I do not want to avoid downloading the IANA database. I actually want to extract the Rust code generation (in build.rs) from the IANA database... but in an agnostic way (no dependency to time or chrono).

The build.rs script is indeed very similar to chrono-tz, that's because I based my code on it except that I've included windows timezone name database (windows uses a less-precise database than IANA) and I've preferred to respect rust naming conventions.

Nice, if I understand well, what you did is actually an improvement for Windows user over the build.rs script from chrono-tz?

Now if we agree on what you're trying to do that could be indeed better. However, there are several points to be aware of in the current setup:

* You might need to move the `FixedTimeSpan` and `FixedTimeSpanSet` structs to that new crate because the build script is dependent over these (these are not dependent over the `time-rs` crate).

Thanks for the hint, that's actually one of the thing I was not sure. If both chrono-tz and time-tz use similar structures and traits, they sure should be both part of this new crate.

That said, aren't FixedTimeSpan and FixedTimeSpanSet part of the parse_zoneinfo? Is there a need to redefine them? Are they different from the one in time-tz?

* Do you plan on moving the windows name database as well?

Yes, sure. It seems that this is an improvement over what has been done in chrono-tz, so I intend to keep it.

* Depending on how we chose to handle the 2 prior points what would be the shape of the API we expose from that crate?

I didn't yet think about the details. From chrono-tz, I thought I would expose the enum Tz and probably associated traits like TimeSpans. But I parsed a bit time-tz and it seems relatively different... which is probably good. It's gonna be harder to figure out the common needs between both, but having very different solutions might help to extract the real business logic. If you have any suggestion, I'm happy to hear them.

All things considered, if we do end up extracting this to it's own crate I'd like it to be time crate agnostic, meaning it could be used with any time crate.

Yes, that's exactly the goal: to create a new crate, agnostic from chrono or time, and that can be used by both time-tz and chrono-tz in the end.

I hope it does answer some of your questions? Since I don't know yet all the implementation details, I might still miss things and be not precise enough, but I hope to figure out the details as I do it.

Yuri6037 commented 2 years ago

I see, I think separating IANA database generation to a different crate could indeed be better in the long term: that would avoid me having to periodically check for updates with the official IANA database.

Nice, if I understand well, what you did is actually an improvement for Windows user over the build.rs script from chrono-tz?

Indeed my build.rs includes support to convert the system timezone name which is given by WinAPI functions into an IANA timezone. Support for reading system timezone is available with the system feature of this crate and requires this windows timezone name database (actually downloaded from the unicode project).

I've thought of a possible API which could leave the FixedTimeSpan and FixedTimeSpanSet in this crate with a build script only API:

Such API would allow moving the tz submodule, the win_cldr_data and the update_repo.sh script to that new crate. This would also greatly simplify the build.rs script.

I also think we should create a new repository for this library.

What do you think?