conda-forge / libcxx-feedstock

A conda-smithy repository for libcxx.
BSD 3-Clause "New" or "Revised" License
1 stars 19 forks source link

[rc] install C++20 modules; enable tzdb integration #178

Closed h-vetinari closed 1 month ago

h-vetinari commented 2 months ago

This addresses most of #142 (for the rc branch), by installing C++20 modules (enabled by default since https://github.com/llvm/llvm-project/commit/19d2d3fe50c301272350d12c53c801b17e29e64e) as well as pointing the experimental chrono implementation to use our own tzdata, following very closely along the lines already discussed in detail on the libstdcxx side

conda-forge-webservices[bot] commented 2 months ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

h-vetinari commented 2 months ago

@isuruf, you mentioned in https://github.com/conda-forge/ctng-compilers-feedstock/pull/142 that patching in the relative paths approach for the tzdb in a static library would be a security hole (presumably because the symbol could be used in a way to load an arbitrary payload), but we have a problem here: libcxx 19 only has experimental tzdb-support, which ships in the static libc++experimental.a (as the patch shows I was barking up the wrong tree here so far).

Given that libcxx 19 has no native support for <chrono> on osx (and I believe this is an important usecase, not least because it's the last C++ stdlib to haven't completed support and projects are starting to use it), could you live with the environment variable approach here so that we at least have some way to use the tzdb bits on osx? Once this moves into the shared library, I'd be happy to switch to relative paths.

h-vetinari commented 2 months ago

Hm, though I guess anything in a static library that needs to do some sort of dynamic look-up can be exploited, also the env var approach. I guess we could do something like GCC does with _GLIBCXX_STATIC_TZDATA, i.e. embedding the tzdb into the static lib?

isuruf commented 2 months ago

In the gcc PR, we used the system tzdb for static library. We should do the same.

h-vetinari commented 2 months ago

Not sure I understand. The static variant of zoneinfo_dir_override does

 #ifdef _GLIBCXX_ZONEINFO_DIR
     return _GLIBCXX_ZONEINFO_DIR;
 #else
     return nullptr;
 #endif

which is defined as /usr/share/zoneinfo on linux, but undefined on darwin (libcxx comes to the same conclusion). So what system tzdb do you mean?

isuruf commented 2 months ago

/usr/share/zoneinfo works just fine on macos (At least on macOS 14)

h-vetinari commented 2 months ago

/usr/share/zoneinfo works just fine on macos (At least on macOS 14)

does it contain tzdata.zi and leapseconds, or just the various timezone folders?

Looking closer, it appears that libcxx intentionally bases itself on another style of leapseconds file, which we don't have in our tzdata. 😑

h-vetinari commented 2 months ago

Given the missing leap-seconds.list almost everywhere (and generally unclear support situation on older macs), can we go back to the idea about embedding tzdata.zi & leap-seconds.list statically?

That would keep the tzdb basically as up to date as explicitly depending on tzdata (because we release libcxx builds regularly, but tzdata barely ever updates), and the footprint is small (~110kb, less than 10% of the current size). If size is an issue, we could also put libc++experimental.a into a separate output (which might be a good idea anyway).

isuruf commented 2 months ago

I don't think embedding those are good. This is akin to linking statically and we discourage that. I'd rather we wait until the tzdb integration matures enough to move out of libc++experimental.a.

h-vetinari commented 2 months ago

I'd rather we wait until the tzdb integration matures enough to move out of libc++experimental.a.

That condemns everyone in conda-forge to wait likely at least a year just to use C++20 <chrono> (which is available on all other platforms) for what comes down to a matter of taste. Yes, we discourage static linking, but cfep-18 has exceptions for a reason. We'll move to the same approach as for libstdcxx as soon as it's in the shared lib, but until then, this case is easily important enough for an exception.

isuruf commented 2 months ago

That condemns everyone in conda-forge to wait likely at least a year just to use C++20

I haven't heard from anyone about this. Are there any use-cases?

isuruf commented 2 months ago

What does other package managers on macOS do?

h-vetinari commented 2 months ago

There's at least two feedstocks using howardhinnant_date as a replacement because they cannot use C++20 <chrono> (Howard authored P0355 upon which it is based, and his implementation functions like a backport). I know that sparrow originally wanted to use C++20, because I got pinged about chrono support in conda-forge.

Looking around, onnxruntime similarly used libdate (which is unfortunately identical to howardhinnant_date...), but now seems to require C++20 already based on the upstream issue tracker. There's also r-rcppdate which is yet another implementation of Howards code, adapted for R.

Other projects I'm aware of:

C++20 support is really on the uptake though (as implementations are finally starting to stabilize in gcc/clang), so I expect this to become even more relevant over the next 12 month, especially as libcxx is the last major implementation to ship tzdb-support (and since date-handling is so central to many use cases yet so painful to hand-roll, people have a big incentive to switch).

h-vetinari commented 2 months ago

What does other package managers on macOS do?

I'm assuming newer apple versions will ship what they need in /usr/share/zoneinfo, since the libcxx BDFL works there. For other distribution, I don't know. I'm assuming many will just override __libcpp_tzdb_directory, which is why it's a weak symbol after all.

h-vetinari commented 2 months ago

I don't have access to a mac and /usr/share/zoneinfo is not part of the SDKs, but I guess if all the required files are there on a macOS 14 agent (including leap-seconds.list), then we could just document that the solution is to use vmImage: macos-14. That would at least be a way to make it work partially without impacting the deployment target.

h-vetinari commented 2 months ago

OK, so ubuntu-22.04 doesn't contain leap-seconds.list, but I checked that ubuntu-24.04 does (however, using the beta images didn't work in azure for now). macos-14 also doesn't contain leap-seconds.list, but does contain a working tzdata.zi. So I guess the lowest non-trivial support to have would be "timezones only". Pragmatically speaking, those are much more important than leapseconds, so I guess that's a start.

h-vetinari commented 2 months ago

OK, I've skipped the leapsecond portion, but now we fail in the override test; _LIBCPP_WEAK should be set, and the libcxx test suite does the override in pretty much the same way.

h-vetinari commented 2 months ago

OK, so libcxx doesn't seem to want external users to override that weak symbol. See here:

// This function is marked as "overridable" in libc++ only for the test
// suite. Therefore the declaration is not in <chrono>.

So I think we should drop that test. Unless you want to patch libcxx to make their symbol externally overridable?

h-vetinari commented 2 months ago

So I think we should drop that test. Unless you want to patch libcxx to make their symbol externally overridable?

I did that now; the rest passes. PTAL

h-vetinari commented 1 month ago

Going with this approach for now. We're still only in the rc-branch, so we'll have time to refine before going to main.