unicode-org / icu4x

Solving i18n for client-side and resource-constrained environments.
https://icu4x.unicode.org
Other
1.35k stars 174 forks source link

Decide and implement a solution for internal-facing APIs #4467

Open sffc opened 9 months ago

sffc commented 9 months ago

We've hit multiple snags recently where we wanted to change a mostly-internal function that was pub and were blocked because of semver:

Some thoughts/comments/problems:

  1. Even if these had been public-internal functions, we still have some desire to keep them stable-ish because then we can break hard tilde dependencies between components/utils (#4343)
  2. If the functions are #[doc(hidden)], they are less discoverable including to ICU4X developers. We can't expect ICU4X developers to look into the source code just to see if a function they want happens to be there.
  3. These functions should most likely be excluded from FFI by default
  4. If we are poking holes in a util crate, the holes we poke should probably be part of the public API of that crate, because it is a utility crate. However, within ICU4X component and provider crates, it is okay and expected for us to poke holes.
  5. Previous discussion: https://github.com/unicode-org/icu4x/issues/4338#issuecomment-1846175545

Here's a proposal:

Add a macro_rules that generates the following boilerplate:

#[doc = "<div class=\"stab unstable\">"]
#[doc = "🚧 This code is internal to ICU4X; it may change at any time, in breaking or non-breaking ways,"]
#[doc = "including in SemVer minor releases. It can be enabled with the \"icu4x_internal\" Cargo feature."]
#[doc = "</div>"]
#[cfg(feature = "icu4x_internal")]

The macro_rules can live in icu_locid for now. We can move it later.

To annotate an internal API, you can do this:

/// These are docs for my internal API
icu_locid::_internal::icu4x_internal_api!()
pub struct Foo {}

The icu4x_internal Cargo feature does not bubble up to the metacrate. It can be selectively enabled whenever it is needed. One advantage of making an icu4x_internal feature is that we can disable it when uploading docs to docs.rs (https://docs.rs/about/metadata), although I don't think there's a way to disable just that one feature; we would need to switch from --all-features to an explicit feature list per-crate, which is error-prone. Alternatively, I'm fine if we say we don't introduce a Cargo feature for this and simply let these APIs live in the docs.

In the 1.5 or 2.0 timeframe, I suggest that we scrub our API surface and add this annotation where appropriate. A lot of things that are currently #[doc(hidden)] should probably use the annotation instead, and likewise things that are currently public stable should also probably use the annotation.

If we agree on this, I may suggest we make a similar annotation for the provider module.

Needs approval or input from:

sffc commented 9 months ago

Another solution I'm open to considering is to have the macro generate #[doc(hidden)], in addition to the internal warning, and then when generating internal docs use --document-private-items which includes doc(hidden) APIs.

It would make me very very happy if Rust itself had an annotation for this. Anything like "this API might not be semver safe" which generates a warning when using it cross-crate.

Manishearth commented 9 months ago

I think what we should do for these is that we have the macro generate cfg_attr(not(internal_docs), doc(hidden)) and generate our internal docs with RUSTFLAGS=--cfg internal_docs. We shouldn't use a cargo feature, we should use a cfg (since cargo features get enabled with --all-features)

Generating that doc header seems nice, if we use a macro we can tweak the precise solution over time.

sffc commented 9 months ago

@Manishearth's proposal SGTM.

Either way I would like a macro so that it's easy to find and tweak over time.

Manishearth commented 9 months ago

Concrete proposal:

LGTM to me and Shane.

sffc commented 9 months ago

Specifically, the cfg should be set on the job that creates the docs uploaded to https://unicode-org.github.io/icu4x/rustdoc/, and the "unstable feature" box should be included so that people referencing those docs don't think the function is public stable.

sffc commented 6 months ago

Open question: where should the internal docs macro live? I propose it lives in a macros file that we copy around the repo into each crate and have a CI job to ensure it remains updated, similar to LICENSE.

Needs approval:

robertbastian commented 6 months ago

Can we use symlinks? I'd rather not copy around, because then we'd have another CI job to enforce consistency, and another make job to "generate" these.

sffc commented 6 months ago

Symlinks are annoying for Windows users, but it appears there are solutions that require some manual setup.

I tend to feel that in order of priority of constituencies, the experience for new contributors should be favored over the experience of core ICU4X developers. Avoiding symlinks is harder on the core developers but easier for new contributors since they don't need any special configurations.

robertbastian commented 6 months ago

We've had a symlink in the repo for months now and I haven't heard any complaints.

robertbastian commented 6 months ago

I tend to feel that in order of priority of constituencies, the experience for new contributors should be favored over the experience of core ICU4X developers.

I also disagree with this. ICU4X core contributors spend a lot of time contributing to ICU4X. We just removed a bunch of generated files from the repo which again and again lead to failing CI and delays for core contributors.

sffc commented 6 months ago

The removal of the generated files you reference was done after establishing that doing so didn't substantially impact other constituencies.

sffc commented 6 months ago

The existing symlink impacts diplomat-gen, which impacts all contributors but not clients.

The proposed symlink would impact anyone building icu4x from the repo.

sffc commented 6 months ago

Points:

Proposal:

LGTM: @manishearth @sffc @robertbastian

Implementation proposal:

LGTM: @sffc @Manishearth

Manishearth commented 3 months ago

Cross posting from https://github.com/unicode-org/icu4x/pull/5101#issuecomment-2186931612:

My opinion on this shifts if we plan to put unsafe code in icu4x_shared: i'd rather we used a separate crate since it seems suboptimal for every ICU4X crate to suddenly contain unsafe. This adds a ton of friction to import processes involving unsafe review; so far the fact that ICU4X is a large number of crates is manageable, but if it's a large number of crates that includes a bunch of unsafe code it's worse.