nodejs / node

Node.js JavaScript runtime βœ¨πŸ’πŸš€βœ¨
https://nodejs.org
Other
106.56k stars 29.05k forks source link

doc: advices to use full-icu package is outdated #36057

Closed Mumeii closed 2 months ago

Mumeii commented 3 years ago

πŸ“— API Reference Docs Problem

Location

Affected URL(s):

Description

Since the first version delivered with full-icu by default, Node 13.x.x, the documentation keep advising to use full-icu package if we want to keep up to date with the latest i18n data.

But unfortunately, this package only fetch ICU data files for node versions built with small-icu support ...

See the corresponding piece of code here

Thus, it is miss leading the reader to be advised to use a tool which is in mutual exclusion with the system that is released by default.

Obviously, not all users of Node 13 and above will go for the default full-icu one, especially if they have memory footprint constraints.

But I guess that with the whole [ micro services / containerization / cloud ] hype those days, most peoples don't care much to finely tune there images and go for the official ones on Docker hub ... which themselves comes in a single flavor each, that is, the full one since Node 13 🍭


mhdawson commented 3 years ago

@Mumeii fixing that up sounds good. Any chance you want to submit a pull request ?

Mumeii commented 3 years ago

@mhdawson : Hi !

I wouldn't mind submitting a PR if I wasn't still puzzled :

In fact, I can't tell :

From our development team point of view, it seems it would make sense if it was also working on full-icu release.

Picture the trouble we're facing right now due to that restriction :


We've also thought about another work around, that would necessitate to :

But :


Another "work around" would be to be able to base our CI on some kind of node:14-alpine-smallIcu tag, in order to have the full-icu package doing its job.

But can't tell if that solution is viable from a docker image management point of view, and if more than an handful of projects would benefits having such images available anyway.

Do you have stats πŸ“‰ πŸ“ˆ somewhere that would help telling which is the usage ration between both flavor ?


I know the whole thing could seems to be a bit "too much", but we're awaiting a CLDR fix for a Latvian date formatting trouble, and we'd like to rely on our CI deliveries to insure of the fix delivery to our customers rather than human action

richardlau commented 3 years ago

cc @nodejs/i18n-api

srl295 commented 3 years ago

full-icu is still helpful IF you manually build with small icu. I revamped the main READMEs (such as BUILDING.md) and embedded docs when full-by-default landed in https://github.com/nodejs/node/pull/29522 , but obviously these docs you cited did not follow suit.

srl295 commented 3 years ago

the documentation keep advising to use full-icu package if we want to keep up to date with the latest i18n data.

Not exactly. It should be advising you to use full-icu if you want to have i18n data, if your node build didn't originally have i18n data (in the small icu case). It should not be understood that full-icu is some kind of update mechanism to produce data that's any more up to date than any other method. If it says that, it's always been wrong and should be changed.

But unfortunately, this package only fetch ICU data files for node versions built with small-icu support ...

it could fetch data files for all versions.

srl295 commented 3 years ago

why functionally speaking this tool don't want to load ICU data files for Node releases compiled with full-icu

because if it is compiled with full-icu there's nothing else to load. As I mentioned above, full-icu is not an update mechanism, it's a download size reduction mechanism. Updating the data is a more complex problem which is not addressed at all by the current tooling, please discuss it with the ICU TC ( icu-project.org ). The best way to get the latest ICU data is to upgrade to the latest ICU.

mhdawson commented 3 years ago

@srl295 thanks for jumping in with the clarifications.

srl295 commented 3 years ago

@Mumeii soorry I did not analyze your whole situation as much before.

But from what we see, we can't upgrade to a fixed node:14.x.y-alpine image version without loosing the always up to date ICU data effect because of the issue we're reporting here So we're staying on a 12.x.x version and asking around what is the best thing to do πŸ˜‹

To be clear, This is a misunderstanding. You are not up to date by staying on 12. Quite the opposite. So upgrade to 14 and dont worry ablut full icu anymore.

Mumeii commented 3 years ago

@srl295 : Hi !

@Mumeii sorry I did not analyze your whole situation as much before.

No troubles. Thanks for your feedbacks and the clarifications πŸ™

If I understand you the right way, it mean that it's the documentation that must be updated.

Thus, can you help me one last time, in order for me to make a sound documentation PR ?

Am I wrong saying that :


As I mentioned above, full-icu is not an update mechanism, it's a download size reduction mechanism

Mean that :

1) Before Node 14 became the lts version, a CI relying on both a node:lts-alpine and full-icu package usage was simply constantly downloading the same ICU data set over and over again at each builds, no matter which newer ICU versions could be available.

2) It was the only possible option in order to have both a full ICU data set and a CI based on an official node Docker image. At least as long as lts-alpine tag was pointing to a, small-icu based, 12.x.y version

3) Due to the complexity of the ICU data integration, it's not possible to design a one size fit for all import mechanism. So the job is done once for each Node release, and the only compatible ICU version is the one referenced by process.versions.icu at runtime

So upgrade to 14 and dont worry about full icu anymore

Mean that :

4) The constraint I refer to in 3. is related to the a whole Node major version, not a single major.minor.patch version. Which mean that the binding to the latest version of ICU data available is "casted into stone" during the build of a Node JS major version release candidate ?

The best way to get the latest ICU data is to upgrade to the latest ICU

Mean that :

5) the only way to achieve what we think we were doing on our CI would be to fork the Node JS code base and to manage ourself the binding with ICU data after each of their version release ?

Once again :

srl295 commented 3 years ago

@srl295 : Hi !

@Mumeii sorry I did not analyze your whole situation as much before.

No troubles. Thanks for your feedbacks and the clarifications πŸ™

:+1:

If I understand you the right way, it mean that it's the documentation that must be updated.

Yes.

Thus, can you help me one last time, in order for me to make a sound documentation PR ?

Am I wrong saying that :

As I mentioned above, full-icu is not an update mechanism, it's a download size reduction mechanism

Mean that :

  1. Before Node 14 became the lts version, a CI relying on both a node:lts-alpine and full-icu package usage was simply constantly downloading the same ICU data set over and over again at each builds, no matter which newer ICU versions could be available.

Correct.

In fact, the way full-icu was currently implemented, if you have ICU 63.1 it would not properly fetch ICU 63.2 data (see closed issue https://github.com/unicode-org/full-icu-npm/issues/35 because that specific issue was moot).

  1. It was the only possible option in order to have both a full ICU data set and a CI based on an official node Docker image. At least as long as lts-alpine tag was pointing to a, small-icu based, 12.x.y version

This is not correct, full-icu is definitely not the only possible option. The docs you point to give a number of other options. full-icu was intended to be a convenience for users, with a short list of steps, rather than having to " go find file X to download", "install file X here", etc. It worked in a lot of cases, but not all of them.

So there are other ways to do what full-icu does, but it was intended to be the simplest one.

  1. Due to the complexity of the ICU data integration, it's not possible to design a one size fit for all import mechanism. So the job is done once for each Node release, and the only compatible ICU version is the one referenced by process.versions.icu at runtime

Correct. There is no compatibility at build time for any other major version.

So upgrade to 14 and dont worry about full icu anymore

Mean that :

  1. The constraint I refer to in 3. is related to the a whole Node major version, not a single major.minor.patch version. Which mean that the binding to the latest version of ICU data available is "casted into stone" during the build of a Node JS major version release candidate ?

ICU almost never releases even minor version changes of the data. (There is a whole separate process around updating time zone data but I will exclude that for now.) I can not remember a case where. would recommend a data update separate from the code update. So I don't think the scenario 'get the latest official ICU data' really happens in practice, besides upgrading from, say, ICU 67 to ICU 68.

The best way to get the latest ICU data is to upgrade to the latest ICU

Mean that :

  1. the only way to achieve what we think we were doing on our CI would be to fork the Node JS code base and to manage ourself the binding with ICU data after each of their version release ?

No. It means:

  1. use a newer Node version (a major or minor update) which itself includes an ICU update, or,
  2. use a compile-time option in Node to choose a later ICU., such as configure … --with-icu-source=<URL to later ICU>

Once again :

  • thank you for your reply, your help is greatly appreciated.
  • sorry for asking question that may seems obvious to you, but if I want to propose a PR on the documentation, I'd rather be certain to fully understand the problematic 😝

You are very welcome. As of mid- year this is suddenly not part of my paid job at all, so it would be great to have others help with updating the documentation!

Mumeii commented 3 years ago

@srl295 :

use a newer Node version (a major or minor update) which itself includes an ICU update

Ok, so from a Node M.m.p to the next M.m+1.0 release, the version of process.versions.icu can vary ?

In that case I guess the simplest way having our project staying up to date with the i18n data would be to move from a, let's say, node:14.x.y-alpine to a node:14-alpine

Good to know.


About the 2., it was a cumbersome way for me to mention that :

It always feel like a drawback to me when I can't get this lts seal of quality straight out of Docker hub; or said another way, I always feel like I'm going to mess up something important when having to perform those last building steps myself.

That's why I feel stuck between not been eager to deal with a compilation, and still wanting to be able to choose the level of i18n data pre-package integration.

Having those extra versions built and released as Docker hub official image would be great (at least for me πŸ‘…), but I guess that it would be hell of a mess to build, to tag, and to manage on the long run.

sgallagher commented 3 years ago

@Mumeii What we do in Fedora, we always build with small-icu, but we also pass --with-icu-default-data-dir=<...> to the configure. The result is that we compile with the capability to carry the full ICU package (which we provide as a subpackage called nodejs-full-i18n) by just dropping the other language data into a well-known path.

IMHO, the ideal case would be for the official Node container images to follow this same pattern; build with small-icu and provide a simple mechanism (such as a bash script) to the data directory.

srl295 commented 3 years ago

@sgallagher (hi!) yes, either use the full icu in the image, or do as you said.

RedYetiDev commented 2 months ago

The version specified in this issue has reached EoL. If this is no longer relevent, feel free to self-close.

mhdawson commented 2 months ago

@RedYetiDev was this changed in later versions of the doc as it says version 13 and above. If you checked an it's fixed in recent versions then I agree it can be closed, otherwise it would still apply.

RedYetiDev commented 2 months ago

It looks fixed to me, but given the conversation around it, I wanted confirmation before closing the issue

srl295 commented 2 months ago

If current docs are fine then close it. People DO still build and use small-icu with custom builds.