r-lib / usethis

Set up commonly used 📦 components
https://usethis.r-lib.org/

Function to set Language field #2058

Closed llrs closed 2 weeks ago

llrs commented 1 month ago

It would be great if usethis could help developers specify the language of their packages. Currently only 9% of packages do this, and only 3% are in other languages. Having tools to set the field would be great. According to WRE:

A ‘Language’ field can be used to indicate if the package documentation is not in English: this should be a comma-separated list of standard (not private use or grandfathered) IETF language tags as currently defined by RFC 5646 (https://www.rfc-editor.org/rfc/rfc5646, see also https://en.wikipedia.org/wiki/IETF_language_tag), i.e., use language subtags which in essence are 2-letter ISO 639-1 (https://en.wikipedia.org/wiki/ISO_639-1) or 3-letter ISO 639-3 (https://en.wikipedia.org/wiki/ISO_639-3) language codes.

Is there interest in a PR to add this?

I think it is easy to add the field while checking that the language is well formatted (nchar >= 2 && lower case), but perhaps an ideal solution would be to check against the official rules (I don't know how easy that would be).

A rough draft of the behavior would be something like:

use_language(c("ca", "es"))
ℹ Adding ca as language
ℹ Adding es as language
use_language("ca")
ℹ ca language is already present
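The draft behavior above could be sketched in base R roughly as follows. This is a hypothetical illustration, not usethis code: the `path` argument, the messages, and the use of read.dcf()/write.dcf() are assumptions, and the format check is only the loose one mentioned above (at least 2 characters, lower case), so it would reject tags with an upper-case region subtag such as "en-US".

```r
# Hypothetical sketch of the proposed helper; not part of usethis.
use_language <- function(langs, path = "DESCRIPTION") {
  # Loose format check from the proposal: >= 2 characters and lower case.
  ok <- nchar(langs) >= 2 & langs == tolower(langs)
  if (!all(ok)) {
    stop("Badly formatted language code(s): ",
         paste(langs[!ok], collapse = ", "))
  }

  # Read the DESCRIPTION file as a one-row character matrix.
  dcf <- read.dcf(path)
  has_field <- "Language" %in% colnames(dcf)
  current <- if (has_field) {
    trimws(strsplit(dcf[1, "Language"], ",", fixed = TRUE)[[1]])
  } else {
    character()
  }

  # Report each requested language and collect the new ones.
  for (lang in langs) {
    if (lang %in% current) {
      message(lang, " language is already present")
    } else {
      message("Adding ", lang, " as language")
      current <- c(current, lang)
    }
  }

  # Write the comma-separated list back to the file.
  value <- paste(current, collapse = ", ")
  if (has_field) {
    dcf[1, "Language"] <- value
  } else {
    dcf <- cbind(dcf, Language = value)
  }
  write.dcf(dcf, file = path)
  invisible(current)
}
```

Note that write.dcf() rewrites the whole file, so a real implementation would probably prefer a tool that preserves the existing formatting.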

llrs commented 1 month ago

I got some feedback asking why one would need such a function when create_package already shows how to set up Language. My reasoning is that, in old and new packages alike, development sometimes starts in one language and another language is added after the package is created.

For example, data.table is developed in English and doesn't have the Language field. But it also has error messages translated to Spanish, Mandarin and Portuguese, and this is not shown in the DESCRIPTION. Listing these languages would help users find and use the translations.

jonthegeek commented 1 month ago

It would be nice to apply some simple formatting rules to at least standardize the types of things that already appear in tools::CRAN_package_db() |> dplyr::count(Language, sort = TRUE). For example, all of these real examples would ideally become "en-US":
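One possible shape for such simple formatting rules is sketched below. The function name `normalize_language_tag` and the sample inputs are made up for illustration (they are not taken from the CRAN data), and the rules are an assumption: underscores become hyphens, the language subtag is lower-cased, and a 2-letter second subtag is treated as a region and upper-cased.

```r
# Hypothetical normalizer; the rules are assumptions, not an RFC 5646 validator.
normalize_language_tag <- function(x) {
  x <- gsub("_", "-", trimws(x), fixed = TRUE)
  parts <- strsplit(x, "-", fixed = TRUE)
  vapply(parts, function(p) {
    # Empty or missing input cannot be normalized.
    if (length(p) == 0 || is.na(p[1])) return(NA_character_)
    p[1] <- tolower(p[1])                 # language subtag: lower case
    if (length(p) >= 2 && nchar(p[2]) == 2) {
      p[2] <- toupper(p[2])               # 2-letter region subtag: upper case
    }
    paste(p, collapse = "-")
  }, character(1))
}

# Made-up inputs that differ only in case, separator, or whitespace.
normalize_language_tag(c("en_us", "EN-us", " en-US "))
```

A sketch like this only standardizes spelling; it does not check that the subtags are actually registered.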

hadley commented 3 weeks ago

This feels a bit too special purpose for usethis to me, given that it's not hard to add it manually or with desc.

llrs commented 2 weeks ago

usethis adds the field in some cases for spellcheck, and I thought this would be a good small extension, but maybe this functionality should live closer to translation helpers like potools.